Agentic AI for PHM Pipeline Development: A Human-Supervised Case Study in the PHME 2026 Data Challenge


Published Jul 3, 2026
Takanobu Minami, Jay Lee

Abstract

Developing an effective prognostics and health management (PHM) prediction pipeline requires iterative decisions on data structure, degradation behavior, feature representation, evaluation metrics, and failure cases. This paper reports a human-supervised Agentic AI workflow for PHM pipeline development through a case study in the PHME 2026 Data Challenge, which focused on remaining useful life (RUL) prediction for a subway door servomotor system.

In the proposed workflow, a human researcher supervised the overall exploration, while AI-assisted roles supported planning, experiment specification, implementation, review, case-wise error analysis, and knowledge accumulation. Through this process, we developed a case-aware RUL prediction pipeline combining health-indicator-based estimation, operating-condition-aware correction, training-data-based calibration, temporal consistency checks, and submission-format verification. The final submission achieved an official challenge score of 0.9983.
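One of the listed pipeline stages, the temporal consistency check, can be illustrated with a minimal sketch. The abstract does not say how the check was implemented, so everything here is an assumption: the rule that predicted RUL for a single unit should not rise over time, the hypothetical function names `find_violations` and `enforce_non_increasing`, and the example values, which are not challenge data.

```python
from itertools import accumulate


def find_violations(rul_predictions, tolerance=0.0):
    """Return indices where predicted RUL rises by more than `tolerance`.

    Assumes predictions are ordered in time for a single unit.
    """
    return [
        i
        for i in range(1, len(rul_predictions))
        if rul_predictions[i] - rul_predictions[i - 1] > tolerance
    ]


def enforce_non_increasing(rul_predictions):
    """Clamp each prediction to the running minimum of the sequence.

    This is one simple (assumed) way to repair violations; it never
    raises a prediction, it only lowers values that exceed an earlier one.
    """
    return list(accumulate(rul_predictions, min))


# Illustrative values only, not taken from the challenge data.
raw = [100.0, 96.0, 98.0, 90.0, 91.0, 85.0]
violations = find_violations(raw)
repaired = enforce_non_increasing(raw)
print(violations)  # [2, 4]
print(repaired)    # [100.0, 96.0, 96.0, 90.0, 90.0, 85.0]
```

A running minimum is a deliberately conservative repair; flagging the violations for human review instead of clamping them would fit a supervised workflow equally well.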

This case study suggests that Agentic AI can help structure, accelerate, and document iterative machine learning research, while human supervision remains essential for selecting directions, managing risks, and interpreting results.

How to Cite

Minami, T., & Lee, J. (2026). Agentic AI for PHM Pipeline Development: A Human-Supervised Case Study in the PHME 2026 Data Challenge. PHM Society European Conference, 9(1), 1–8. https://doi.org/10.36001/phme.2026.v9i1.4990

Keywords

Agentic AI, RUL Prediction, PHM

Section
Data Challenge Papers