Data-Driven Prognostics and Diagnostics of Industrial Machinery — A Turbofan Engine Case Study



Published Sep 4, 2023
Russell Graves, Peeyush Pankaj, Vineet J Kuruvilla, Rachel Johnson, Michio Inoue


A machine’s Remaining Useful Life (RUL) is the expected life or usage time remaining before the machine requires repair or replacement. Data-driven methods typically estimate RUL with models trained on health condition indicators derived from measured system data. A significant challenge in developing an RUL estimation model is transforming large, multivariate, noisy sensor datasets into formats that keep the data analysis and processing pipeline efficient and allow valuable condition indicators to be extracted. This work uses the N-CMAPSS dataset to explore options and implications for efficiently organizing and storing large time-series datasets to support prognostics and diagnostics applications. We extend the work to demonstrate a predictive maintenance workflow and solution that (1) detects and classifies faults in a turbofan engine and (2) estimates the RUL once performance degradation is detected.
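As a small illustration of the RUL definition above, the sketch below builds per-cycle RUL targets for a single run-to-failure unit. It assumes that life is counted in whole cycles and that the last recorded cycle has an RUL of zero. The function name `rul_labels` and the optional `cap` parameter are hypothetical and are not taken from this work.

```python
def rul_labels(n_cycles, cap=None):
    """Return one RUL target per cycle for a single run-to-failure unit.

    Assumption: the unit fails at its last recorded cycle, so the RUL
    at cycle index t (0-based) is n_cycles - 1 - t.
    """
    if n_cycles <= 0:
        raise ValueError("n_cycles must be positive")
    labels = [n_cycles - 1 - t for t in range(n_cycles)]
    if cap is not None:
        # Optional clipping of early-life targets (a common convention,
        # not something stated in this abstract).
        labels = [min(r, cap) for r in labels]
    return labels
```

For a unit observed for five cycles, `rul_labels(5)` counts down from 4 to 0. With `cap=2` the early-life targets are held at 2 until the true remaining life drops below that value.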

In the data engineering stage, we investigate the impact of various file formats and file types on memory use and execution time when dealing with large datasets like N-CMAPSS. We analyze, pre-process, and engineer critical features from the transformed dataset by leveraging our understanding of gas turbine operation (e.g., the Brayton cycle). We also analyze the performance of various engine submodules in different flight phases (climb, cruise, and descent), and we explain an approach to down-sample the time-series data without losing information relevant to our goals. Using the health condition indicators derived and synthesized in the data engineering stage, we train machine learning models for diagnostics (differentiating between healthy operation and seven different types of faults in the turbofan engine) and prognostics (RUL estimation).
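The abstract does not describe the down-sampling approach itself. As a generic illustration only, the sketch below reduces a sensor trace by averaging consecutive blocks of samples, which is one simple way to shrink a time series while keeping its slow trends. The function name `downsample_mean` and the `factor` parameter are hypothetical, and block averaging is a stand-in technique, not necessarily the one used in this work.

```python
from statistics import fmean


def downsample_mean(samples, factor):
    """Down-sample a 1-D sequence by averaging consecutive blocks.

    Each output value is the mean of up to `factor` consecutive input
    samples; a shorter final block is averaged over the samples it has.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [
        fmean(samples[i:i + factor])
        for i in range(0, len(samples), factor)
    ]
```

Averaging suppresses high-frequency noise but also smooths short transients, so whether it preserves the information relevant to a given diagnostic or prognostic goal has to be checked against that goal.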




RUL estimation, N-CMAPSS, diagnostics

Special Session Papers