Comparison of binary classifiers for data-driven prognosis of jet engines health
A reliable prognosis is crucial to managing asset health and predicting the maintenance needs of large civil jet engines, which in turn contributes to enhanced aircraft airworthiness, longer time on wing and optimized lifecycle costs. With the accumulation of large amounts of data over the last decade, one can relate the number of components serviced during a maintenance visit to the history of parameters measured inside and outside the engine (temperatures, pressures, shaft rotation speeds, vibration levels, etc.). While established statistical models were developed for small samples, more recent computer-intensive statistical techniques from the field of Machine Learning (ML) can handle more complex datasets. In particular, binary classifiers are an attractive option for predicting the probability that the components of a given jet engine will be serviced at the next maintenance visit. This paper demonstrates the validity of such data-driven methods on an industrial case study involving failures of thousands of compressor blades in aeronautical turbomachines. The prediction accuracy obtained with the ML techniques is a significant improvement over the state of the art. Moreover, the performance of six binary classifiers with different characteristics - logistic regression, support vector machines, classification trees, random forests, gradient boosted trees and neural networks - was compared according to four qualitative and quantitative criteria. Results show that there is no clear winner, although ensemble models based on trees (random forests and boosted trees) offer a good overall compromise while neural networks offer the best absolute performance. In the industrial world, the business objectives, the environment in which the models are deployed and the users’ skills should dictate the choice of the most appropriate statistical technique.
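As an illustration of how probability outputs from several binary classifiers might be compared, the sketch below computes two common quantitative measures: the area under the ROC curve (AUC) and accuracy at a fixed threshold. The abstract does not name the four criteria used in the paper, so the choice of AUC and accuracy here is an assumption. The labels, the predicted probabilities, and the names `roc_auc`, `accuracy`, `model_a` and `model_b` are all hypothetical and chosen only for the example.

```python
from itertools import product


def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney pairwise formulation.

    Equals the probability that a randomly chosen positive example gets a
    higher score than a randomly chosen negative one (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples classified correctly at a fixed probability threshold."""
    hits = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return hits / len(labels)


# Toy, made-up data: 1 = component serviced at the next visit, 0 = not serviced.
labels = [0, 0, 0, 0, 1, 1, 1, 0]

# Hypothetical predicted probabilities from two hypothetical classifiers.
predictions = {
    "model_a": [0.10, 0.20, 0.30, 0.60, 0.70, 0.80, 0.40, 0.20],
    "model_b": [0.30, 0.40, 0.45, 0.48, 0.49, 0.90, 0.95, 0.10],
}

results = {
    name: {"auc": roc_auc(labels, scores), "accuracy": accuracy(labels, scores)}
    for name, scores in predictions.items()
}

for name, metrics in results.items():
    print(f"{name}: AUC={metrics['auc']:.3f}, accuracy={metrics['accuracy']:.3f}")
```

On this toy data the second model ranks every serviced case above every non-serviced one, yet it still misclassifies one case at the 0.5 threshold. Ranking quality and thresholded accuracy can therefore disagree, which is one reason a comparison may rely on more than a single criterion.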
Keywords: predictive maintenance, data-driven prognosis, jet engine health, binary classifier
The Prognostics and Health Management Society advocates open access to scientific data and uses a Creative Commons license for publishing and distributing papers. A Creative Commons license does not relinquish the author’s copyright; rather, it allows authors to share some of their rights with any member of the public under certain conditions whilst enjoying full legal protection. By submitting an article to the International Conference of the Prognostics and Health Management Society, the authors agree to be bound by the associated terms and conditions including the following:
As the author, you retain the copyright to your Work. By submitting your Work, you are granting anybody the right to copy, distribute and transmit your Work and to adapt your Work with proper attribution under the terms of the Creative Commons Attribution 3.0 United States license. You assign rights to the Prognostics and Health Management Society to publish and disseminate your Work through electronic and print media if it is accepted for publication. A license note citing the Creative Commons Attribution 3.0 United States License as shown below needs to be placed in the footnote on the first page of the article.
First Author et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.