A Novel Ensemble Clustering for Operational Transients Classification with Application to a Nuclear Power Plant Turbine



Published Nov 3, 2020
Sameer Al-Dahidi, Francesco Di Maio, Piero Baraldi, Enrico Zio, Redouane Seraoui


The objective of the present work is to develop a novel approach for combining in an ensemble multiple base clusterings of operational transients of industrial equipment, when the number of clusters in the final consensus clustering is unknown. A measure of pairwise similarity is used to quantify the co-association matrix that describes the similarity among the different base clusterings. Then, a Spectral Clustering technique from the literature, embedding the unsupervised K-Means algorithm, is applied to the co-association matrix to find the optimum number of clusters of the final consensus clustering, based on the Silhouette validity index. The proposed approach is developed with reference to an artificial case study, properly designed to mimic the signal trend behavior of a Nuclear Power Plant (NPP) turbine during shut-down. The results of the artificial case have been compared with those achieved by a state-of-the-art approach, known as Cluster-based Similarity Partitioning and Serial Graph Partitioning and Fill-reducing Matrix Ordering Algorithms (CSPA-METIS). The comparison shows that the proposed approach is able to identify a final consensus clustering that classifies the transients with better accuracy and robustness compared to the CSPA-METIS approach. The approach is then validated on an industrial case concerning 149 shut-down transients of an NPP turbine.
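The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy base clusterings, the function names, the normalized-affinity form of spectral clustering, and the choice to evaluate the Silhouette index on the dissimilarity 1 − co-association are all assumptions made here for illustration.

```python
import numpy as np


def co_association(base_labels):
    """Fraction of base clusterings in which each pair of objects shares a cluster."""
    base_labels = np.asarray(base_labels)                      # shape (H, N)
    same = base_labels[:, :, None] == base_labels[:, None, :]  # shape (H, N, N)
    return same.mean(axis=0)


def spectral_embedding(S, k):
    """Row-normalized top-k eigenvectors of D^-1/2 S D^-1/2 (assumed variant)."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    L = S * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    U = vecs[:, -k:]
    return U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)


def kmeans(X, k, rng, n_init=20, n_iter=100):
    """Plain K-Means with random restarts; keeps the lowest-inertia run."""
    best_labels, best_inertia = None, np.inf
    for _ in range(n_init):
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = sq_dist.argmin(axis=1)
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels, inertia = sq_dist.argmin(axis=1), sq_dist.min(axis=1).sum()
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels


def silhouette_from_distances(D, labels):
    """Mean Silhouette value for a precomputed distance matrix D."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return -1.0
    scores = np.zeros(len(labels))
    for i in range(len(labels)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton clusters get a Silhouette value of 0
        a = D[i, own].sum() / (own.sum() - 1)
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return float(scores.mean())


def consensus_clustering(base_labels, k_candidates, seed=0):
    """Pick the candidate number of clusters with the highest Silhouette value."""
    S = co_association(base_labels)
    D = 1.0 - S  # assumption: dissimilarity derived from the co-association
    rng = np.random.default_rng(seed)
    best_k, best_score, best_labels = None, -np.inf, None
    for k in k_candidates:
        labels = kmeans(spectral_embedding(S, k), k, rng)
        score = silhouette_from_distances(D, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_score, best_labels


# Hypothetical toy ensemble: 5 base clusterings of 9 transients (3 underlying groups).
base_clusterings = np.array([
    [0, 0, 0, 1, 1, 1, 2, 2, 2],
    [2, 2, 2, 0, 0, 0, 1, 1, 1],   # same partition, different label names
    [0, 0, 1, 1, 1, 1, 2, 2, 2],   # transient 2 assigned to the second group
    [0, 0, 0, 1, 1, 2, 2, 2, 2],   # transient 5 assigned to the third group
    [0, 0, 0, 1, 1, 1, 2, 2, 3],   # a base clustering with 4 clusters
])
best_k, best_score, consensus = consensus_clustering(base_clusterings, range(2, 6))
print(best_k, round(best_score, 3), consensus)
```

Because the co-association matrix only records how often two transients share a cluster, the base clusterings may use different label names and different numbers of clusters, as in the toy ensemble above.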




Unsupervised Learning, Ensemble Clustering, Final Consensus Clustering, Spectral Clustering, Operational Transients, Nuclear Power Plant (NPP) turbine shut-down

Technical Papers