An essential requirement in any data analysis is a response variable that represents the aim of the analysis. Much academic work is based on laboratory or simulated data, where the experiment is controlled and the ground truth clearly defined. This is seldom the reality for equipment performance in an industrial environment, and issues with the response variable are common in industry settings. We discuss this matter using a case study in which the problem is to detect an asset event (failure) from available data for which no ground truth exists in historical records. Our data frame contains measurements from 14 sensors, recorded every minute by a process control system, and from 4 current motors on the asset of interest over a three-year period. In this situation, how to label the event of interest is of fundamental importance: different labelling strategies generate different models, with direct impact on the in-service fault detection efficacy of the resulting model. We discuss a data-driven approach to labelling a binary response variable (fault/anomaly detection) and compare it to a rule-based approach. Labelling of the time series was performed using dynamic time warping followed by agglomerative hierarchical clustering to group events with similar dynamics. Both data sets are significantly imbalanced, with 1,200,000 non-event observations but only 150 events in the rule-based data set and 64 events in the data-driven data set. We study the performance of models based on these two labelling strategies, treating each data set independently. We describe decisions made in window-size selection, managing imbalance, hyper-parameter tuning, and training and test selection, and use two models, logistic regression and random forest, for event detection. We estimate useful models for both data sets; by useful, we mean models that detect events in the first four months of the test set.
However, as the months progressed, the performance of both models deteriorated, with an increasing number of false positives reflecting possible changes in the dynamics of the system. This work raises questions such as ``what are we detecting?'' and ``is there a right way to label?'', and presents a data-driven approach to support labelling of historical events in process plant data for event detection in the absence of ground truth.
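The labelling pipeline described above (dynamic time warping distances followed by agglomerative hierarchical clustering of candidate event windows) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses a pure-NumPy DTW, SciPy average-linkage clustering, and toy event windows; the sensor features, window sizes, linkage choice, and cluster count used in the case study are not reproduced here.

```python
# Illustrative sketch only: toy event windows, naive O(nm) DTW, and
# SciPy agglomerative clustering stand in for the paper's actual pipeline.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D series."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Toy event windows of unequal length: two slow ramps and two sharp spikes.
events = [
    np.linspace(0, 1, 20),
    np.linspace(0, 1.1, 25),
    np.concatenate([np.zeros(10), np.ones(5), np.zeros(10)]),
    np.concatenate([np.zeros(12), np.ones(4), np.zeros(9)]),
]

# Pairwise DTW distance matrix -> condensed form -> average-linkage tree.
n = len(events)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = dtw_distance(events[i], events[j])

Z = linkage(squareform(dist), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # ramps and spikes fall into separate clusters
```

Because DTW aligns series of different lengths before comparing them, windows with similar event dynamics but different durations can still be grouped together, which is the property the labelling step relies on.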
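The severe imbalance noted above (over a million non-event observations against at most 150 events) has to be managed during model fitting. One common option, shown here as a hedged sketch on synthetic data rather than the case-study data, is class weighting in scikit-learn's logistic regression; the paper's own imbalance-handling, feature engineering, and tuning choices are richer than this.

```python
# Illustrative sketch only: synthetic 4-feature data with a rare positive
# class, fitted with class-weighted logistic regression from scikit-learn.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_neg, n_pos = 5000, 25          # heavy imbalance, echoing the case study
X = np.vstack([rng.normal(0.0, 1.0, (n_neg, 4)),
               rng.normal(2.0, 1.0, (n_pos, 4))])
y = np.concatenate([np.zeros(n_neg), np.ones(n_pos)])

# class_weight="balanced" reweights errors inversely to class frequency,
# so the rare event class is not swamped by the majority class.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
recall = clf.predict(X[y == 1]).mean()
print(round(recall, 2))
```

Without the weighting, a classifier on data this imbalanced can score high accuracy while missing nearly every event, which is why event recall rather than raw accuracy is the quantity to watch.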
time-series clustering, imbalanced data, process plant data, event detection, labelling strategies, absence of ground truth data