Unsupervised Probabilistic Anomaly Detection Over Nominal Subsystem Events Through a Hierarchical Variational Autoencoder
Abstract
This work develops a versatile approach to discovering anomalies in operational data for nominal (i.e., non-parametric) subsystem event signals using unsupervised Deep Learning techniques. First, it builds a convolutional neural framework to extract both intra-subsystem and inter-subsystem patterns, by applying banks of voxel filters to the charted data. Second, it generalizes the regularity embedded in the learned Variational Autoencoder manifold by merging latent-space-overlapping deviations with non-overlapping synthetic irregularities. Contingencies such as novel data and model drift are therefore seamlessly managed by the proposed data-augmented approach. Finally, it builds a smooth probabilistic diagnosis function on the resulting low-dimensional distributed representation. The enhanced solution provides analytically robust tools for a critical industrial environment. It also facilitates hierarchical integration and offers visually interpretable insights into the degraded-condition hazard, increasing confidence in its predictions. This strategy has been validated on eight pairwise-interrelated subsystems from high-speed trains. Its outcome also enables more reliable explainability from a causal perspective.
Keywords: unsupervised anomaly detection, variational autoencoder