This work presents a novel data-centric solution for fault diagnostics and failure prognostics: a data-augmentation method well suited to non-stationary multivariate time-series data. The method, based on time-varying autoregressive processes, extracts key information from a limited number of samples and generates new artificial samples in a way that benefits the development of diagnostics and prognostics solutions. The proposed approach is evaluated on three real-world datasets associated with failure diagnostics problems, using two types of machine learning methods. The results indicate that the proposed method improves performance in all tested cases.
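As a rough illustration of the general idea of augmentation with a time-varying autoregressive (TVAR) model, the sketch below fits a univariate TVAR model by least squares and then drives the fitted model with fresh noise to produce an artificial sample. It is not the authors' implementation: the polynomial basis for the coefficients, the univariate setting, the Gaussian innovations, and the names `fit_tvar` and `generate` are all assumptions made for this example.

```python
import numpy as np

def fit_tvar(x, order=2, n_basis=3):
    """Least-squares fit of a TVAR(order) model.

    Assumption: each coefficient a_i(k) is a polynomial in normalized
    time, a_i(k) = sum_j theta[i, j] * t_k**j.
    """
    n = len(x)
    t = np.linspace(0.0, 1.0, n)
    basis = np.vander(t, n_basis, increasing=True)  # columns: 1, t, t^2, ...
    rows = []
    for k in range(order, n):
        lags = x[k - order:k][::-1]  # x[k-1], ..., x[k-order]
        # Each lagged sample is weighted by each basis function.
        rows.append(np.outer(lags, basis[k]).ravel())
    A = np.asarray(rows)
    y = x[order:]
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ theta
    # Time-varying coefficients, shape (n, order).
    coefs = basis @ theta.reshape(order, n_basis).T
    return coefs, resid.std()

def generate(x, coefs, sigma, rng):
    """Create a new sample by driving the fitted model with fresh noise."""
    order = coefs.shape[1]
    new = np.empty(len(x), dtype=float)
    new[:order] = x[:order]  # reuse the original initial conditions
    for k in range(order, len(x)):
        lags = new[k - order:k][::-1]
        new[k] = coefs[k] @ lags + sigma * rng.standard_normal()
    return new

# Toy non-stationary signal (hypothetical data): an AR(2) resonance
# whose frequency drifts over time.
rng = np.random.default_rng(0)
n = 500
w = np.linspace(0.2, 0.6, n)
x = np.zeros(n)
for k in range(2, n):
    x[k] = 2 * 0.9 * np.cos(w[k]) * x[k - 1] - 0.81 * x[k - 2] + rng.standard_normal()

coefs, sigma = fit_tvar(x, order=2, n_basis=3)
synthetic = generate(x, coefs, sigma, rng)
```

Calling `generate` repeatedly with the same fitted coefficients yields further artificial samples that follow the same time-varying dynamics but differ in their noise realization. A multivariate version would need vector-valued lags and coefficient matrices, which this sketch does not attempt.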
data augmentation, data-centric, non-stationary, time-varying, autoregressive
This work is licensed under a Creative Commons Attribution 3.0 Unported License.