DAGGER: Data AuGmentation GEneRative Framework for Time-Series Data in Data-Driven Smart Manufacturing Systems

Nicholas Hemleben; Daniel Ospina-Acero; David Blank; Andrew VanFossen; Frank Zahiri; Mrinal Kumar

doi:10.36001/phmconf.2023.v15i1.3483

Published Oct 26, 2023

Abstract

As industries transition to the Industry 4.0 paradigm, interest in concepts such as the Digital Twin
(DT) is at an all-time high. DTs offer industries a direct avenue to more accurate predictions,
rational decisions, and informed plans, ultimately reducing costs and increasing performance and
productivity. Adequate operation of DTs in smart manufacturing relies on an evolving data set
describing the real-life object or process, and on a means of dynamically updating the computational
model to better conform to the data. This reliance on data becomes more explicit when physics-based
computational models are unavailable or difficult to obtain in practice, as is the case in most modern
manufacturing scenarios. For data-based surrogate models to "adequately" represent the underlying
physics, the number of training data points must keep pace with the number of degrees of freedom in
the model, which can be on the order of thousands. However, in niche industrial scenarios such as
manufacturing applications, data availability is limited (on the order of a few hundred data points,
at best), mainly because some of the relevant quantities, e.g., the level of wear of a tool, typically
must be measured manually. In other words, notwithstanding the popular notion of big data, there is
still a stark shortage of ground-truth data when examining, for instance, a complex system's path to
failure. In this work we present a framework to alleviate this problem with modern machine learning
tools, and we show a robust, efficient, and reliable pathway to augment the data available for
training data-based computational models.

Small sample size is a key limitation on machine learning performance, in particular with very
high-dimensional data. Current efforts in synthetic data generation typically involve either
Generative Adversarial Networks (GANs) or Variational AutoEncoders (VAEs). These, however, are
tightly tied to image processing and synthesis, and are generally not suited to generating sensor
data, which is the type of data that manufacturing applications produce. Additionally, GAN models are
susceptible to mode collapse, training instability, and high computational cost when used to create
high-dimensional data. Alternatively, the encoding of VAEs greatly reduces the dimensional complexity
of the data and can effectively regularize the latent space, but often produces synthetic samples that
represent the data poorly. Our proposed method thus incorporates the learned latent space of an
AutoEncoder (AE) architecture into the training of the generator network of a GAN. The advantages of
such a scheme are twofold: (i) the latent space representation created by the AE reduces the
complexity of the distribution the generator must learn, allowing for quicker discriminator
convergence, and (ii) the structure in the sensor data is better captured in the transition from the
original space to the latent space. Using time statistics (up to the fifth moment), ARIMA
coefficients, and Fourier series coefficients, we compare the synthetic data from our proposed AE+GAN
model with the original sensor data. We also show that the performance of our proposed method is at
least comparable with that of the Riemannian Hamiltonian VAE, a recently published data augmentation
framework specifically designed to handle very small, high-dimensional data sets.
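
The comparison described in the abstract (moments up to the fifth order and Fourier coefficients of synthetic versus original signals) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy sine-plus-noise signals, the choice of standardized central moments, and the 2/N scaling of the Fourier magnitudes are all assumptions made for illustration, and the ARIMA comparison is omitted.

```python
import cmath
import math
import random


def standardized_moments(x, max_order=5):
    """Return [mean, variance, standardized central moments of order 3..max_order]."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        raise ValueError("constant signal: standardized moments are undefined")
    std = math.sqrt(var)
    out = [mean, var]
    for k in range(3, max_order + 1):
        out.append(sum(((v - mean) / std) ** k for v in x) / n)
    return out


def fourier_magnitudes(x, n_coeffs=4):
    """Magnitudes of DFT coefficients k = 1..n_coeffs, scaled by 2/N."""
    n = len(x)
    mags = []
    for k in range(1, n_coeffs + 1):
        c = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x))
        mags.append(2 * abs(c) / n)
    return mags


def max_abs_gap(a, b):
    """Largest absolute difference between two equally long feature lists."""
    return max(abs(p - q) for p, q in zip(a, b))


# Hypothetical stand-ins for an original sensor trace and a generated one.
rng = random.Random(0)
n = 256
real = [math.sin(2 * math.pi * 3 * t / n) + 0.1 * rng.gauss(0, 1) for t in range(n)]
synthetic = [math.sin(2 * math.pi * 3 * t / n) + 0.1 * rng.gauss(0, 1) for t in range(n)]

moment_gap = max_abs_gap(standardized_moments(real), standardized_moments(synthetic))
fourier_gap = max_abs_gap(fourier_magnitudes(real), fourier_magnitudes(synthetic))
print("largest moment gap:", moment_gap)
print("largest Fourier-magnitude gap:", fourier_gap)
```

Small gaps on such summary features suggest, but do not prove, that the synthetic signal follows the same distribution as the original; in practice the features would be computed over many signals and compared as distributions rather than as single values.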

How to Cite

Hemleben, N., Ospina-Acero, D., Blank, D., VanFossen, A., Zahiri, F., & Kumar, M. (2023). DAGGER: Data AuGmentation GEneRative Framework for Time-Series Data in Data-Driven Smart Manufacturing Systems. Annual Conference of the PHM Society, 15(1). https://doi.org/10.36001/phmconf.2023.v15i1.3483

Keywords

Synthetic data, Generative modeling, Generative adversarial networks, Variational autoencoders, Digital Twins

Issue

Vol. 15 No. 1 (2023): Proceedings of the Annual Conference of the PHM Society 2023

Section

Technical Research Papers

This work is licensed under a Creative Commons Attribution 3.0 Unported License.
