Generic Hybrid Models for Prognostics of Complex Systems

Hybrid models combining physical knowledge and machine learning show promise for obtaining accurate and robust prognostic models. However, despite the increased interest in hybrid models in recent years, the proposed solutions tend to be domain-specific. As a result, there is no compelling strategy for deciding what, where, and how physics-derived knowledge can be integrated into deep learning models, depending on the available representation of physical knowledge and the quality of data, for the development of prognostic models for complex systems. This Ph.D. project aims to develop a general strategy for hybridizing prognostic models by exploring multiple methods to incorporate physical knowledge at various stages of the learning algorithm. The project will prioritize expert knowledge as the primary source of information, while domain-specific knowledge will serve as an additional feature when applicable.


PROBLEM STATEMENT
Efficient maintenance of complex systems, such as aircraft or power plants, is critical for preventing failures and ensuring optimal operability. Hence, traditional fixed-time interval maintenance strategies have been replaced with condition-based or predictive maintenance strategies, which rely on advanced monitoring technologies and Prognostics and Health Management (PHM) methods to estimate the system's current and future health states.
On the one hand, PHM has made significant progress using model-based, i.e., physics-based, approaches for system health inference (Daigle & Goebel, 2013). However, these approaches have limitations, as physical degradation processes are only well understood for simple components, which has hindered their practical application. On the other hand, deep learning-based solutions have shown promise in several PHM applications, as they can automatically infer the dynamics of a system without prior knowledge (Fink et al., 2020). However, the accuracy and generalization capacity of deep learning solutions can be compromised because their predictions may violate physical constraints. Moreover, deep learning models are generally not robust to situations of limited data availability and have a black-box nature, which makes their predictions not easily interpretable. This limits their practical application in safety-critical domains, where a clear understanding of the model's reasoning and decision-making is required.
Kristupas Bajarunas et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
To address these challenges, hybrid models combining physical and data-based models have emerged as promising techniques for obtaining accurate, robust, and interpretable models for PHM. While hybrid models have been applied to solve a variety of tasks concerning complex systems, research in the field of prognostics is still limited.
The widespread adoption of hybrid models for prognostics in the industry is impeded by the fact that such models are often developed for specific systems, assuming specific physics representations available during model development. As a result, there is no clear guidance on how to transfer hybrid methodologies developed for different systems. Moreover, prior works do not generally evaluate the robustness of the solution to changes in data quality or the fidelity of physics, which can significantly impact the model's performance. Therefore, for the industry's wide adoption of hybrid prognostic models, a compelling strategy is needed on what, where, and how physics-derived knowledge can be integrated into deep learning models, depending on the fidelity of physical knowledge and data available for the development of prognostic models. Moreover, a general strategy must be adaptable to a diverse range of complex systems, where the condition monitoring data quality and prior knowledge of physics may vary in fidelity and representation.

EXPECTED CONTRIBUTIONS
The Ph.D. will focus on answering the following research question: How to develop a robust hybrid framework for prognostics applicable to multiple complex systems under varying data quality and fidelity of physics? We will investigate a wide range of complex systems, such as turbofan engines, batteries, and bearings. To achieve this goal, we put forth a two-fold hypothesis.

1) Integration of knowledge at multiple stages of the learning algorithm will improve hybrid model performance
The implementation of hybrid solutions for prognostics varies due to the specific characteristics of complex systems and the diverse technical knowledge available to different PHM stakeholders. Models differ based on what, where, and how physics-derived knowledge is integrated into deep learning models. Figure 1 presents a hybrid method classification table designed specifically for prognostics.
While for a specific system, the source and representation of knowledge are often considered fixed, the integration of knowledge into learning algorithms can be diverse. The research community has adopted three hybridization strategies for incorporating prior knowledge into learning algorithms for PHM (Rueden et al., 2019). Observational bias is used to augment the input data or input features to reflect the underlying physics (Arias Chao et al., 2022). Inductive bias alters the architecture of a learning algorithm so as to explicitly guarantee that the model predictions comply with given physical knowledge (Nascimento et al., 2021). Learning bias modifies the learning process of an algorithm in order to let it converge to a solution manifold that is consistent with the underlying physics (Cofre-Martel et al., 2021).
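The three strategies above differ mainly in where the knowledge enters the pipeline. The following minimal sketch illustrates one possible instance of each in plain Python. All function names, the softplus-based monotone output, and the penalty weight are illustrative assumptions and are not taken from the cited works.

```python
import math


def add_physics_feature(sensors, physics_model):
    """Observational bias: append a physics-derived quantity to the raw inputs.

    `physics_model` is a hypothetical callable standing in for any physics
    calculation (e.g. a simplified system model evaluated on the sensors).
    """
    return list(sensors) + [physics_model(sensors)]


def monotone_health_index(raw_outputs, initial_health=1.0):
    """Inductive bias: the output structure guarantees a decreasing health index.

    Each raw output is mapped to a positive decrement with softplus and the
    decrements are accumulated, so the trajectory cannot increase by design.
    """
    health, trajectory = initial_health, []
    for z in raw_outputs:
        health -= math.log1p(math.exp(z))  # softplus(z) > 0
        trajectory.append(health)
    return trajectory


def physics_penalised_loss(predictions, targets, physics_residuals, weight=0.1):
    """Learning bias: data-fit loss plus a penalty on violated physical knowledge.

    `physics_residuals` holds how far each prediction is from satisfying a
    known physical relation (zero means the relation is satisfied).
    """
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)
    penalty = sum(r ** 2 for r in physics_residuals) / len(physics_residuals)
    return mse + weight * penalty
```

In this sketch, only the inductive variant enforces the constraint exactly. The learning-bias variant merely discourages violations, with the strength set by `weight`.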
Previous research in prognostics primarily focuses on using a single method to integrate prior knowledge into learning algorithms. However, studies outside of prognostics have shown that integrating multiple biases at various stages of the learning algorithm can be advantageous. For instance, inductive and learning biases were utilized to predict the power generation of multiple wind turbines (Park & Park, 2019). Incorporating multiple biases into the learning algorithm allows for additional knowledge to be used, which can improve performance.
2) Hybrid models should rely on expert knowledge since it is available more often than system-specific scientific knowledge
To implement a hybrid PHM approach, it is necessary to draw on additional prior knowledge beyond the available condition monitoring data. This prior knowledge can come from either scientific or expert sources. Scientific knowledge is typically characterized by its formal nature, while expert knowledge is less formal and is based on the general knowledge and experience of those working in the PHM field.
However, current hybridization methods tend to disregard valuable expert knowledge available to maintenance stakeholders. For example, experts may have knowledge about the degradation problem structure, e.g., the dependency of the failure time on current health and future operative conditions. They may also have non-formalized knowledge about correlations between sensor data or knowledge about the shape of the degradation curve, such as monotonicity. This expert knowledge is rarely integrated into the current hybridization methods along with scientific knowledge (Kim, Choi, & Kim, 2022). As a result, current hybridization methods may fail to generalize across multiple systems, especially when system-specific scientific knowledge is not equivalent in representation and fidelity. To address this, a general hybridization framework should primarily rely on expert knowledge. By incorporating expert knowledge alongside available scientific knowledge, we aim to develop a more robust and effective hybrid method that can be applied across a wide range of complex systems.

RESEARCH PLAN
The research plan consists of four parts that aid in answering the main research question through sub-questions.

Comparative Analysis
First, we would like to answer the question: Which hybridization method is most robust to the fidelity of physics knowledge considered and data availability? To answer the question, we propose to perform a comparative analysis to evaluate current methods integrating physics into deep learning algorithms for prognostics, considering observational, inductive, and learning biases. Firstly, we will identify case studies where hybrid prognostic methods have been proposed and, for each case study, determine the type of knowledge integrated, the representation of knowledge, and where the knowledge is integrated into the machine learning pipeline. As a result, we will provide a clear survey of the current hybridization techniques in the context of PHM. Secondly, we will evaluate the strengths and weaknesses of each method regarding their ability to withstand changes in data quality and fidelity of physics. We will alter data quality by truncating end-of-life data, removing specific fault modes and operative conditions, and generating out-of-distribution scenarios. We will implement scenarios with varying physics fidelity by adding noise and bias to the physics knowledge and observing the resulting impact on prognostic performance.
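The perturbations described above could be scripted along the following lines. This is a minimal sketch under stated assumptions: the data layout (a trajectory as a list, a unit as a dictionary with a hypothetical `fault_mode` key), the function names, and the choice of additive Gaussian noise are all illustrative rather than the project's actual protocol.

```python
import random


def truncate_end_of_life(trajectory, keep_fraction):
    """Data quality: keep only the first part of a run-to-failure trajectory."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, int(len(trajectory) * keep_fraction))
    return trajectory[:n_keep]


def remove_fault_modes(units, excluded_modes):
    """Data quality: drop every unit whose fault mode is in `excluded_modes`."""
    return [unit for unit in units if unit["fault_mode"] not in excluded_modes]


def degrade_physics(values, noise_std=0.0, bias=0.0, seed=0):
    """Physics fidelity: add a constant bias and Gaussian noise to physics outputs.

    A fixed seed keeps each scenario reproducible across compared methods.
    """
    rng = random.Random(seed)
    return [v + bias + rng.gauss(0.0, noise_std) for v in values]
```

Sweeping `keep_fraction`, `noise_std`, and `bias` over a grid would give one scenario per setting, so that each hybridization method can be scored on identical perturbed inputs.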

Unsupervised HI discovery
Next, we investigate the questions: How can expert knowledge be used to discover the degradation of a system, and how can information about degradation be used to improve prognostic performance? To address these questions, we focus on the problem of unsupervised health index (HI) inference from sensor readings under realistic scenarios. Previous research has demonstrated the benefits of accurately determining the HI, which can lead to better performance of prognostic models (Lövberg, 2021). However, the existing methodologies for determining HI in complex systems are mostly semi-supervised and rely on assumptions that may not hold in real-world scenarios. In particular, the existing methods usually involve using a reference set of healthy sensor readings or run-to-failure data to infer HI. Moreover, most of the existing unsupervised methods only consider scenarios where the sensor readings are clearly dominated by degradation, which is of limited applicability. We hypothesize that reliable unsupervised HI inference can be achieved in scenarios where the operating conditions largely mask the effect of degradation by relying on expert knowledge about degradation and integrating this knowledge as learning and inductive bias.

Controllable physics-informed data generation
We also seek to answer the question: How does incorporating observational bias in the form of synthetic data improve prognostic accuracy in situations of limited failures or truncated data? In several industrial applications, current hybrid methods for prognostics are generally insufficient to compensate for the lack of representative condition monitoring data. Often, the collected data does not represent all possible fault modes and operative conditions and is truncated before failure. To overcome this challenge, we aim to develop a controllable, physics-informed data generation process that improves prognostic performance. To achieve this objective, we build on the unsupervised HI discovery and focus on incorporating observational bias in the form of synthetic data into the prognostics model.

General Hybrid framework for PHM
Finally, we aim to extend the hybrid framework by adding further expert knowledge about the system in combination with system-specific knowledge to develop a robust hybrid framework applicable to various complex systems. The hybrid framework will primarily rely on expert knowledge and will have the option to integrate system-specific knowledge when it is available. For instance, we hypothesize that the topology of system components is widely available and can be represented by a graph. If more detailed interactions between components are known, then detailed physics can be embedded into the nodes or edges of the graph. Hence, graph neural networks will be used as the learning algorithm. In addition to inductive bias, we will investigate the use of additional knowledge by modifying the objective function to reflect correlations between sensor data and the degradation curve's shape. Ultimately, the hybrid framework will incorporate all three types of bias, with its structure, inputs, and objective function each influenced by prior knowledge.
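To make the idea of encoding component topology as a graph concrete, the sketch below performs one round of mean-aggregation message passing over an undirected component graph. It is a simplified stand-in for a graph neural network layer, with no learnable weights. The component names, the feature values, and the equal mixing of own and neighbour features are hypothetical choices for illustration only.

```python
def message_passing_step(node_features, edges):
    """Mix each component's features with the mean of its neighbours' features.

    node_features: dict mapping component name -> list of floats
    edges: iterable of (component_a, component_b) pairs, treated as undirected
    """
    neighbours = {node: [] for node in node_features}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    updated = {}
    for node, features in node_features.items():
        linked = neighbours[node]
        if linked:
            # Mean of the neighbours' features, computed per feature dimension.
            aggregated = [
                sum(node_features[m][k] for m in linked) / len(linked)
                for k in range(len(features))
            ]
        else:
            aggregated = [0.0] * len(features)
        # Equal mix of own and neighbour information (an arbitrary choice here).
        updated[node] = [0.5 * (x + a) for x, a in zip(features, aggregated)]
    return updated


# Hypothetical three-component chain: fan - compressor - turbine.
features = {"fan": [1.0], "compressor": [3.0], "turbine": [5.0]}
topology = [("fan", "compressor"), ("compressor", "turbine")]
print(message_passing_step(features, topology))
```

The graph structure restricts which components can influence each other in a single step, which is the sense in which topology acts as an inductive bias. Richer physics could, in principle, replace the plain mean with edge-specific functions.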

Physics-informed AE for health index discovery
We demonstrate the possibility of integrating multiple sources of expert knowledge into a learning algorithm by developing an unsupervised hybrid model for HI discovery. We first introduce a new graphical representation that illustrates the relationship between sensor readings, operating conditions, and degradation in a typical system (Figure 2). We demonstrate how this representation can inform the design of an autoencoder's architecture for the purpose of HI discovery (Figure 3). Finally, we incorporate an extra soft constraint based on expert knowledge of the degradation process to guide the autoencoder to uncover the degradation in its bottleneck layer.
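One way such a soft constraint could look is a monotonicity penalty added to the reconstruction loss. The sketch below is an assumption made for illustration: the text does not specify the form of the constraint, and the convention that the health index decreases as degradation progresses, the squared-hinge penalty, and the weight are all hypothetical.

```python
def monotonicity_penalty(health_index):
    """Mean squared size of the upward steps in a health index sequence.

    Assumes health should not increase over time, so only increases between
    consecutive time steps are penalised.
    """
    steps = [later - earlier for earlier, later in zip(health_index, health_index[1:])]
    if not steps:
        return 0.0
    return sum(max(0.0, step) ** 2 for step in steps) / len(steps)


def autoencoder_loss(inputs, reconstructions, health_index, weight=1.0):
    """Reconstruction error plus the weighted monotonicity penalty.

    inputs, reconstructions: lists of equally sized sensor vectors
    health_index: bottleneck values for one unit, ordered in time
    """
    squared_errors = [
        (x - y) ** 2
        for row_in, row_out in zip(inputs, reconstructions)
        for x, y in zip(row_in, row_out)
    ]
    reconstruction = sum(squared_errors) / len(squared_errors)
    return reconstruction + weight * monotonicity_penalty(health_index)
```

Because the constraint is soft, the model can still produce short non-monotone segments when the data strongly supports them. The weight controls how costly that is.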

Figure 2. Graphical representation of degradation
We evaluate the effectiveness of our approach on realistic data scenarios commonly seen in the industry. These scenarios include uncertain amounts of healthy data during training, significant variation in the initial health state of each unit, different distributions of operating conditions, and situations where most of the data is healthy. We compare our proposed approach with the state-of-the-art method, the residual approach, that models the normal healthy behavior of the system and discovers HI by calculating the reconstruction error of predictions. The residual approach usually requires health state labels to select appropriate data for training or makes assumptions about the system's health state.
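The residual approach can be illustrated with a deliberately simple stand-in. The sketch below fits a straight line from one operating condition to one sensor on data assumed to be healthy, then uses the absolute prediction error as the health index. The linear model, the single-sensor setting, and the function names are simplifying assumptions. Practical implementations typically use a neural network as the healthy-behavior model.

```python
def fit_healthy_model(conditions, sensors):
    """Least-squares line sensor = slope * condition + intercept.

    Both lists must come from observations assumed to be healthy, which is
    exactly the labelling assumption the residual approach depends on.
    """
    n = len(conditions)
    mean_c = sum(conditions) / n
    mean_s = sum(sensors) / n
    variance = sum((c - mean_c) ** 2 for c in conditions)
    covariance = sum((c - mean_c) * (s - mean_s) for c, s in zip(conditions, sensors))
    slope = covariance / variance
    return slope, mean_s - slope * mean_c


def residual_health_index(model, conditions, sensors):
    """Absolute deviation of each reading from the healthy-behavior prediction."""
    slope, intercept = model
    return [abs(s - (slope * c + intercept)) for c, s in zip(conditions, sensors)]
```

If the data used in `fit_healthy_model` already contains degraded observations, the fitted baseline absorbs part of the degradation and the residual shrinks, which is why the choice of training data matters for this approach.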
Figure 3. AE derived from the graphical representation
Our results show that the proposed method outperforms the residual approach in most experiments, especially when the initial health state of each unit differs (Figure 4). These findings demonstrate the potential of our method to provide more accurate and robust HIs for prognostic models.
A paper regarding unsupervised HI discovery was submitted to the PHM23 conference.

Cyclic Generative Adversarial Networks for controllable data generation
Creating a prognostic model in the absence of a representative set of run-to-failure data can be a highly challenging task. To compensate for the lack of representativeness, we developed a controllable and physics-informed data generation process that can be used to improve prognostic performance. Inspired by Chu et al. (2021), we have adapted the cyclic Generative Adversarial Network (cGAN) structure to the problem of controllable data generation for PHM (Figure 5).
The developed model is tailored to prognostics: it can predict remaining useful life (RUL) from sensor readings and can also generate sensor readings conditioned on a given RUL value.
Our proposed solution directly incorporates prognostic predictions into the data generation process to find an explicit relationship between RUL and the degradation of a system. By incorporating degradation into the data generation process, we can generate sensor readings informed by the system's degradation behavior.
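The coupling between RUL prediction and conditional generation can be illustrated by a cycle-consistency term of the kind commonly used in cyclic GANs. The sketch below is an assumption made for illustration and not the architecture of Figure 5: it treats the predictor and generator as plain callables on scalar values, omits the adversarial terms entirely, and uses a squared error for both directions.

```python
def cycle_consistency_loss(predictor, generator, sensor_readings, rul_values):
    """Sum of the two round-trip errors between sensor space and RUL space.

    predictor: hypothetical callable mapping a sensor reading to an RUL estimate
    generator: hypothetical callable mapping an RUL value to a sensor reading
    """
    # RUL -> generated sensor reading -> predicted RUL should return the RUL.
    rul_cycle = sum(
        (predictor(generator(r)) - r) ** 2 for r in rul_values
    ) / len(rul_values)
    # Sensor reading -> predicted RUL -> generated reading should return the reading.
    sensor_cycle = sum(
        (generator(predictor(x)) - x) ** 2 for x in sensor_readings
    ) / len(sensor_readings)
    return rul_cycle + sensor_cycle


# Toy example: a generator and predictor that are exact inverses give zero loss.
loss = cycle_consistency_loss(
    predictor=lambda x: x / 2.0,
    generator=lambda r: 2.0 * r,
    sensor_readings=[2.0, 4.0],
    rul_values=[1.0, 3.0],
)
print(loss)
```

A term of this kind ties the two mappings together, so that generated readings are consistent with the RUL they were conditioned on.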

Figure 4. Discovered HI of test unit 10. hi_re(H): health index of the residual approach assuming H healthy observations; hi_p: health index of the proposed approach.

Table 1. Results of individual model training versus combined framework