Adaptive Prognostics: A reliable RUL approach

Prognostic methodologies have found increasing use over the last decade and provide a platform for remaining useful life (RUL) predictions of engineering systems utilizing condition monitoring data. Of particular interest is the reliable RUL prediction of engineering assets that either underperform or outperform due to unexpected phenomena that might occur during their operational life. These assets are often referred to as outliers, and predicting their RUL is a challenging task. The challenge is to accurately predict the RUL of an outlier when its condition monitoring data are available only in the testing process and not in the training process. As a result, the lifetime of the testing asset is shorter (left outlier) or longer (right outlier) than the lifetimes seen in the training process.
This study addresses this challenge by proposing a new adaptive model, the Similarity Learning Hidden Semi Markov Model (SLHSMM), which is an extension of the Non-Homogeneous Hidden Semi Markov Model (NHHSMM). The SLHSMM uses a similarity function, such as Minkowski distances, firstly to quantify the similarity between the testing asset and each training asset and secondly to adapt the trained parameters of the NHHSMM. To demonstrate the effectiveness of the proposed adaptive methodology, composite structures have been used as a validation engineering asset. In particular, the training data set consists of strain data collected from open-hole carbon-epoxy specimens, which were subjected to fatigue loading only, while the testing data set consists of strain data collected from specimens that were subjected to fatigue and in-situ impact loading, which can be considered as an unexpected phenomenon and unseen event regarding the training process.
Using these strain data, the RUL predictions of the SLHSMM and the NHHSMM were compared. The SLHSMM provided better predictions than the NHHSMM for all the test cases, demonstrating its capability to adapt to unexpected phenomena and integrate unforeseen data into the prognostics course.


INTRODUCTION
Engineering systems, particularly composite structures, typically function in dynamic environments with varying operational conditions, such as loads, resulting in fluctuations in condition monitoring (CM) data. The service life of composite structures is intricately linked to various factors, including their operational and maintenance procedures, as well as the often unpredictable environmental and operational conditions. Unexpected phenomena can arise during the lifetime of these structures, which were not accounted for during the design phase. To illustrate, consider the aviation industry, where events like bird strikes, hail, or tool drops can occur at any point during an aircraft's service life. These events fall under the category of unexpected phenomena, potentially causing damage that wasn't foreseen during the design phase. The implications of such unexpected occurrences on the integrity of a structural component can be severe. As a common practice, once these events are recorded, aircraft operations are halted, and inspection and repair actions are initiated, incurring unplanned costs. In this scenario, a Remaining Useful Life (RUL) prediction model would assess the impact of the unexpected event and provide an updated prediction.
However, the existing state-of-the-art RUL prediction models, whether model-based (MB) or data-driven (DD), may not be ideally suited for such scenarios. MB models struggle because they can't realistically incorporate every potential unexpected phenomenon into their physical laws. On the other hand, traditional DD models have a significant limitation: they are most efficient at predicting degradation processes when the testing data closely resemble the conditions in which the training data were collected. In cases like foreign object impacts on composite structures, the accuracy of RUL predictions relies heavily on whether the training data include relevant information about such impacts. Collecting comprehensive training data to cover every possible testing scenario is not realistic.
Therefore, there's a pressing need to develop RUL models with real-time adaptive capabilities. These models must offer more accurate RUL predictions for engineering systems and structures that may perform exceptionally due to unforeseen phenomena during their service life.
Several adaptive prognostic models have been proposed in the last 15 years. Orchard et al. (2009) employed two different approaches to implement outer feedback correction loops within particle filter algorithms. These loops integrated information about short-term prediction errors to enhance the overall performance of the prognostic framework. Nevertheless, certain crucial initialization parameters, such as the number of prediction steps (k) and the variance vector of the kernel noise $[p\ q]^T$, needed to be predefined. These approaches were tested using data from a simulated fault test conducted on a critical component of a rotorcraft transmission system. The results demonstrated that the incorporation of outer feedback correction loops significantly improved the precision and accuracy of the predicted RUL. Daroogheh et al. (2015) introduced a hybrid prognosis model that integrates particle filters and neural networks for gas turbine engines. It is worth noting that the combination of particle filters and neural networks is a common choice in the literature due to their availability in many commercial and open-source programming languages, coupled with their relatively straightforward implementation compared to other algorithms. The authors developed this hybrid prediction model by extending particle filters to forecast observations in the future. This forecast utilizes a neural network approach as a nonlinear time series prediction method. Neural networks are adaptively trained based on newly received data when discrepancies between forecasted observations from the network and real observations increase from one test data set to another. Nevertheless, this hybrid prognosis model lacks the provision of confidence intervals.
Sbarufatti et al. (2017) introduced a model for battery prognostics that combines particle filters with radial basis function neural networks (RBFNNs). This model exhibits adaptive characteristics as the RBFNNs are trained online. Specifically, the neural network parameters are identified in real-time by the particle filters as new observations of the battery terminal voltage become available. The RBFNNs algorithm has shown effectiveness in delivering prognostic predictions across normal and aging scenarios. Prior to employing RBFNNs, artificial noise was introduced to the dataset to replicate realistic online voltage measurements, mirroring real-world conditions rather than controlled environments. Selecting suitable noise variances poses a challenging task, as excessively small values can impede effective state-space exploration, while excessively large values can hinder efficient state estimation. Si et al. (2017) employed a Wiener-process-based model coupled with a recursive filter algorithm for RUL predictions. A state space model continually updates drift coefficients, treated as random variables, and an expectation maximization (EM) algorithm re-estimates all unknown parameters as new data becomes available. The proposed model was employed to estimate the RUL of gyros in an inertial navigation system. However, Wiener models assume a linear connection between the degradation process of the studied system and the operational time, which may not always hold true.
Additionally, Khan et al. (2018) proposed an adaptive degradation prognostic model that utilizes particle filters alongside a neural network degradation model for predicting the RUL of turbofan jet engines. RUL predictions were generated using two different algorithms for benchmarking: the nominal RBFNNs with particle filters and the similarity-based prognostics. RUL predictions from both algorithms exhibited volatility, but notably, the similarity-based approach lacked support for predicting RUL confidence intervals, a crucial output for algorithm robustness. Furthermore, the proposed prognostic model necessitates the initialization of the random walk step size ($\sigma_a$). Selecting $\sigma_a$ is not straightforward, as a large value promotes rapid convergence but results in high fluctuations, while a small value yields smoother yet slower parameter estimation convergence. Consequently, $\sigma_a$ selection is case-study-dependent. Cadini et al. (2019) proposed leveraging the adaptability of neural networks to learn from a monitored metallic structure and derive real-time models for diagnostics and prognostics. To achieve this, neural networks were incorporated within a particle filtering scheme, and the network's training process occurred in real-time as CM data became available during the structure's operation. Consequently, the proposed RUL model could sequentially update itself using the accessible CM data. This model was demonstrated in simulated and real fatigue crack growth tests conducted on metallic aeronautical panels. Primary limitations of this model include the time required to achieve convergence to the actual RUL, which tends to be longer compared to similar RUL models, volatile RUL predictions, and divergent behavior of confidence intervals towards the end of life. Nevertheless, this model could potentially play a role in structural prognostics in the future as physics-based or more accurate empirical/phenomenological models become available.
Finally, Eleftheroglou et al. (2020) developed a new data-driven model, i.e., the Adaptive Non-Homogeneous Hidden Semi Markov Model (ANHHSMM), which is an extension of the NHHSMM. The ANHHSMM uses diagnostic measures, which are estimated based on the training and testing CM data, and it adapts the trained degradation process parameters Γ of the NHHSMM. The training data set was collected from open-hole carbon-epoxy specimens subjected to fatigue loading, while the testing data set was collected from specimens subjected to fatigue and in-situ impact loading. The ANHHSMM provided better predictions in comparison to the NHHSMM for all the cases, demonstrating its capability to adapt to unexpected phenomena and integrate unforeseen data into the prognostics course. However, the suggested model was able to adapt only part of the trained parameters, i.e., the degradation process parameters, while the observation process parameters were predefined.
Based on the conducted literature review, there is clearly a need to further develop models with real-time adapting capabilities so as to predict more accurately the RUL of engineering systems and structures that either underperform or outperform due to unexpected phenomena that might occur during the service life. In the case of composite structures, these adaptive models have to be data-driven: the knowledge about the physics behind the evolution and interaction of composites' damage mechanisms is incomplete, and no physical law can realistically describe all the possible unexpected phenomena, so a MB model is not a viable option.
The contribution made in this paper is to propose a new RUL probabilistic model, the Similarity Learning Hidden Semi Markov Model (SLHSMM), which is an extension of the NHHSMM. The remainder of this paper is organized as follows: the SLHSMM is described in section 2, the case study analysis is presented in section 3 and finally, the paper is concluded in section 4.

METHODOLOGY
Approaches grounded in stochastic filtering (Orchard & Vachtsevanos, 2009), multi-stage degradation modeling (Rabiner, 1989), and covariate hazard modeling (Lu & Liu, 2014) represent common methodologies that can consider the variability in component lifetimes (Si, Zhang & Hu, 2017). Given that the accumulation of damage in composite structures exhibits stochastic correlation with Condition Monitoring (CM) data, multi-stage degradation models, particularly Markov models (MMs), emerge as the preferred approach for estimating the RUL of composite structures. MMs have been in use since the 1980s (Bogdanoff & Kozin, 1985). However, a key limitation of MMs lies in the Markovian assumption, which posits that, given the current state, future degradation states are independent of past degradation states, a condition not universally valid in engineering systems.
Recognizing this drawback, Hidden Markov Models (HMMs) were adopted (Rabiner, 1989). HMMs feature a multi-state structure wherein each state remains hidden and is linked to the damage accumulation phenomenon through a set of parameters referred to as observation process parameters (Β). However, a notable disadvantage in this case is the assumption of an exponential sojourn time distribution for each hidden state, a premise not consistently valid. Hidden Semi-Markov Models (HSMMs) address this issue by allowing for the unconstrained selection of sojourn time distributions (Peng & Dong, 2011).
Both HMMs and HSMMs share a limitation in terms of state transition, which remains independent of the age of the engineering system or the time spent in the current hidden state. To account for this limitation, Moghaddass and Zuo (2014) extended the HSMM approach by developing the Non-Homogeneous Hidden Semi-Markov Model (NHHSMM). In this model, the degradation process, described through the Γ parameters, depends on the current hidden state, the time spent in the current hidden state, and the overall age of the studied system. However, a common limitation across all these models, including MMs, HMMs, HSMMs, and NHHSMMs, is the absence of adaptation capabilities for the estimated model parameters θ={Γ,Β} while the engineering system, such as a composite structure, is in operation. To address this adaptation issue, Eleftheroglou et al. (2020) introduced the Adaptive NHHSMM (ANHHSMM), which, as previously mentioned, was capable of providing accurate predictions for outlier cases. Nonetheless, the proposed model could adapt only the degradation process parameters (Γ) without allowing for any adaptation of the observation process parameters (Β).
In this respect, the objective of this study is to develop a novel adaptive version of the NHHSMM, termed the Similarity Learning HSMM (SLHSMM). This model will possess the capability to adapt not only the degradation process parameters (Γ) but also the observation process parameters (Β).

Similarity Learning HSMM
The SLHSMM consists of a bi-dimensional stochastic process. The first process forms a finite Semi Markov chain, which is not directly observed, and the second process, conditioned on the first one, forms a sequence of independent random CM data variables. In order to describe the aforementioned bi-dimensional stochastic process, the model's parameters θ={Γ,Β} have to be estimated via the available CM data. The Γ parameters characterize the transition rate distribution between the hidden states (degradation process), while the Β parameters deal with the correlation between the hidden states and CM data (observation process). This correlation is represented in a nonparametric and discrete form via a matrix called the emission matrix.
The parameter estimation process consists of the initialization and training procedures. The purpose of the initialization procedure is to identify, with high computational efficiency, a set of parameters ζ which will associate the damage accumulation phenomenon and the available CM data. The initialization procedure is carried out by defining: the number of possible discrete degradation states (N), the transition diagram which defines the connectivity between the states and the allowed transitions (Ω), the transition rate's statistical function (λ), the CM data of K training observation sequences $y^{(k)}$, and the discrete CM indicator space ($Z=\{z_1, z_2, \dots, z_V\}$). The reader can refer to Eleftheroglou and Loutas (2016) for a more detailed description.
With regard to the training procedure, the parameters θ={Γ,Β} are obtained via a novel similarity learning maximum likelihood estimation (SL-MLE) method. The similarity relationship between the testing and training degradation histories is dynamic and represented by a nonparametric discrete distribution referred to as the similarity learning vector (SLV). The SLV is time-dependent and has K elements, where the kth element of this vector, $w_T^{(k)}$, quantifies the similarity of the testing degradation history and the kth training degradation history up to time T. For similarity quantification, different methods can be used, e.g., cosine similarity, Euclidean distance, Manhattan distance, etc. In this study, the Euclidean distance method is utilized for simplicity. To that end, the Euclidean SLV is obtained via Eq. (1),
where K is the number of available training degradation histories, $x_i$ the testing CM data at the time step i, and $y_i^{(k)}$ the training CM data of the kth degradation history at the time step i.
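As an illustration of how such a vector might be computed, the following minimal sketch builds an SLV from Euclidean distances. It does not restate Eq. (1): the inverse-distance normalization, the function name `euclidean_slv`, and the array layout are assumptions made here for illustration, chosen only so that the weights are non-negative and sum to one, as required of a discrete distribution.

```python
import numpy as np

def euclidean_slv(x, Y):
    """Return one similarity weight per training history (hypothetical helper).

    x : testing CM data up to time step T, shape (T,)
    Y : training CM data truncated to the same T steps, shape (K, T)
    """
    x = np.asarray(x, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Euclidean distance between the testing history and each training history.
    d = np.sqrt(((Y - x) ** 2).sum(axis=1))
    # ASSUMPTION: inverse-distance weighting; an exact match takes all the weight.
    if np.any(d == 0):
        w = (d == 0).astype(float)
    else:
        w = 1.0 / d
    # Normalize so the weights form a discrete distribution over the K histories.
    return w / w.sum()

if __name__ == "__main__":
    testing = [0.0, 0.0]
    training = [[3.0, 4.0], [6.0, 8.0]]  # distances 5 and 10
    print(euclidean_slv(testing, training))
```

Under this assumed normalization, the closer training history receives the larger weight; other monotone mappings from distance to similarity would serve the same purpose.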
The proposed SL-MLE maximizes the likelihood function $L(\theta, y^{(1:K)})$ of Eq. (2), where $y^{(k)}$ is the kth degradation history, K is the number of available degradation histories, θ={Γ,Β} and $w^{(k)}$ is the kth SLV element at a predefined time step T.

$$L(\theta, y^{(1:K)}) = \dots \qquad (2)$$

By setting initial values for Γ and Β, defining the time step T, and solving the aforementioned optimization problem, the parameters are estimated. It is worth mentioning that in the case of a noninformative and static SLV function, i.e., $w_T^{(k)} = 1/K$ for every possible T and k, the SLHSMM is identical to the NHHSMM.
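The role of the SLV weights in the estimation can be illustrated with a small sketch. The exact form of the SL-MLE objective is not reproduced here; the code assumes, purely for illustration, a weighted sum of per-history log-likelihoods, $\sum_k w^{(k)} \log \Pr(y^{(k)} \mid \theta)$. The function names and the toy log-likelihood values are hypothetical.

```python
import numpy as np

def weighted_log_likelihood(log_liks, w):
    """ASSUMED objective: sum_k w_k * log Pr(y^(k) | theta)."""
    log_liks = np.asarray(log_liks, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.dot(w, log_liks))

def best_theta(candidates, w):
    """Pick the candidate parameter set with the highest weighted objective.

    candidates maps a label to the per-history log-likelihoods under that theta.
    """
    return max(candidates, key=lambda name: weighted_log_likelihood(candidates[name], w))

if __name__ == "__main__":
    # Toy values: theta "A" fits history 1 well, theta "B" is a compromise.
    candidates = {"A": [-1.0, -5.0], "B": [-2.0, -3.0]}
    print(best_theta(candidates, [0.5, 0.5]))  # uniform SLV
    print(best_theta(candidates, [0.9, 0.1]))  # SLV concentrated on history 1
```

With uniform weights the objective is a constant multiple of the ordinary log-likelihood, so the maximizer coincides with the unweighted estimate, which is consistent with the remark that a static SLV of 1/K recovers the NHHSMM. A concentrated SLV shifts the estimate towards the most similar training histories.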

Diagnostics
Finding a monotonic degradation measure which at least qualitatively reflects the damage accumulation has always been an interesting and challenging topic in real-time CM applications (Shen et al., 2012). In addition, finding such a monotonic measure is critical for defining the parameter T. To that end, a reasonable measure to monitor the overall health status of a composite structure is the diagnostic measure Most Likely State (MLS) (Moghaddass & Zuo, 2014), which can be determined via Eq. (3).

$$\mathrm{MLS}(t \mid x_{1:t}, \theta^*, \zeta) = \arg\max_{i} \Pr(Q_t = i \mid x_{1:t}, \theta^*, \zeta) \qquad (3)$$

This measure selects the hidden state i that maximizes the probability $\Pr(Q_t = i \mid x_{1:t}, \theta^*, \zeta)$ of being at the hidden state i at the time point t given the testing CM data up to time t ($x_{1:t}$).
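The MLS computation can be sketched as an argmax over filtered state probabilities. The sketch below uses a plain homogeneous discrete HMM forward filter as a simplified stand-in for the NHHSMM; the matrices `A` and `B`, the initial distribution `pi`, and the function names are illustrative assumptions, not the model estimated in this study.

```python
import numpy as np

def forward_filter(pi, A, B, obs):
    """Filtered posteriors Pr(Q_t = i | x_{1:t}) for a plain discrete HMM.

    ASSUMPTION: homogeneous HMM used only as a stand-in for the NHHSMM.
    pi : initial state distribution, shape (N,)
    A  : state transition matrix, shape (N, N)
    B  : emission matrix, B[i, v] = Pr(symbol v | state i), shape (N, V)
    obs: sequence of observed symbol indices
    """
    pi, A, B = (np.asarray(m, dtype=float) for m in (pi, A, B))
    alpha = pi * B[:, obs[0]]
    alpha = alpha / alpha.sum()
    posteriors = [alpha]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict, then weight by the emission
        alpha = alpha / alpha.sum()     # normalize to a probability vector
        posteriors.append(alpha)
    return np.array(posteriors)

def most_likely_state(posteriors):
    """MLS per time point, returned as 1-based state labels."""
    return np.asarray(posteriors).argmax(axis=1) + 1

if __name__ == "__main__":
    pi = [1.0, 0.0, 0.0]
    A = [[0.8, 0.2, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 1.0]]  # left-to-right
    B = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]]
    print(most_likely_state(forward_filter(pi, A, B, [0, 0, 1, 2])))
```

Because the toy transition matrix is left-to-right, the resulting MLS sequence is non-decreasing, which mirrors the monotonic behaviour sought from a degradation measure.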
Utilizing the MLS diagnostic measure, the similarity learning timestep T can be defined as the transition timestep from the damage state N-2 to N-1, where N is the failure state. Following the aforementioned definition of T, a representative amount of data will be available in order to calculate the SLV. However, the number of degradation states (N) should ideally be relatively small (N < 10) to allow sufficient time for decision-making and maintenance actions, while also providing enough data for the adaptation task.
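The definition of T can be expressed as a short search over an MLS sequence. This is a minimal sketch; the function name `similarity_timestep` and the toy time stamps are assumptions made for illustration.

```python
def similarity_timestep(times, mls, n_states):
    """Return the first time at which the MLS moves from state N-2 to N-1.

    times    : time stamps of the MLS estimates
    mls      : MLS estimates (1-based state labels), same length as times
    n_states : number of degradation states N (N is the failure state)
    Returns None if the transition has not been observed yet.
    """
    for j in range(1, len(mls)):
        if mls[j - 1] == n_states - 2 and mls[j] == n_states - 1:
            return times[j]
    return None

if __name__ == "__main__":
    times = [0, 50, 100, 150, 200]
    mls = [1, 1, 2, 3, 3]
    print(similarity_timestep(times, mls, n_states=4))
```

With N = 4, the function looks for the step from state 2 to state 3, so all data recorded up to that point are available for the similarity calculation.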

Prognostics
Prognostic measures can be defined based on the $\theta^*$ parameters and the testing CM data (x). In other words, conditional on the testing CM data and the complete similarity learning model $\theta^*$, prognostics tries to estimate the probability of being in degradation states 1,…, N-1 at specific time points in the future, i.e., the conditional reliability function.
The conditional reliability function, $R(t \mid x_{1:t_p}, L > t_p, \theta^*, \zeta) = \Pr(L > t \mid x_{1:t_p}, L > t_p, \theta^*, \zeta)$, represents the probability that the studied structure continues to operate beyond a time t (i.e., its lifetime L satisfies L > t), later than the current time $t_p$, given that the structure has not failed yet ($L > t_p$), the testing CM data $x_{1:t_p}$ and the complete model $\theta^*$, ζ.
In this study, the mean and confidence intervals of RUL are proposed as prognostic measures. These measures were calculated via the cumulative distribution function (CDF) of RUL (Moghaddass & Zuo, 2014). The CDF of RUL is defined at any time point via the conditional reliability according to the following equation:

$$\Pr(\mathrm{RUL}_{t_p} \le t \mid x_{1:t_p}, L > t_p, \theta^*, \zeta) = 1 - R(t + t_p \mid x_{1:t_p}, L > t_p, \theta^*, \zeta) \qquad (4)$$
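Given the conditional reliability evaluated on a grid of future times, the mean RUL and the confidence bounds follow directly from Eq. (4). The sketch below is a minimal numerical illustration; the function name `rul_summary`, the grid-based integration, and the linear interpolation of the CDF are assumptions, and the reliability values in the usage example are synthetic.

```python
import numpy as np

def rul_summary(horizon, reliability, lower=0.025, upper=0.975):
    """Mean RUL and percentile bounds from a conditional reliability curve.

    horizon     : times ahead of the current time t_p (increasing, starts at 0)
    reliability : R(t + t_p | ...) evaluated at each horizon value
    """
    horizon = np.asarray(horizon, dtype=float)
    rel = np.asarray(reliability, dtype=float)
    cdf = 1.0 - rel  # Eq. (4): CDF of the RUL
    # Mean RUL as the area under the reliability curve (trapezoidal rule).
    mean_rul = float(np.sum(0.5 * (rel[1:] + rel[:-1]) * np.diff(horizon)))
    # Percentiles by linear interpolation of the non-decreasing CDF.
    lo, hi = np.interp([lower, upper], cdf, horizon)
    return mean_rul, float(lo), float(hi)

if __name__ == "__main__":
    # Synthetic example: RUL uniformly distributed over [0, 100].
    t = np.linspace(0.0, 100.0, 101)
    print(rul_summary(t, 1.0 - t / 100.0))
```

The mean is obtained from the identity that the expectation of a non-negative random variable equals the integral of its survival function, so no explicit density is needed.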

CASE-STUDY
To illustrate the adaptability and effectiveness of the proposed model, open-hole carbon/epoxy specimens were subjected to in-situ impact and constant amplitude fatigue loading until failure occurred. The training dataset comprises strain CM data collected from specimens exposed exclusively to fatigue loading. In contrast, the testing dataset comprises CM data gathered from specimens that experienced both fatigue and in-situ impact loading. It's essential to note that the introduction of impact loading was limited to the testing phase, with the specific aim of influencing fatigue life and generating outlier cases. In this context, the in-situ impact can be characterized as an unforeseen event and an unexpected phenomenon in relation to the training data. The primary objective of this case study is to validate that the SLHSMM exhibits enhanced accuracy in predicting RUL compared to the NHHSMM, particularly when the testing composite specimens deviate significantly from the norm, either as left or right outliers.

Experimental campaign
The experimental set-up consists of a 100 kN MTS fatigue controller and bench machine, an impact cannon, and two cameras for digital image correlation measurements, i.e., strain data (Figure 1). A laminate with [0/45/90/-45]2s lay-up and average thickness of 2.28 mm was manufactured using the autoclave process. Ten specimens, measuring 400 mm x 45 mm with a central hole of 10 mm diameter, were tested at 90% of the static tensile strength (S=36 kN) with R=0.1 and f=10 Hz. The in-situ impact occurred at the hole, as this location experiences the highest stresses, aiming to maximize the effect of impact on the damage accumulation process. The selected energy was E=6 J (impact velocity 20 m/sec) for all the cases and it can be categorized as a high-speed, low-energy impact. Furthermore, during the impact, the specimens were under tension equal to the mean fatigue load (16 kN). The time of impact was limited to the period between the start of the fatigue test and the point at which damage could be observed by visual inspection. Table 1 presents the lifetime of the training and testing specimens and when the impact occurred. Specimens 9-10 are the testing specimens, for which the impact occurred at 8200 and 2200 sec of their fatigue life respectively. The testing data consists of two outliers, one left (specimen 9) and one right (specimen 10).
The digital image correlation (DIC) technique is used for full-field strain measurements, and the following procedure was adopted to extract the strain measurements: every 500 cycles the fatigue test was interrupted, the load was set automatically to σmin within one second and then ramped to σmax within one second, and two images were acquired. After that, the fatigue test continued for the next 500 cycles, see Figure 2. In the case of the in-situ impact, the safety aluminum cylinder covered the specimens' monitoring area, so DIC images could not be acquired during the impact but only afterwards.
Figure 3 presents the training and testing axial strain degradation histories. The monitoring area has been defined based on the analytical model of Lekhnitskii et al. (1963), which calculates the effect of a notch on the stress/strain distribution.

Table 1. Lifetime and impact times of training and testing specimens.

Figure 2. DIC data acquisition strategy.

Similarity Learning HSMM
Initially, the procedure of damage accumulation in composite structures under fatigue loading (Specimen01-Specimen08) is modelled via the NHHSMM, and the $\theta^* = \{B^*, \Gamma^*\}$ parameters were determined via the SL-MLE procedure, Eq. (2), defining the SLV as $w_T^{(k)} = 1/K$. Following Reifsnider and Talug (1980), the damage accumulation process of composite structures can be efficiently approximated as a four-state process. In Table 2 the estimated $B^*$ parameters are presented and in Figure 4 the black-shaded lines depict the NHHSMM estimated $\Gamma^*$ parameters (Moghaddass & Zuo, 2014).
Table 2. NHHSMM ($B^*$) and SLHSMM ($B_{SL}^*$) emission matrices.

The MLS diagnostic measure was calculated utilizing the estimated $\theta^*$ parameters and the online testing CM data.
Figure 5 presents the estimations of the MLS measure as calculated from Eq. (3) at each time point during the operation time of Specimen09 and Specimen10.
Based on the MLS estimations, the similarity learning timestep T was defined for each testing specimen, i.e., T = 17000 sec for Specimen09 and T = 97500 sec for Specimen10, and the Euclidean SLV was obtained via Eq. (1). In Figure 6 the SLV nonparametric discrete distributions for each testing specimen are presented.
Based on Figure 6 and Table 1, the testing Specimen09 is most similar to the training Specimen04, the training set's left outlier, and the testing Specimen10 is 100% similar to the training Specimen08, the training set's right outlier. These similarity-learning outcomes are the desired ones since they reflect that Specimen09 is a left outlier and Specimen10 is a right outlier.
Figure 5. MLS diagnostic measure of testing specimens.

Utilizing the testing CM data and the SLVs, the SLHSMM can be defined and can dynamically adapt the parameters $\theta^* = \{B^*, \Gamma^*\}$ to $\theta_{SL}^* = \{B_{SL}^*, \Gamma_{SL}^*\}$, following the SL-MLE procedure, Eq. (2). In Table 2 and Figure 4 the outcomes of the SLHSMM are presented.
As Table 2 depicts, the difference between the NHHSMM emission matrix ($B^*$) and the SLHSMM emission matrix ($B_{SL}^*$) is negligible, as was expected, since the emission matrix does not depend on time. The emission matrix correlates CM data and hidden states. Furthermore, the CM data range remains the same since the last observation, as already mentioned, should be unique, dictating a common failure threshold in the training and testing data sets.

Remaining Useful Life Estimations
Following the aforementioned similarity-learning framework, two four-state (N=4) models, allowing soft and hard state transitions, were developed, and the parameters $\theta^*$ and $\theta_{SL}^* = \{\theta_{SL\text{-}Specimen09}^*, \theta_{SL\text{-}Specimen10}^*\}$ were estimated according to the training and testing CM data. Through Eq. (4), the conditional RUL CDF is calculated from the similarity learning timestep T, i.e., T = 17000 sec for Specimen09 and T = 97500 sec for Specimen10, until the end of life. The mean RUL and the 2.5% and 97.5% percentiles that define a 95% confidence interval are also highlighted. Figures 7 and 8 present the prognostic results of the SLHSMM and the NHHSMM for Specimen09 and Specimen10 respectively. Based on Figures 7 and 8, the SLHSMM provides better outlier prognostics since the mean SLHSMM RUL estimations approach the real RUL more closely than those of the NHHSMM. Additionally, the confidence intervals of the SLHSMM contain the real RUL curve during the whole lifetime of Specimen10 and, for Specimen09, are narrower than those of the NHHSMM.
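The qualitative comparison above (closeness of the mean RUL to the real RUL, containment of the real RUL within the confidence bounds, and the width of those bounds) could be quantified along the following lines. The three metrics and the function name `evaluate_prognostics` are assumptions introduced for illustration and were not reported in this study; the numbers in the usage example are synthetic.

```python
import numpy as np

def evaluate_prognostics(times, mean_rul, ci_low, ci_high, lifetime):
    """Summarize RUL predictions against the real RUL (lifetime minus time).

    All array arguments have one entry per prediction time.
    """
    times = np.asarray(times, dtype=float)
    mean_rul = np.asarray(mean_rul, dtype=float)
    ci_low = np.asarray(ci_low, dtype=float)
    ci_high = np.asarray(ci_high, dtype=float)
    true_rul = lifetime - times
    inside = (ci_low <= true_rul) & (true_rul <= ci_high)
    return {
        # Fraction of prediction times whose interval contains the real RUL.
        "coverage": float(inside.mean()),
        # Average absolute gap between the mean RUL and the real RUL.
        "mean_abs_error": float(np.abs(mean_rul - true_rul).mean()),
        # Average distance between the upper and lower confidence bounds.
        "mean_ci_width": float((ci_high - ci_low).mean()),
    }

if __name__ == "__main__":
    print(evaluate_prognostics(
        times=[0, 10, 20], mean_rul=[28, 22, 10],
        ci_low=[20, 21, 5], ci_high=[35, 30, 15], lifetime=30,
    ))
```

Computing the same summary for both models on the same testing specimen would give a like-for-like numerical counterpart to the visual comparison in Figures 7 and 8.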

CONCLUSIONS
In this study, a new similarity learning probabilistic data-driven methodology was developed. The aim was to enhance the accuracy of predictions, especially in cases of outlier behaviors not encountered in the training data. The model's performance was evaluated by predicting the RUL of open-hole carbon/epoxy specimens subjected to constant amplitude fatigue loading until failure, while in-situ impact events were introduced to represent unexpected phenomena. The DIC technique was employed to collect strain CM data.
For training, eight degradation histories were utilized, with the training specimens solely subjected to fatigue loading. For testing the proposed adaptive methodology, two degradation histories were used. These histories were obtained from two different specimens exposed to both fatigue and in-situ impact, creating both left and right outlier cases compared to the training histories.
The results clearly demonstrate that the SLHSMM provides more accurate prognostics compared to the state-of-the-art NHHSMM. These findings suggest that adapting the NHHSMM's parameters using the similarity learning vector, as demonstrated in this work, has the potential to significantly improve the RUL predictions.
However, it's essential to acknowledge a key limitation: the dependency between the Similarity Learning Vector (SLV) and the outliers present in the training set. Nevertheless, this dependency has a relatively minor impact on RUL predictions, enabling the model to effectively handle outlier cases. Another area for improvement is the selection of the similarity learning timestep T, which has currently been defined manually based on diagnostics. Future work aims to automate this selection process by continuously calculating the similarity between the testing system and the training systems, eliminating the need for manual definition and enhancing the model's adaptability.
Furthermore, enhancing the similarity calculation method is crucial. While the Euclidean distance method for quantifying similarity has been employed, this point-by-point formulation may not fully capture the complexities of damage evolution processes. Future research will explore extending this formulation to a vector-to-vector approach to provide a more comprehensive understanding of the degradation process.
Lastly, although the SLHSMM was tested in the context of composite materials, its high flexibility suggests potential applications in various engineering prognostic challenges.

Figure 6. Similarity Learning Distribution of testing specimens.