A Hybrid Approach for Fusing Physics and Data for Failure Prediction

This work describes an architecture for developing physics-of-failure models, derived as a function of machine sensor data, and integrating them with data on other relevant factors, such as geography, manufacturing, environment, customer and inspection information, that are not easily modeled using physics principles. The mechanics of the system are characterized using surrogate models for stress and metal temperature based on results from multiple non-linear finite element simulations. A cumulative damage index has been formulated that quantifies the health of the component. To address deficiencies in the simulation results, a model tuning framework is designed to improve the accuracy of the model. Despite the model tuning, un-modelled sources of variation can lead to insufficient model accuracy, so these un-modelled effects must be incorporated to improve model performance. A novel machine learning based model fusion approach is presented that combines physics model predictions with other data sources that are difficult to incorporate in a physics framework. This approach has been applied to a gas turbine hot section turbine blade failure prediction example.


INTRODUCTION
The ability to accurately predict the failure of hot section components (e.g. turbine blades, nozzles, rotor and combustor components) has been a critical problem to solve owing to the impending turbine down-time and high downstream damage costs in case of an un-intended failure. Condition based maintenance (CBM) can be performed on components for which accurate prediction of time to failure is available to avoid any such events. Hence, there has been a shift in the industry from fixed maintenance intervals towards CBM.
Accurate prediction of time to failure can also enable interval extension of components that have sufficient life remaining, which provides more flexibility to operate the machine. The life of such components depends on many factors, such as operation conditions of the machine, material properties and associated variability, manufacturing variability and environment conditions to name a few. This work describes a method to fuse information from physics based failure models with other relevant features using machine learning methods. The physics based models mainly account for variability in operational behavior to capture the major failure modes. The other relevant features are chosen based on understanding of the system and are features that impact the failure of the component but are not easily accounted through physics models.

FAILURE ESTIMATION
In the past few decades, there have been many efforts to estimate the time to failure, or remaining useful life (RUL) in the industry and academia. Traditional approaches to estimating failure or reliability have been based on recording failure events on a population of identical units of machines. Then parametric approaches like Weibull, log-Normal, Poisson etc. have been used to model part or system reliability. Reliability studies have been extensively published in literature in (Kapur & Lamberson, 1977), (Keller, Perera, & Kamath, 1982), (Crowder, Kimber, Sweeting, & Smith, 1994), (Elsayed, 1996), (Groer, 2000), (Lawless, 2002) and (Schömig & Rose, 2003). In summary, these approaches use historical time to failure data to estimate the failure characteristics of a population. Though very useful in maintenance scheduling for populations of machines, these models may be of limited use in CBM, as they provide very little information about an individual machine's condition. Towards this, many prognostic approaches have been developed that try to estimate the life of components in individual machines. These approaches can be broadly categorized into three types, which are discussed in the following sections, followed by a brief description of the current approach.

Physics Based Approaches
One of the major approaches to estimating the RUL of components is modeling their physics of failure (PoF). This method depends heavily on a deep understanding and modeling of the system's fundamental process, which could be mechanical, electrical, thermal or chemical. These physics models are used to predict how long it will take for the failure to progress to a predefined state, for example, for a crack to grow to a certain critical size. The methods generally start from the usage spectrum of the machine, including speed, temperature, pressure and power data. This information is then processed to obtain local loading information, such as loads, stresses, temperatures and strains, based on either phenomenological or first-principles models. A damage index can then be computed from the load history and the material property information available for the component in question.
In the rotating machines space, such methods have been applied on various components (Heng, Zhang, Tan, & Mathew, 2009). A methodology to predict bearing defect size using fatigue crack propagation models, incorporating operation information, was developed in (Li, et al., 1999). It also presented the capability of the model through comparisons with experimental results. In (Oppenheimer & Loparo, 2002), an approach using Forman law for crack growth was applied to model rotor shaft crack growth.
A spur gear fatigue crack prognostic methodology was devised in (Li & Lee, 2005), which combined a gear dynamic simulator for dynamic load estimation, a finite element-based simulator for calculating stress intensity factor and a fast algorithm based on a Paris crack growth model for crack propagation. The stress intensity factors were generated using the finite element simulations and pre-stored in a lookup table, enabling on-line application of the model.
In a two-part series (Koul, Zhou, Fuleki, Gauthier, & W., 2005), (Koul, Bhanot, Tiku, & Junkin, 2007), a methodology for lifing turbine engine components was presented. The models combined engine operation data with detailed engine modeling, a finite element model and a microstructural damage model to obtain high accuracy predictions of life to creep crack initiation.
The advantage of these methods is that they do not require failure data sets to build prediction models. However, they do require an understanding of the operating profile of the machine in order to relate operational conditions to failure. The disadvantage is that physics models need to be developed for individual failure modes, which is time consuming. In addition, phenomena that are not well understood, and hence are difficult to model, may contribute to failure without being captured.

Data Based Approaches
Data-driven approaches based on machine learning methods are increasingly being applied for lifing and fault prognostics. These approaches can be seen as a set of black-box models that learn directly from sensor data (e.g. vibration, temperature, pressure, current) collected from the machine. These models are built on historical data and use input sensor data directly to produce the required output. In the literature, data-based approaches using machine learning methods and statistical techniques have been widely used to predict component degradation (Heng, Zhang, Tan, & Mathew, 2009), (Si, Wang, Hu, & Zhou, 2011), (Sikorska, M.Hodkiewicz, & Ma, 2011) and (Javed, 2014).
Machine learning techniques attempt to learn from the available data and are often capable of capturing complex relationships between the input parameters and the desired output that can be difficult to describe using physics. Depending on the problem and available data, supervised or unsupervised learning methods can be applied. Supervised learning is applied in cases where the input and corresponding output are known. Unsupervised learning is applied to unlabeled data, i.e., the learning data consist only of inputs and the desired output is unknown.
A large number of machine learning algorithms are being developed for prognostics. The artificial neural network (ANN) is currently the most common method used to train data-driven models. An ANN consists of a layer of input nodes, one or more layers of hidden nodes, one layer of output nodes and connecting weights. The network learns the unknown function by optimizing its weights over multiple observations of inputs and outputs. Numerous studies across various disciplines have demonstrated the merits of ANNs (Tickle, Andrews, Golea, & Diederich, 1998), (Tse & Atherton, 1999), (Setiono, Leow, & Thong, 2000), (Bostwichk & Burke, 2001) and (Joshi & Reeves, 2006).
The most common criticism of data based models is that they require far more failure data than the physics based approach. In practice, it is not feasible to obtain such data in large quantities, as machines may not always run until they fail. Also, such models can be created only after multiple failures have been observed and cannot help in identifying critical components that could drive maintenance schedules and cost. In addition, it is difficult to incorporate variability in material properties and operation during modelling, which in turn causes errors in the estimates.

Hybrid Approach
A hybrid approach is a combination of physics based and data-driven approaches that takes advantage of the strengths of both. The main idea is to achieve finely tuned prediction models that have better capability to manage uncertainty and can result in more accurate RUL estimates. Two categories of hybrid approaches have been practiced in the past: the series approach and the parallel approach (Javed, 2014).
A series approach, shown in Figure 1, combines a physics based model, which captures the failure mode or process being modeled through an understanding of the system, with a data driven approach that helps estimate the uncertain process parameters using failure data from the field. The data model can be a simple parameter optimization method using classical optimization techniques when historical data is available. In cases where the degradation may not be observed directly, on-line parameter estimation techniques like the Kalman filter, the particle filter and their variants can be used (Sikorska, M.Hodkiewicz, & Ma, 2011). These methods update the tunable parameters when new data is collected. The fundamental idea behind using this approach in prognostics and health management (PHM) is that the predicted feature is not necessarily a direct outcome of the tuned parameters but could be a downstream parameter. For example, in a system, the cooling effectiveness for metal temperature calculation could be tuned using the crack lengths observed from a borescope inspection.

Figure 1. Series Approach in Hybrid Modeling
On the other hand, in a parallel approach, the output from the physics model is combined with the data from other sources using data based methods. This is schematically shown in Figure 2. In this approach, the machine learning module can be trained to predict the errors in prediction that are not explained by the physics model, using other relevant features that cannot be modeled using the physics based model, but impact the failure mode. These other parameters and associated features can be thought to be estimates that account for un-modeled effects in the physics model (Javed, 2014).
Such a prediction could potentially capture the failure better than either of the models independently. Some examples of such approaches have been presented in (Hansen, Hall, & Kurtz, 1995) and (Cheng & Pecht, 2009).

Current Work
The purpose of this work is to communicate a fusion approach that combines both the series and parallel approaches in hybrid modeling to achieve more accurate models, through application to a turbine hot section blade. The life model, built on the physical understanding of damage accumulation derived from intensive thermomechanical analyses, is described in section 3. A method for subsequent model updating to match actual field observations (e.g. inspection data) is described in section 4. Finally, different methods to fuse the information from the calibrated model with information from other sources are explained in section 5. Comparisons of the methods show the importance of each method. The authors believe that this approach could later be leveraged to address similar failure modes on other mechanical components.

PHYSICS-BASED DAMAGE ACCUMULATION MODEL
The core of the proposed fusion framework is a physics based damage accumulation model that translates turbine operation data to the probability of failure of a given component. Thus, differences in accumulated damage due to differences in operation pattern in different gas turbines can be accounted for in this step. The ensuing section summarizes the key technical steps involved in the development of such a model to predict probability of failure in gas turbine components.

Technical Approach
An outline of the key steps followed is presented in Figure 3. The first step in this approach is to use historical data from sensors to determine the nature of operation of the gas turbine. Typically, this would first comprise aggregating time series information from key sensors (like temperature, pressure, turbine-speed and output power) followed by standard data pre-processing algorithms to eliminate erroneous data arising due to sensor or data-collection faults.
In a physics based modeling framework, the sensor information is translated into mechanical loads that drive the progression of damage, which can be estimated through intensive computational (CFD, FEA) simulations. However, it is impractical to perform these simulations over the entire operational history of each machine for which component damage needs to be evaluated.
To this end, a 'design-of-experiments' (DOE) is constructed using the key sensor parameters that govern gas-turbine operation. A statistical analysis of these key parameters, using the previously aggregated data, helps in determining the parameter ranges that encompass the machine operation envelope. Each point in the constructed DOE therefore corresponds to a different operating condition at which the state of the critical component must be evaluated.
Thereafter, a mechanistic approach is used to determine the 'state' of the critical components under varying operating conditions in the DOE space. 'State' here refers to the physical condition of the component and in particular implies parameters like localized temperature, stress, strain, etc. that influence the concerned failure mode. For example, for a creep-driven failure mode the localized metal temperature and von Mises stress need to be determined as a function of the gas turbine operation. To achieve this, thermodynamic equations are used to calculate bulk gas temperatures and pressures around the critical components to be analyzed. This helps in determining the boundary conditions for the computational fluid dynamics (CFD) and finite element (FE) simulations that need to be performed next. The CFD simulations help in determining the spatial variation of flow velocities, pressures and heat transfer coefficients across the critical components. These results are then used as boundary conditions for FE simulations to determine the thermo-mechanical state of the critical component.
The results of the DOE from the FE model simulations are used to construct a reduced order model (ROM) or surrogate model that can determine the key state variables as a function of the different operating conditions. For example, if the target failure mode of the key component is low-cycle fatigue (LCF), then reduced-order models need to be constructed to estimate the temperature and stress at the key locations of the component. This is typically achieved via meta-modelling techniques, such as regression or artificial neural networks (ANN), in which the inputs are the different operational parameters (at the different DOE points) and the outputs are the concerned states (e.g. stresses and temperatures) determined using the FE simulations. Depending on the complexity of the problem, the original DOE might have to be refined at this point in order to obtain a meta-model that is adequately accurate at the different operating conditions.
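As a minimal sketch of how such an operating-envelope DOE might be assembled, the snippet below derives percentile-based ranges from aggregated sensor data and forms a full-factorial grid. The parameter names, the sample data, the 1st/99th percentile bounds and the three-level grid are all illustrative assumptions, not the design used in this work.

```python
import itertools
import statistics

def percentile_range(values, lo_pct=1, hi_pct=99):
    """Return (low, high) bounds taken from percentiles of the observed data."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")  # 99 cut points
    return cuts[lo_pct - 1], cuts[hi_pct - 1]

def full_factorial_doe(ranges, levels=3):
    """Build a full-factorial DOE: every combination of evenly spaced levels."""
    names = list(ranges)
    axes = []
    for name in names:
        lo, hi = ranges[name]
        step = (hi - lo) / (levels - 1)
        axes.append([lo + k * step for k in range(levels)])
    return [dict(zip(names, combo)) for combo in itertools.product(*axes)]

# Hypothetical aggregated sensor history (illustrative numbers only).
sensor_history = {
    "output_power_mw": [float(x) for x in range(40, 141)],
    "ambient_temp_c": [float(x) for x in range(-10, 41)],
}
ranges = {name: percentile_range(vals) for name, vals in sensor_history.items()}
doe_points = full_factorial_doe(ranges, levels=3)  # 3 x 3 = 9 operating points
```

Trimming the ranges at percentiles rather than at the raw minimum and maximum is one way to keep sensor outliers from stretching the envelope; a space-filling design could replace the full-factorial grid when the number of parameters grows.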
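The regression route to a surrogate can be illustrated with a small least-squares fit. This is only a sketch: the DOE points, the synthetic "FE" temperatures and the choice of a linear response surface are assumptions made for illustration, and a real meta-model would likely need higher-order terms or an ANN.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def fit_linear_surrogate(inputs, outputs):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations."""
    rows = [[1.0] + list(x) for x in inputs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, outputs)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def predict(coeffs, x):
    """Evaluate the fitted surrogate at one operating point."""
    return coeffs[0] + sum(c * xi for c, xi in zip(coeffs[1:], x))

# Hypothetical DOE points (power in MW, ambient temperature in C) and the
# metal temperatures a finite element run might return at each point.
doe_inputs = [(40, 0), (40, 30), (90, 15), (140, 0), (140, 30)]
fe_metal_temp = [700 + 1.5 * p + 2.0 * t for p, t in doe_inputs]  # synthetic
coeffs = fit_linear_surrogate(doe_inputs, fe_metal_temp)
```

Once fitted, `predict(coeffs, (100, 10))` evaluates the surrogate at an operating point that was never simulated, which is what allows the full sensor history to be processed without further FE runs.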
Thereafter, the machine specific sensor data is provided as inputs to these surrogate models to obtain a time-series representation of the state at the critical locations in the components of interest. For example, if the failure mode of interest is LCF, then the stress and temperature time-histories need to be analyzed using rain-flow counting methods to compute the stress and temperature ranges in each cycle. Finally, this time series then feeds into a damage accumulation rule (e.g. in this case Miner's rule (Zaretsky, 1998)) specific to the failure mode of concern. Thus, the sensor data time series is translated into a time series of the evolution of damage at the critical location.
The damage accumulation law typically comprises material parameter coefficients that are determined through controlled experiments on coupons. For example, for determining the crack growth law, steady-state crack propagation experiments are performed on coupons, whose results are then used to determine Paris' law coefficients (Paris & Erdogan, 1963). The inherent variability in material property (that results from microscopic variations in the microstructure of the components) can thus be captured by treating material parameters as random variables and estimating their distribution from the results of these experiments. For parameters estimated via regression, this typically implies computing both the mean and standard error of the parameter estimates. The effect of variability in material properties on damage can therefore be accounted for by using the distribution of material parameters to perform Monte Carlo simulations for damage evolution. This results in an ensemble of damage evolution curves, each corresponding to a realization of the material property parameters, that can then be used to obtain a distribution of damage at every instant of time (as shown in Figure 3). For gas turbine components that have multiple identical components subject to the same operating conditions (such as multiple blades at a given stage), this helps in estimating the variability one would see in the degree of damage across the different blades in the same stage in a given machine.
Finally, the estimated damage is compared against a damage threshold (at which we expect to see a failure) to determine if the component would fail in the given failure mode. For more physical damage parameters (such as crack length) the threshold can be determined based on allowable engineering limits (such as maximum repairable crack-length). In other cases, the threshold can be determined by constructing a single variable (i.e. damage) classifier using historical failure data. In this case the damage threshold can be varied to obtain a desired true-positive and false-positive rate. The quality of the failure prediction model can thus be estimated by constructing the receiver operating characteristic (ROC) curve and computing the area under the curve (AUC).
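The last step, turning counted cycles into a damage time series with Miner's rule, can be sketched as follows. The snippet assumes that rain-flow counting has already produced one stress range per cycle, and it uses a hypothetical Basquin-type S-N curve whose constants are placeholders rather than material data.

```python
from itertools import accumulate

def cycles_to_failure(stress_range_mpa, coeff=1.0e15, exponent=4.0):
    """Hypothetical Basquin-type S-N curve: N_f = coeff * stress_range**(-exponent)."""
    return coeff * stress_range_mpa ** (-exponent)

def miner_damage_history(cycle_stress_ranges):
    """Cumulative Miner's-rule damage after each counted cycle.

    Each cycle contributes 1 / N_f, where N_f is the number of cycles to
    failure at that cycle's stress range.
    """
    increments = [1.0 / cycles_to_failure(s) for s in cycle_stress_ranges]
    return list(accumulate(increments))

# Stress range (MPa) of each counted cycle, in chronological order (made up).
counted_cycles = [100.0, 100.0, 200.0]
damage_history = miner_damage_history(counted_cycles)
```

With these placeholder constants a 200 MPa cycle consumes sixteen times the life of a 100 MPa cycle, which is the kind of operation-dependent difference the damage time series is meant to expose.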
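A minimal Monte Carlo sketch of this idea is given below. It assumes, purely for illustration, that material scatter acts as a single normally distributed offset on the base-10 logarithm of the life at every interval; the interval durations, nominal lives and scatter value are invented.

```python
import random
import statistics

def damage_ensemble(interval_hours, log10_life_mean, log10_life_std,
                    n_samples=2000, seed=0):
    """Return one cumulative damage curve per sampled material realization."""
    rng = random.Random(seed)
    curves = []
    for _ in range(n_samples):
        # One realization of the material curve: a single random offset
        # shifts the life at every interval by the same factor.
        offset = rng.gauss(0.0, log10_life_std)
        damage, curve = 0.0, []
        for hours, log10_life in zip(interval_hours, log10_life_mean):
            damage += hours / 10 ** (log10_life + offset)
            curve.append(damage)
        curves.append(curve)
    return curves

# Hypothetical operating intervals (hours) and nominal log10 life (hours).
interval_hours = [1000.0, 2000.0, 1500.0]
log10_life_mean = [5.0, 4.7, 4.9]
curves = damage_ensemble(interval_hours, log10_life_mean, log10_life_std=0.1)

# Damage distribution at the final time instant.
final_damage = [curve[-1] for curve in curves]
mean_damage = statistics.mean(final_damage)
p05, p95 = (statistics.quantiles(final_damage, n=20)[i] for i in (0, 18))
```

The spread between `p05` and `p95` is a rough stand-in for the blade-to-blade variability described above, while `mean_damage` plays the role of a single severity value per machine.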
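The threshold sweep behind an ROC curve and its AUC can be sketched in a few lines. The damage values and failure labels below are invented for illustration; a unit is predicted to fail when its damage is at or above the threshold.

```python
def roc_curve(scores, labels):
    """Sweep the threshold over observed scores; return (FPR, TPR) points."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical damage index per unit and whether the unit failed (1) or not (0).
damage = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
failed = [1, 1, 0, 1, 0, 0]
roc_points = roc_curve(damage, failed)
auc = area_under_curve(roc_points)
```

In this toy data set eight of the nine (failed, healthy) pairs are ranked correctly by damage, so the AUC equals 8/9; an AUC near 0.5 would mean the damage index separates the two groups no better than chance.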

Example: Predicting Creep-driven Cracking of Turbine Blades
The approach described in section 3.1 was used to develop a model for predicting failure of gas turbine blades in GE gas turbines. The primary failure mode in this case was creep-driven cracking near the tip of the blade. Historical turbine repair data was used to identify 17 turbines that had reported one or more cracked blades and 21 turbines in which no crack was reported.
In order to create the DOE for describing different operating conditions, historical sensor data was collected for a large number of turbines in the same fleet. Some key sensor parameters that were analyzed were output power ($P$), ambient air temperature ($T_{amb}$), combustor exit temperature ($T_{comb}$), compressor discharge conditions ($C_d$), mass flow rate of air ($\dot{m}$), and exhaust gas temperature ($T_{exh}$). A statistical analysis of these parameters was performed to determine the primary independent parameters and their ranges. This was then used to construct the operation-space DOE.
Next, gas turbine performance models were used to determine the bulk pressures and temperatures close to the region of interest at different operating conditions. These results were used to determine boundary conditions for CFD simulations that were performed at each DOE point. The outputs of these simulations were used to determine the heat transfer coefficients and thermal boundary conditions for subsequent thermo-mechanical FE simulations.
Since the dominant failure mode in this case was creep, nonlinear steady-state FE runs were performed at each DOE condition to determine the metal temperature and the stress relaxation curve at the critical location (near the tip of the blade). First, the steady-state heat transfer (FE) problem was solved for each DOE point with its corresponding thermal boundary condition. The results of these simulations provided the spatial distribution of metal temperature in the gas turbine blade at the different DOE points. This result was then used to construct a surrogate model that relates the metal temperature at the critical location, $T_m$, to the operating conditions of the turbine (as the different DOE points correspond to different operating conditions) as follows:

$$T_m = f_T(P, T_{amb}, C_d, T_{comb}, T_{exh}, \dot{m}) \tag{1}$$

Next, steady-state non-linear structural FE simulations were performed at each DOE point using the corresponding metal temperature distributions, gas pressure loads and centrifugal forces. The results of the FE simulations were used to build a surrogate model for the von Mises stress, $S$, as a function of operation, $X_{op}$, and time, $t$ (to account for the stress relaxation due to creep strain accumulation), as follows:

$$S = f_S(X_{op}, t) \tag{2}$$

Thereafter, sensor data collected for all 38 gas turbine units were used in combination with Equations 1 and 2 to obtain a time-history of stress and temperature at the critical location of the gas turbine blade for each unit.
Next, the metal temperature and stress histories were used to determine intervals of operation over which the metal temperature and stress remain fairly constant. For each such interval, $i$, with metal temperature $T_m^i$ and stress $S^i$, creep rupture curves were used to determine the expected time-to-rupture, $t_r^i$, using Equation 3:

$$t_r^i = f_R(T_m^i, S^i) \tag{3}$$

where $f_R$ represents empirical creep rupture curves obtained from experiments. The expected time to rupture, $t_r^i$, is then used in a linear damage accumulation framework to estimate the creep damage in the $i$-th interval. Specifically, Robinson's rule (Suresh, 1998), (Spera, 1969) is used to calculate the damage at the $i$-th time interval, $D_i$, as:

$$D_i = \frac{\Delta t_i}{t_r^i} \tag{4}$$

where $t_r^i$ is the time-to-rupture at the $i$-th time interval and $\Delta t_i$ is the time spent at the $i$-th time interval. Thus, the total damage accumulated over $t$ hours of operation of the turbine, $D_t$, was computed as:

$$D_t = \sum_{i=1}^{N} \frac{\Delta t_i}{t_r^i} \tag{5}$$

where $N$ is the total number of constant stress-temperature time intervals identified over $t$ hours of turbine operation. As described in the previous section, the variability in the material curves was used to compute an ensemble of such damage evolution curves for each turbine, and thus a damage distribution at each time instant was constructed. The mean value of the damage distribution at the end of the operation history for each machine was computed. This has been considered as a damage severity index to distinguish the failure progression across machines.
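The interval-wise creep damage summation described above can be sketched as follows. The rupture curve here is a hypothetical Larson-Miller form with placeholder constants, standing in for the empirical creep rupture curves of this work, which are not reproduced; the operating intervals are likewise invented.

```python
import math

def time_to_rupture_hours(metal_temp_k, stress_mpa, c=20.0, a=35000.0, b=4000.0):
    """Hypothetical Larson-Miller rupture curve.

    LMP = a - b * log10(stress) and LMP = T * (c + log10(t_r)),
    so t_r = 10 ** (LMP / T - c). All constants are placeholders.
    """
    lmp = a - b * math.log10(stress_mpa)
    return 10 ** (lmp / metal_temp_k - c)

def robinson_damage(intervals):
    """Robinson's rule: sum of (time spent) / (time to rupture) over intervals.

    Each interval is (hours, metal temperature in K, stress in MPa).
    """
    return sum(hours / time_to_rupture_hours(temp, stress)
               for hours, temp, stress in intervals)

# Hypothetical intervals of roughly constant metal temperature and stress.
intervals = [
    (1000.0, 1100.0, 100.0),
    (500.0, 1150.0, 100.0),
    (2000.0, 1050.0, 120.0),
]
total_damage = robinson_damage(intervals)
```

With these placeholder values the 500 hours spent at the highest temperature contribute most of the total, which illustrates why short excursions in operation can dominate the accumulated creep damage.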
It is to be noted here that the critical damage threshold that defines failure is not necessarily 1. This is primarily because the nature of the failure observed in the field (micro-cracks) is different from that observed in the laboratory experiments conducted to obtain the creep rupture curves used to calculate damage at each time step. Furthermore, the variability in material life curves makes our damage estimate probabilistic, so the critical damage threshold for the mean damage need not be set to 1. Instead, we develop a single-variable logistic regression classifier model to decide the appropriate damage threshold. For this, 22 units (11 failed and 11 healthy) were selected to train the classifier model. The critical damage threshold was varied to construct an ROC curve as shown in Figure 4. This helps in understanding the predictive capability of the model and in selecting a classifier threshold that maximizes the probability of detection (POD) while minimizing the false positive rate (FPR).
As evident from Figure 4, the predictive capability of a model based on the physics based damage estimate is not very high: the AUC for this ROC curve is only 0.578. This is also evident from the boxplots of damage shown in Figure 5, which show a significant overlap in the distribution of damage values from failed and healthy units.
The ROC curve was then used to determine an optimal damage threshold by choosing the knee-point of the ROC curve. The chosen damage threshold was then used to predict failure in the remaining 16 units. The confusion matrix corresponding to this prediction, as presented in Table 1, shows that the probability of detection (POD) or true positive rate (TPR) of this model is only 33.33%. The accuracy of such physics based damage accumulation models is governed primarily by the modelling assumptions in the thermo-mechanical (CFD and FE) simulations performed to construct the model (i.e. modelling inaccuracies) and by the extent to which other un-modelled effects play a role in the failure. The first can be accounted for by calibrating the model using field failure information (see Figure 6). Typically, this update is performed on the parameters belonging to the portion of the model that has the most modelling uncertainty.
However, the process of model tuning can often be met with challenges, especially when the actual damage mechanism has not been captured entirely using the physics-based framework. For example, various hot gas path components in a gas turbine are subjected to damage with varying levels of contribution from low cycle fatigue, high cycle fatigue, creep, oxidation, erosion, fouling, hot corrosion etc. In some cases, the actual degradation mechanism for a specific failure mode, or the interaction among failure modes, is not well understood, especially for a multi-physics problem, and hence it becomes challenging to model the system behavior. In addition, there could be effects from other parameters like manufacturing vendor variation, customer operating characteristics (water wash frequency, mission profiles, machine trips etc.), geographical effects and atmospheric parameters (dust quality, air salinity levels, humidity, etc.) that could alter the progression of damage. In most cases, the effect of each of these variables on the overall part damage is difficult to quantify and model.
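One simple way to trade POD against FPR when choosing a threshold is to maximize Youden's J statistic, J = TPR - FPR, over candidate thresholds. The sketch below uses that criterion as an assumption; it is not necessarily the selection rule applied in this work, and the damage values and labels are invented.

```python
def best_threshold(scores, labels):
    """Return (threshold, TPR, FPR) maximizing Youden's J = TPR - FPR.

    A unit is predicted to fail when its score is at or above the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        tpr, fpr = tp / n_pos, fp / n_neg
        if best is None or tpr - fpr > best[0]:
            best = (tpr - fpr, thr, tpr, fpr)
    return best[1], best[2], best[3]

# Hypothetical mean damage per training unit and failure label (1 = failed).
damage = [0.9, 0.8, 0.7, 0.6, 0.5, 0.45, 0.4, 0.3, 0.2]
failed = [1, 1, 1, 0, 1, 1, 0, 0, 0]
threshold, pod, fpr = best_threshold(damage, failed)
```

For this toy data the selected threshold is 0.45, well below 1, which detects every failed unit at the cost of one false alarm among the four healthy units.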

Figure 6: Model calibration
To counter these challenges, a model fusion strategy has been adopted (see Figure 7) that combines a damage severity index from a physics model with other unaccounted variables within a machine learning modelling framework. The first step in this process is typically to formulate a physics-based model based on the dominant failure modes identified through historical engineering knowledge or fractographic studies. The damage severity index for the dominant failure modes can then be combined with other unaccounted features through an embedded feature selection algorithm so as to down-select the vital features and thereby arrive at a final model with improved performance. The advantages of this approach are: (1) quantification of the effect of the dominant failure mode on component damage; (2) the ability to handle known interactions of operational parameters within a physics framework; (3) the ability to translate temporal information in operational parameters into a cumulative damage index; and (4) fusion with additional features within a machine learning framework, which helps assess the effect of additional variables that could enhance a physics model prediction. Application of these two steps to the example problem in section 3.2 is presented in the subsequent sections.

MODEL CALIBRATION
For the example problem mentioned in section 3.2, the parameters in the metal temperature prediction model were updated using field failure information. This was done because the uncertainties in the flow-thermal models were higher than those in the structural model. For this, the field data set was first partitioned into a training (22 machines) and a test (16 machines) data set. Thereafter, a sequential quadratic programming (SQP) optimization routine in MATLAB was used to choose optimal parameters in the metal temperature estimation model so as to increase prediction accuracy. Finally, the updated model was tested on the test data set. The box-plots in Figure 8 show a better separation between the damage distributions for the failed and non-failed turbines. This improvement is also reflected in the ROC curve (Figure 9), which now has an AUC of 0.892. Finally, this also results in a significant improvement in the prediction accuracy on the test set (Table 2), with a POD of 66.67%.
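The calibration loop can be illustrated with a deliberately simplified stand-in. The source used an SQP routine in MATLAB; the sketch below substitutes a plain grid search so that it runs with the Python standard library alone. The single tunable cooling-effectiveness parameter `eta`, the cooling model, the exponential damage proxy, the AUC objective and the unit data are all hypothetical.

```python
import math

def metal_temperature(t_gas, t_cool, eta):
    """Hypothetical cooling model: T_m = T_gas - eta * (T_gas - T_cool)."""
    return t_gas - eta * (t_gas - t_cool)

def damage_index(unit, eta):
    """Hypothetical damage proxy that grows exponentially with metal temperature."""
    hours, t_gas, t_cool, _ = unit
    return hours * math.exp((metal_temperature(t_gas, t_cool, eta) - 1100.0) / 25.0)

def pairwise_auc(scores, labels):
    """Fraction of (failed, healthy) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def calibrate_eta(units, nominal=0.5, lo=0.3, hi=0.7, steps=41):
    """Grid search for the eta that best separates failed from healthy units."""
    labels = [u[3] for u in units]
    candidates = [nominal] + [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]

    def objective(eta):
        return pairwise_auc([damage_index(u, eta) for u in units], labels)

    best = max(candidates, key=objective)  # first maximum wins, nominal is tried first
    return best, objective(best)

# Hypothetical training units: (hours, gas temp K, coolant temp K, failed flag).
training_units = [
    (20000.0, 1600.0, 700.0, 1), (24000.0, 1550.0, 650.0, 1),
    (18000.0, 1650.0, 800.0, 1), (26000.0, 1500.0, 700.0, 0),
    (22000.0, 1580.0, 600.0, 0), (30000.0, 1520.0, 750.0, 0),
]
labels = [u[3] for u in training_units]
auc_nominal = pairwise_auc([damage_index(u, 0.5) for u in training_units], labels)
eta_best, auc_best = calibrate_eta(training_units)
```

Because the nominal value is included among the candidates, the tuned parameter can never score worse than the untuned model on the training units; whether the gain carries over must still be checked on held-out machines, as done with the 16-machine test set here.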

MODEL FUSION
The example presented in Section 3.2 has been used to demonstrate the proposed model fusion approach. The calibrated damage values computed in Section 4 for the dataset of 38 machines were used together with a list of additional features identified based on engineering knowledge. These include parameters like: (1) machine operating behavior, such as turbine starts count and turbine trip count; (2) manufacturing vendor, expressed as the fraction of buckets manufactured by Vendor A, Vendor B and Vendor C; and (3) regional environmental parameters, namely dust density, sea salt density, atmospheric chemical constituents' densities (ACC1, ACC2), black carbon mass density and organic carbon mass density. Details of the features used for the analysis are illustrated in the accompanying table. In this analysis, two regularization methods (ridge and LASSO) have been applied within a logistic regression framework so as to arrive at the vital features for failure prediction. In addition, a tree-based classification method has also been demonstrated.
To avoid model overfitting, the dataset of 38 machines has been split, as earlier, into training data (22 machines) for building the model and testing data (16 machines) for reporting model performance. Cross-validation has been adopted during model building to identify the optimal penalty parameter for each shrinkage regression method. The ROC curve developed on the training data has been used to identify optimal thresholds, which were then applied during prediction on the test dataset.

Ridge Classifier
A ridge classifier estimates model coefficients while imposing an $L_2$-norm penalty on the size of the coefficients (Hastie, Friedman, & Tibshirani, 2001). The penalty shrinks the values of the model coefficients while retaining all predictor variables. The extent of shrinkage is controlled by the penalty parameter, whose optimal value was determined using cross-validation. The ridge classifier was built on the training data and the ROC curve was generated as in Figure 10. The AUC is 0.876 indicating a marked improvement from the calibrated model. The trained ridge classifier was then applied on the test dataset (16 machines) and the model metrics are shown in Table 4. This model gives a TPR of 83.33%, an FPR of 20% and an accuracy of 81.25%. Although the accuracy of the ridge classifier is high, the number of predictor variables is also very high. This is because the ridge classifier does not drive the coefficients of insignificant features exactly to zero, so those features are never removed from the model.
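The shrinkage behavior of an $L_2$ penalty can be seen in a small logistic regression fitted by gradient descent. This is a minimal sketch, not the implementation used for the reported results: the two standardized features (read them as a damage index and one environmental variable), the labels, the penalty values and the optimizer settings are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_ridge_logistic(features, labels, penalty, lr=0.5, n_iter=2000):
    """Minimize mean log-loss + (penalty / 2) * ||w||^2 by gradient descent.

    The intercept b is left unpenalized.
    """
    n, p = len(features), len(features[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        errors = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
                  for x, y in zip(features, labels)]
        grad_w = [sum(e * x[j] for e, x in zip(errors, features)) / n + penalty * w[j]
                  for j in range(p)]
        grad_b = sum(errors) / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical standardized features per unit and failure labels (1 = failed).
features = [(-1.2, 0.3), (-0.8, -0.5), (-0.3, 0.9), (0.1, -1.0),
            (0.2, 0.4), (0.7, -0.2), (1.0, 0.8), (1.3, -0.7)]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

w_weak, b_weak = fit_ridge_logistic(features, labels, penalty=0.01)
w_strong, b_strong = fit_ridge_logistic(features, labels, penalty=1.0)
```

Comparing `w_weak` with `w_strong` shows the coefficients shrinking as the penalty grows while none of them becomes exactly zero, which is why a ridge model keeps every predictor.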

LASSO Classifier
A LASSO classifier imposes an $L_1$-norm penalty on the size of the coefficients during model building (Hastie, Friedman, & Tibshirani, 2001). Unlike the ridge classifier, the $L_1$ penalty can shrink the estimates of some model coefficients exactly to zero, which removes the corresponding features from the model. In the analysis, in addition to the damage index, environment variables turned out to be significant. This indicates that environmental parameters also impact the probability of failure. As it is difficult to include failure due to environmental parameters in physics models today, this data fusion method is a good way to incorporate these features. The ROC curve over the training data is shown in Figure 11 and the AUC is 0.926. Table 5 shows the test data metrics for the trained LASSO classifier: a TPR of 83.33%, an FPR of 40% and an accuracy of 81.25%.
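The feature-selecting effect of the $L_1$ penalty can be sketched with proximal gradient descent (ISTA), where each gradient step is followed by soft-thresholding. This is an illustrative stand-in rather than the solver used for the reported results; the data are constructed so that the second feature carries no information about the label, and the penalty and step size are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(value, width):
    """Shrink value toward zero by width; anything inside the band becomes 0."""
    if value > width:
        return value - width
    if value < -width:
        return value + width
    return 0.0

def fit_lasso_logistic(features, labels, penalty, lr=0.5, n_iter=2000):
    """Minimize mean log-loss + penalty * ||w||_1 by proximal gradient descent.

    The intercept b is left unpenalized.
    """
    n, p = len(features), len(features[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        errors = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
                  for x, y in zip(features, labels)]
        grad_w = [sum(e * x[j] for e, x in zip(errors, features)) / n for j in range(p)]
        grad_b = sum(errors) / n
        w = [soft_threshold(wj - lr * g, lr * penalty) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical data: feature 0 is informative, feature 1 is pure noise
# (each informative value appears once with +0.5 and once with -0.5).
features = [(-1.0, 0.5), (-1.0, -0.5), (-0.4, 0.5), (-0.4, -0.5),
            (0.3, 0.5), (0.3, -0.5), (1.1, 0.5), (1.1, -0.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
w_lasso, b_lasso = fit_lasso_logistic(features, labels, penalty=0.1)
```

The fitted `w_lasso` keeps a positive weight on the informative feature and sets the noise feature to exactly 0.0, which is the mechanism by which a LASSO model down-selects predictors.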

Classification and Regression Trees (CART)
Tree-based methods are simple and easy to interpret, but because of their inherent simplicity they are generally less accurate than more advanced supervised learning techniques (Hastie, Friedman, & Tibshirani, 2001). For classification problems, like the problem at hand, classification trees can be used. Tree-based methods segment the predictor space into a number of simple regions. To make a prediction for a given set of independent variables, the region containing it is determined and the output is the majority class in that region. Figure 12 shows the CART model developed on the training data. The test data metrics for the CART model are shown in Table 6. This model gives TPR: 66.67%, FPR: 20%, Accuracy: 75%.
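
The region-and-majority-class idea can be sketched with a single-split tree on one feature. The sketch assumes Gini impurity as the split criterion, which is the usual CART choice but is not stated in the text; the damage values and labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a set of class labels (0.0 for a pure or empty set)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (cut, impurity, left_labels, right_labels) minimising weighted Gini."""
    best = None
    ordered = sorted(set(values))
    for low, high in zip(ordered, ordered[1:]):
        cut = (low + high) / 2.0  # candidate cut midway between neighbouring values
        left = [y for v, y in zip(values, labels) if v <= cut]
        right = [y for v, y in zip(values, labels) if v > cut]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (cut, score, left, right)
    return best


def majority(labels):
    """Most frequent class in a region."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical damage index values and failure labels (1 = failed).
damage = [0.2, 0.3, 0.4, 0.5, 0.7, 0.8, 0.9, 1.1]
failed = [0, 0, 0, 0, 1, 0, 1, 1]

cut, impurity, left, right = best_split(damage, failed)
left_class, right_class = majority(left), majority(right)


def predict(damage_value):
    """Assign the majority class of the region the value falls into."""
    return left_class if damage_value <= cut else right_class
```

A full CART model repeats this split recursively within each region and over all features; the single split is shown only to make the prediction rule concrete.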

Interpretation of Model Fusion
In traditional life assessment methods, a constant damage threshold is identified based on prior understanding of the failure mode and prior laboratory coupon-level testing. In real-life problems, applying the same damage threshold universally to different machines is challenging, because the working environment of the component differs from machine to machine and from that of a laboratory setup. For example, a creep failure mode in a corrosive environment can experience a more pronounced damage progression. Moreover, the physics model may not incorporate the effect of other critical environmental parameters. To account for this, an approach for identifying an optimal threshold is imperative. The proposed model fusion approach attempts to address this by evaluating unit-specific thresholds that take into account the effect of regional environmental parameters. To study this behavior, the ridge classifier model developed in Section 5.1 has been used as an example on two machines that had experienced component failure. The cumulative damage progression over time was computed for both units based on their operational parameter histories. The time histories of the cumulative damage values were then supplied, along with the other environmental parameters, as inputs to the ridge classifier. This resulted in a time series of classifier predictions. Figure 13 illustrates the cumulative damage values over time for the two machines. The time instances where the classifier predicts healthy are plotted in green and the faulty instances are plotted in red. The damage value at the time when the machine transitions from healthy to faulty is thus the threshold damage value for that machine. It can be observed in Figure 13 that machine (b) has a much lower damage threshold than machine (a). This is due to the effect of environmental parameters like dust, atmospheric constituents and sea-salt, and of manufacturing parameters, which also contribute to this failure mode but are not captured in the physics model.
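
The extraction of a unit-specific threshold from the classifier's time series can be sketched as below. The damage histories and classifier outputs are hypothetical values, not the data behind Figure 13, and `damage_threshold` is an illustrative helper name.

```python
def damage_threshold(damage_history, predicted_faulty):
    """Damage value at the first time step the classifier flags the unit as faulty."""
    for damage, faulty in zip(damage_history, predicted_faulty):
        if faulty:
            return damage
    return None  # the classifier never flagged this unit


# Hypothetical cumulative damage histories and per-step classifier outputs
# (0 = predicted healthy, 1 = predicted faulty).
machine_a = {"damage": [0.1, 0.3, 0.5, 0.7, 0.9], "faulty": [0, 0, 0, 0, 1]}
machine_b = {"damage": [0.1, 0.2, 0.3, 0.4, 0.5], "faulty": [0, 0, 1, 1, 1]}

threshold_a = damage_threshold(machine_a["damage"], machine_a["faulty"])
threshold_b = damage_threshold(machine_b["damage"], machine_b["faulty"])
```

In this illustration machine (b) is flagged at a lower damage value than machine (a), mirroring the situation in which environmental inputs push the classifier towards an earlier faulty call for the same physics-based damage.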

CONCLUSIONS
In this work, a novel machine learning based model fusion approach has been demonstrated that combines physics model predictions with other data sources that are difficult to incorporate in a physics framework. A detailed physics model was constructed, which forms the foundation of the approach. In many other works in the literature (Heng, Zhang, Tan, & Mathew, 2009), a direct data approach is chosen instead of a physics model, where temporal parameters are condensed into statistical features to build models. Even though the physics model is cumbersome to construct, the authors believe it offers certain key advantages. Firstly, it consolidates many operational parameters into a damage feature, which is a strong feature (as shown by the model performance). This helps in feature reduction while maintaining an understanding of the physics. In addition, it gives insight into the impact of parameters, which can be difficult to interpret, or may not be understood at all, in a data-based model. It also offers the ability to translate temporal sensor data into a cumulative damage index that evolves over time and can therefore be used to predict remaining useful life (RUL). However, since not all parameters impacting the damage are necessarily captured in a physics model, its accuracy may not be adequate. This necessitates the development of the fusion methodology presented in this work.
The methodology was applied to a gas turbine blade failure example. A creep-based damage accumulation model was used to compute the damage indices for 38 machines based on their operation. The physics model was calibrated using field failure data to improve its accuracy. Further, other unmodelled effects were incorporated using multiple machine learning models.
A comparison of the ROC curves generated for each model is shown in Figure 14. Overall, these approaches help in identifying important unmodelled effects and augment the performance of a physics model.
These features can also be identified for further detailed study that can help in building an enriched physics model.

Figure 2. Parallel Approach in Hybrid Modeling

Figure 3: Framework for developing a physics-based damage accumulation model

Figure 10: Ridge classifier ROC curve

Figure 13. Cumulative damage models for two machines (a) and (b) with variable damage thresholds. A normalized time scale has been used for plotting purposes.

Table 1. Confusion matrix - Pure physics model

Table 2. Confusion matrix - Calibrated physics model

Table 3. Machine learning model features

Shrinkage regression methods have been very commonly used in the literature (Hastie, Friedman, & Tibshirani, 2001) to identify vital features during a model selection process. Common regularization methods include the least absolute shrinkage and selection operator (LASSO), Ridge, Elastic net, Orthogonal matching pursuit and LARS.

Table 6. Confusion matrix - CART classifier

Figure 12: CART Classifier Tree Structure

It can be observed that the predictive capability of the model improves with model calibration and data fusion. The test set prediction results are summarized in Table 7. The Ridge model with 14 input features shows up as the model with the best test set performance among the models. The LASSO model, on the other hand, is the simplest model with the highest failure detection capability.

Figure 14. ROC curves comparison between models

Table 7. Comparison of model metrics on test set