A Methodology for Updating Prognostic Models via Kalman Filters

Prognostic models are built to predict the future evolution of the state or health of a system. Typical applications of these models include predictions of damage (like crack, wear) and estimation of remaining useful life of a component. Prognostic models may be data based, based on known physics of the system or can be hybrid, i.e., built through a combination of data and physics. To build such models, one needs either data from the field (i.e., real-world operations) or simulations/tests that qualitatively represent field observations. Often, field data is not easy to obtain and is limited in its availability. Thus, models are built with simulation or test data and then validated with field observations when they become available. This necessitates a procedure that allows for refinement of models to better represent real-world behavior without having to run expensive simulations or tests repeatedly. Further, a single prognostic model developed for an entire fleet may need to be updated with measurements obtained from individual units. In this paper, we describe a novel methodology, based on the Unscented Kalman Filter, that not only allows for updating such “fleet” models, but also guarantees improvement over the existing model.


INTRODUCTION
Prognostic models are built to make predictions on the future evolution of a system. Applications of these models include prediction of cracks, wear, scrap rate and estimation of remaining useful life in industrial components. These prognostic models can be completely data-driven, based on known physics of the system, or hybrid, i.e., a combination of both data and physics.
Prognostic models can be built using field (real-world) data, if available, or with simulation/test data. Often, field data is not available or may be inadequate (i.e., not enough data points) for model building. In such scenarios, prognostic models are built with simulation data, like Design of Experiments (DOE) data, or test data. These models are then validated and refined with field data when it becomes available. Thus, there exists a requirement for a methodology that allows for model updating even with limited field data. One of the techniques that can be used for model or transfer function updating is the Kalman Filter (Kalman, 1960) and its variants, namely the Extended Kalman Filter (EKF) (Swerling, 1959) and the Unscented Kalman Filter (UKF) (Julier, Uhlmann, & Durrant-Whyte, 1995).
The Extended, Unscented and Ensemble Kalman Filters have been used for history matching and continuous model updating in diverse fields such as Petroleum Engineering (Ning & Oliver, 2005) and Meteorology (Houtekamer & Mitchell, 2001), as well as for gas turbine performance diagnostics (Volponi, Ganguli, & Daguang, 2003). In all these examples, the model being updated represented a specific reservoir or an aircraft engine. The data for updating the model was obtained from a single source (either the reservoir or the engine) and it was ordered in time, i.e., a time series. Thus, the standard Kalman Filter framework is applicable as-is to these problems.
In this paper, we consider the problem of updating a model that represents, not a single entity, but all entities of a class. An example is a model built to predict cracks or scrap percentage in a fleet of gas turbines or aircraft engines. The model parameters are the same for all units of the fleet, with the operational history of the units being the inputs of the model. The predictions of these models are usually verified through inspections during outages or shop visits. The challenge is to update the parameters of such "fleet" models based on field inspection measurements obtained from various units of the fleet. Given that there is no notion of (time) sequence in the measurements, i.e., there is no ordering of data from different units of the fleet, and the fact that these measurements become available in a batch, the conventional Kalman Filter framework does not apply. We present a methodology that modifies the Kalman Filter based model updating framework to allow for updating "fleet" models by utilizing measurements from different entities of the fleet. Further, the methodology is designed to ensure improvement in model accuracy after the update.
The rest of the paper is organized as follows: Section 2 provides a brief review of the Unscented Kalman Filter. Section 3 discusses the problem of updating "fleet" models and describes a novel methodology based on Kalman Filtering. In Section 4, the efficacy of the proposed methodology is illustrated with a simulated and a real-world example. Section 5 provides a summary of the paper.
Throughout the paper, scalars and vectors are represented by lowercase letters and lowercase bold letters, respectively. Furthermore, $I_n$ denotes an $n \times n$ identity matrix. Additionally, the symbol $\mathbb{R}$ refers to the 1-dimensional space over the field of real numbers, $\mathbb{R}^n$ refers to the n-dimensional space over the field of real numbers and $\mathbb{R}^{n \times m}$ refers to the space of real $n \times m$ rectangular matrices (square if $n = m$). The notation $\mathcal{N}(\mu, \sigma)$ represents a normal/Gaussian distribution of mean $\mu$ and standard deviation $\sigma$. The symbol $D$ refers to a compact region of the space of appropriate dimension. Thus $D_x \subset \mathbb{R}^x$ represents a compact region of the variable $x$, which is a subset of the state space of appropriate dimension. The operator $E[x]$ denotes the expectation of the random variable $x$.

KALMAN FILTERS: A BRIEF REVIEW
The availability of all state variables for direct measurement is rare. In physical systems, some components of the state are inaccessible internal variables, which either cannot be measured or require very costly measurement devices. Hence, in most practical scenarios, there is a true need to construct estimates of the unknown state variables from known, albeit noisy, measurements. In the case of systems that are linear in the process and measurement, and which are corrupted by white process noise and measurement noise, the linear Kalman Filter (KF) offers a recursive solution in the sense of minimizing the trace of the error covariance of the system states (Kalman, 1960), (Kalman & Bucy, 1961), (Brown & Hwang, 1992). However, in the case of systems which are inherently nonlinear in either the process or the measurement or both, a straightforward implementation of the KF is not guaranteed to yield optimum results in the sense of minimizing the root mean square of the estimation error. Thus, the need to effectively reconstruct the unknown states of a nonlinear system has promoted research in nonlinear filtering theory (Nijmeijer & Fossen, 1999).
Of the numerous attempts made to develop nonlinear filters, the Extended Kalman Filter was the earliest and the most prevalent approach. The design of the EKF is based on a first order local linearization of the system around the current state estimate at each time step (Eykhoff, 1974), (Jazwinski, 1970), (Daum, 2005). The first ever implementation of an EKF is credited to Peter Swerling (Brookner, 2001), and was called the Swerling Filter for filtering problems (Swerling, 1959). This approach approximates the nonlinear equations by a Taylor series of up to the first order. The well known KF (Kalman, 1960) equations can then be applied to the linearized system to compute the Kalman gain and the covariance matrices. To address the limitations inherent in an EKF (footnote 2), the Unscented Kalman Filter (UKF) (Julier et al., 1995), (Julier & Uhlmann, 1997) was developed, which relies neither on the linearization steps required by the EKF nor on the computation of Jacobian matrices. Instead, the UKF uses a deterministic sampling approach to estimate the mean and covariance with a minimal set of sample points.

Discrete Nonlinear Time Invariant System
Consider the following observable (footnote 3), discrete, nonlinear dynamical system:

$$x_k = f(x_{k-1}, u_{k-1}) + w_{k-1}, \qquad z_k = h(x_k) + v_k \qquad (1)$$

where $f$ is a finite nonlinear mapping of the system states and system input, $h$ is a nonlinear mapping of the system states to the output, $w_k \in D_w \subset \mathbb{R}^w$ denotes the w-dimensional random process noise vector and $v_k \in D_v \subset \mathbb{R}^v$ denotes the v-dimensional random measurement noise vector. The process and measurement noise are assumed to be zero mean, band-limited, uncorrelated, additive, white Gaussian noise processes such that:

$$E[w_k w_j^T] = Q_k \delta_{kj}, \qquad E[v_k v_j^T] = R_k \delta_{kj}, \qquad E[w_k v_j^T] = 0$$

where $Q_k$ is the process noise covariance, $R_k$ is the measurement noise covariance and $\delta_{kj}$ is the Kronecker delta function (discrete equivalent of the Dirac delta function). The Gaussian random variables $w_k$ and $v_k$ are commonly denoted as $w_k \sim \mathcal{N}(0, Q_k)$ and $v_k \sim \mathcal{N}(0, R_k)$ respectively, i.e., $w_k$ and $v_k$ are zero-mean with covariances $Q_k$ and $R_k$ respectively. The initial state of the system in Eq. (1) is assumed to be a Gaussian random vector with mean $\hat{x}_0$ and covariance $P_0$ and can be denoted as $x_0 \sim \mathcal{N}(\hat{x}_0, P_0)$.

Footnote 2: A major limitation of the EKF is that it approximates the expected value of a nonlinear function $f(x)$ as a function of the expected value, i.e., $E[f(x)] \approx f(E[x])$.
Footnote 3: For the definition of observability of nonlinear systems, refer to (Isidori, 1995).

The Unscented Kalman Filter
The Unscented Transform (UT) is a mathematical function used to estimate the result of applying a given nonlinear transformation to a probability distribution that is characterized only in terms of a finite set of statistics. In other words, the UT approximates a Gaussian distribution with a set of deterministically chosen sample points. These points completely capture the mean and covariance of the Gaussian distribution such that, when propagated through a nonlinear transformation map, the transformed points accurately capture the mean and covariance of the new density (due to the transformation) up to the third order. The Unscented Kalman Filter (UKF), which is based on propagating the mean and covariance through the UT (Julier et al., 1995), is a method for calculating the statistics of a random variable that undergoes a nonlinear transformation.
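The idea of propagating deterministically chosen points through a nonlinear map can be illustrated with a minimal sketch. It assumes the commonly used scaled sigma-point set and weights with parameters α, β and κ, and a positive definite covariance; the function name `unscented_transform` and its signature are illustrative and are not taken from the paper.

```python
import numpy as np

def unscented_transform(mean, cov, g, alpha=1.0, beta=2.0, kappa=0.0):
    """Estimate the mean and covariance of g(x) for x ~ N(mean, cov).

    Illustrative sketch: uses the standard scaled sigma-point set and
    assumes `cov` is positive definite so that Cholesky succeeds.
    """
    mean = np.asarray(mean, dtype=float)
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky(cov)              # cov = S @ S.T
    offsets = np.sqrt(n + lam) * S.T         # row i = scaled i-th column of S
    pts = np.vstack([mean, mean + offsets, mean - offsets])  # 2n+1 points

    # Weights for the mean (wm) and the covariance (wc).
    wm = np.full(2 * n + 1, 0.5 / (n + lam))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)

    # Propagate every sigma point through the nonlinear map.
    ys = np.array([np.atleast_1d(g(p)) for p in pts])
    y_mean = wm @ ys
    d = ys - y_mean
    y_cov = (wc[:, None] * d).T @ d
    return y_mean, y_cov
```

For a linear map the transform reproduces the exact mean and covariance, which is a convenient sanity check before using it with a nonlinear function.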
Consider the observable, discrete, nonlinear dynamical system in Eq. (1). Initialize the state and state error covariance estimates to $\hat{x}_0$ and $P_0$, respectively. Let $n$ denote the state dimension. For each time index $k \ge 1$, the typical UKF predict-update steps are:

1. Generate sigma points $\xi_{k-1}$, based on $\hat{x}_{k-1}$ and $P_{k-1}$, as:

$$\xi_{k-1}^{0} = \hat{x}_{k-1}, \qquad \xi_{k-1}^{i} = \hat{x}_{k-1} + \gamma \left(\sqrt{P_{k-1}}\right)_i, \qquad \xi_{k-1}^{i+n} = \hat{x}_{k-1} - \gamma \left(\sqrt{P_{k-1}}\right)_i$$

where $i = 1, \ldots, n$, $\gamma = \sqrt{n + \lambda}$ and $\lambda = \alpha^2 (n + \kappa) - n$ such that $10^{-4} \le \alpha \le 1$, where $\alpha$ determines the spread of the sigma points around the mean, $\kappa$ is a secondary scaling parameter and $\left(\sqrt{P_{k-1}}\right)_i$ denotes the i-th column (or i-th row transpose) of the matrix $\sqrt{P_{k-1}}$. If $P_{k-1}$ is positive definite, the Cholesky decomposition can be used to obtain the square root. If $P_{k-1}$ is positive semi-definite, perform an eigendecomposition to identify the eigenvalues that are 0, reset those eigenvalues alone to a very small positive number (e.g., $10^{-8}$), regroup the eigenvalue and eigenvector matrices to obtain $P_{k-1}$ and then use the Cholesky decomposition.

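2. At time index $k$, PREDICT:

Sigma points as:

$$\xi_{k|k-1}^{i} = f(\xi_{k-1}^{i}, u_{k-1}), \qquad i = 0, \ldots, 2n$$

State estimate $\hat{x}_k^-$ as:

$$\hat{x}_k^- = \sum_{i=0}^{2n} W_i^{(m)} \xi_{k|k-1}^{i}$$

State error covariance matrix as:

$$P_k^- = \sum_{i=0}^{2n} W_i^{(c)} \left(\xi_{k|k-1}^{i} - \hat{x}_k^-\right)\left(\xi_{k|k-1}^{i} - \hat{x}_k^-\right)^T + Q_{k-1}$$

where $W_0^{(m)} = \lambda / (n + \lambda)$, $W_0^{(c)} = \lambda / (n + \lambda) + (1 - \alpha^2 + \beta)$, $W_i^{(m)} = W_i^{(c)} = 1 / (2(n + \lambda))$ for $i = 1, \ldots, 2n$, and $\beta$ is used to incorporate prior knowledge of the distribution of $x$ ($\beta = 2$ is optimal for Gaussian distributions).

Recalculate the sigma points based on $\hat{x}_k^-$ and $P_k^-$ to incorporate the effect of process noise as:

$$\xi_{k}^{0} = \hat{x}_k^-, \qquad \xi_{k}^{i} = \hat{x}_k^- + \gamma \left(\sqrt{P_k^-}\right)_i, \qquad \xi_{k}^{i+n} = \hat{x}_k^- - \gamma \left(\sqrt{P_k^-}\right)_i$$

Measurement as:

$$\zeta_k^{i} = h(\xi_k^{i}), \qquad \hat{z}_k^- = \sum_{i=0}^{2n} W_i^{(m)} \zeta_k^{i}$$

3. At time index $k$, UPDATE:

Innovation covariance as:

$$P_{zz} = \sum_{i=0}^{2n} W_i^{(c)} \left(\zeta_k^{i} - \hat{z}_k^-\right)\left(\zeta_k^{i} - \hat{z}_k^-\right)^T + R_k$$

Cross covariance between $\hat{x}_k^-$ and $\hat{z}_k^-$ as:

$$P_{xz} = \sum_{i=0}^{2n} W_i^{(c)} \left(\xi_k^{i} - \hat{x}_k^-\right)\left(\zeta_k^{i} - \hat{z}_k^-\right)^T$$

State estimate as:

$$\hat{x}_k = \hat{x}_k^- + K_k \left(z_k - \hat{z}_k^-\right), \qquad K_k = P_{xz} P_{zz}^{-1}$$

State error covariance matrix as:

$$P_k = P_k^- - K_k P_{zz} K_k^T$$

Remark 2.1 Although, theoretically, the matrix $P_k$ is symmetric, for real data this condition might be violated. Therefore, after every update step it is necessary to enforce symmetry as

$$P_k = \tfrac{1}{2}\left(P_k + P_k^T\right)$$

Remark 2.2 For details on the iterative UKF, refer to (Zhan & Wan, 2007).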
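The update step and the two practical safeguards mentioned above (the square root of a semi-definite covariance and the symmetry constraint of Remark 2.1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `safe_cholesky` and `ukf_update` are hypothetical, the weights are the standard scaled ones, and eigenvalues below `eps` (rather than exactly zero) are reset.

```python
import numpy as np

def safe_cholesky(P, eps=1e-8):
    """Cholesky factor of P; repairs (near-)zero eigenvalues if needed."""
    P = 0.5 * (P + P.T)
    try:
        return np.linalg.cholesky(P)
    except np.linalg.LinAlgError:
        # Semi-definite case: lift small eigenvalues to eps, regroup, retry.
        vals, vecs = np.linalg.eigh(P)
        vals = np.where(vals < eps, eps, vals)
        P_fixed = (vecs * vals) @ vecs.T
        return np.linalg.cholesky(0.5 * (P_fixed + P_fixed.T))

def ukf_update(x_pred, P_pred, z, h, R, alpha=1.0, beta=2.0, kappa=0.0):
    """One UKF measurement update from the predicted mean and covariance."""
    x_pred = np.asarray(x_pred, dtype=float)
    R = np.atleast_2d(R)
    n = x_pred.size
    lam = alpha**2 * (n + kappa) - n
    offsets = np.sqrt(n + lam) * safe_cholesky(P_pred).T
    xi = np.vstack([x_pred, x_pred + offsets, x_pred - offsets])

    wm = np.full(2 * n + 1, 0.5 / (n + lam))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)

    # Predicted measurement and its statistics.
    zeta = np.array([np.atleast_1d(h(p)) for p in xi])
    z_pred = wm @ zeta
    dz = zeta - z_pred
    dx = xi - x_pred
    P_zz = (wc[:, None] * dz).T @ dz + R       # innovation covariance
    P_xz = (wc[:, None] * dx).T @ dz           # cross covariance

    K = np.linalg.solve(P_zz, P_xz.T).T        # K = P_xz @ inv(P_zz)
    x_new = x_pred + K @ (np.atleast_1d(z) - z_pred)
    P_new = P_pred - K @ P_zz @ K.T
    return x_new, 0.5 * (P_new + P_new.T)      # enforce symmetry
```

For a linear measurement map the result coincides with the linear Kalman Filter update, which gives a simple way to check an implementation.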

UPDATING MODELS WITH KALMAN FILTERS
In this section, we describe the methodology for updating models with the Unscented Kalman Filter. The model of the measured output is given in Eq. (10):

$$z = h(\Theta, u) \qquad (10)$$

The function $h$ could be as simple as a single equation, a system of equations, or a series of computational steps resulting in the output $z$. The parameter vector $\Theta$, given by

$$\Theta = [\Theta_1, \Theta_2, \ldots, \Theta_n]^T$$

represents the coefficients associated with various terms in the model and $u$ represents the inputs to the model. The inputs, for example, could be temperature, stress, pressure or any other measured input to the system. The purpose of the update methodology is to modify the values of the coefficients, utilizing the actual measurements of the output and the model predictions, in a Kalman Filter framework. To achieve this, we postulate the parameter vector $\Theta$ to represent the "states" of the filter. The states are modeled as a random walk process and hence we obtain the process and measurement equations as:

$$\Theta_k = \Theta_{k-1} + w_{k-1}, \qquad z_k = h(\Theta_k, u_k) + v_k \qquad (12)$$

While such a formulation is standard for model updating with Kalman Filters, it should be noted that the procedure described in the literature cannot be applied as-is to updating "fleet" models. This is due to the fact that there is no ordering when measurements are obtained from different units of the fleet. Therefore, one can easily end up with a completely different updated model if the order of measurements used for updating is changed.
Also, with the standard procedure, the model gets updated to minimize the error between the model prediction and the measurement at that point (in the "fleet" case, the measurement obtained from a particular unit). This is acceptable as long as the latest measurement represents the most current information obtained about the system. Since this is not the case with measurements used for updating "fleet" models, the standard procedure may result in an updated model that is worse than the original model in terms of overall performance against all measurements. Therefore, there is a need for a methodology that ensures an improvement in prediction accuracy for the updated model when compared to the original model.
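In the proposed methodology, the available data is split into training and testing data sets. Then, the training data set is permuted M times to create different training data sets. The M training data sets contain the same measurement points, but the ordering in each set is different. The original model is updated with each of the permuted training data sets to generate M updated models. To guarantee that an updated model is always better (or at least no worse) than the original/previous model, the original/previous model is updated with a measurement only if the updated model performance (for example, in terms of overall average error computed across all training data) is better than that of the previous model. A limited comparison of this approach can be made with stepwise regression. In stepwise regression, an existing model is modified at each step, by adding a term in the case of forward stepwise or by removing a term in the case of backward stepwise, only if such an addition or deletion improves the existing model. Similarly, in our "stepwise" model updating approach, an existing model is modified at a step (in this case, a measurement in the training set) only if the modification results in a more accurate model. Otherwise, the modification is discarded and the existing model is retained. Such a selective updating process can be effective in mitigating the effect of excessively noisy measurements and outliers in the data. This process is depicted in Figure 1. Each of the M updated models is validated with the test data set and the model that provides the least test error is chosen as the updated model. Figure 2 provides an illustration of the overall procedure.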
The salient aspects of the model update procedure can be summarized as:
• Separation of data for training (updating) and testing
• Permutations of training data to generate multiple updated models
• Monotonic reduction in error during the update
• Validation with test data to select the best updated model
• Guarantee that the updated model is no worse than the original model (across all data).
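The procedure summarized above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the authors' code: `update_fn` is a hypothetical callable that performs one filter update (for example, a UKF step) and returns the candidate coefficients and filter state, the error metric is the average absolute error, a rejected update discards the candidate filter state as well, and the original model is kept as a fallback if no updated model beats it on the test data.

```python
import numpy as np

def mean_abs_error(theta, model, U, z):
    """Average absolute prediction error of `model` over a data set."""
    preds = np.array([model(theta, u) for u in U])
    return float(np.mean(np.abs(preds - np.asarray(z))))

def selective_update(theta0, state0, model, update_fn,
                     U_train, z_train, U_test, z_test, M=5, seed=0):
    """Permute the training data M times and keep only improving updates."""
    rng = np.random.default_rng(seed)
    best_theta = np.array(theta0, dtype=float)
    best_test = mean_abs_error(best_theta, model, U_test, z_test)

    for _ in range(M):
        theta, state = np.array(theta0, dtype=float), state0
        err = mean_abs_error(theta, model, U_train, z_train)
        for i in rng.permutation(len(z_train)):
            cand_theta, cand_state = update_fn(theta, state,
                                               U_train[i], z_train[i])
            cand_err = mean_abs_error(cand_theta, model, U_train, z_train)
            # Accept the update only if the overall training error drops.
            if cand_err < err:
                theta, state, err = cand_theta, cand_state, cand_err
        # Validate this run's model on the held-out test data.
        test_err = mean_abs_error(theta, model, U_test, z_test)
        if test_err < best_test:
            best_theta, best_test = theta, test_err
    return best_theta, best_test
```

Because an update is accepted only when the overall training error decreases, the training error of each run is non-increasing by construction, regardless of which filter is plugged in as `update_fn`.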

RESULTS AND DISCUSSION
In this section, the analysis and results of applying the update methodology to a simulated and a real-world example are presented. The simulated example is a non-linear model with two inputs and an output. The real-world example is a prognostic model for scrap rate prediction in an industrial gas turbine. It is noted that measurements from several individual gas turbines are used for updating this model. A model of the form given in Eq. (13) is considered for updating with the Unscented Kalman Filter,
where $x_1$ and $x_2$ are the inputs and $z$ is the output.
The model coefficients $\Theta_i$, $i = 1, 2, 3, 4$, form the parameter vector $\Theta$. It is noted that the model is non-linear both in the coefficients as well as the inputs. The initial model is represented as $\Theta_0$. The coefficient values of the initial model are provided in Table 1.

Generating Data for Update
This section describes the process by which the data for updating the initial model was generated. The model parameters used for the simulation are given in Table 2.
The inputs, $x_1$ and $x_2$, were sampled randomly from a uniform distribution. A total of 35 data points were generated for $x_1$ and $x_2$. The model output is then obtained with the generated inputs as per Eq. (13). Additive White Gaussian Noise (AWGN) is then added to the model output to simulate the effect of measurement noise. This noisy output is treated as the "actual" measurement $z$. 25 out of the 35 data points were corrupted with zero-mean AWGN with a standard deviation of 0.01. The remaining 10 points were corrupted with zero-mean AWGN with a standard deviation of 0.1. This is done to explore the robustness of the proposed update methodology vis-a-vis the standard Kalman Filter approach with respect to excessively noisy measurements and outliers. The $x_1$, $x_2$ and $z$ values corresponding to the data set are provided in Table 3. Out of the 35 data points, 80% of the data (28 points) were randomly chosen as the training data. The remaining 7 points constitute the test data. The training data and test data remain the same for all the experimental results presented subsequently in this paper.
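The data generation described above can be sketched as follows. Several details are assumptions made purely for illustration: the model `hypothetical_model` is a stand-in and is not the model of Eq. (13), the uniform sampling range [0, 1] is a placeholder, and which 10 points receive the larger noise is chosen at random.

```python
import numpy as np

def hypothetical_model(theta, x1, x2):
    """Placeholder nonlinear model (NOT Eq. (13)); four coefficients."""
    return theta[0] * np.exp(theta[1] * x1) + theta[2] * x2 ** theta[3]

def generate_update_data(theta_true, n_points=35, n_noisy=10,
                         sd_low=0.01, sd_high=0.1, train_frac=0.8, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed input range; the range used in the paper is not given here.
    x1 = rng.uniform(0.0, 1.0, n_points)
    x2 = rng.uniform(0.0, 1.0, n_points)
    clean = hypothetical_model(theta_true, x1, x2)

    # 25 points get low-noise AWGN, 10 points get high-noise AWGN.
    sd = np.full(n_points, sd_low)
    noisy_idx = rng.choice(n_points, size=n_noisy, replace=False)
    sd[noisy_idx] = sd_high
    z = clean + rng.normal(0.0, sd)

    # Random 80/20 split into training and test indices.
    perm = rng.permutation(n_points)
    n_train = int(round(train_frac * n_points))
    return x1, x2, z, perm[:n_train], perm[n_train:], noisy_idx
```

Fixing the seed keeps the training and test sets identical across experiments, mirroring the statement that the same split is used for all results.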

Updating with Standard UKF
The plot of predictions of the initial model Θ 0 , for the update data set, against the "actual" values (obtained through simulation that was described earlier) is shown in Figure 3.
From the figure, it is obvious that the initial model predictions are not very accurate. The prediction bounds are shown merely to make the data easier to interpret. The average absolute error for the training data was found to be 0.1649 and that for the test data was found to be 0.1989. Also, 34/35 predictions lie beyond the prediction bounds. Therefore, the accuracy of this model can be improved with an update. Further, it is also desirable for the updated values of the model coefficients to be close to the "true" values given in Table 2.
To begin with, the initial model is updated with a standard UKF, i.e., no multiple runs with shuffled training data and no checking on whether an update with a measurement improves overall accuracy. The parameters to be updated ("the states") are modeled as a random walk process as in Eq. (12). The process noise covariance $Q$ was assumed to be a diagonal matrix with a value of 1e-4 for the diagonal elements. The measurement noise variance $R$ is assumed to be 1e-4. The UKF parameters are chosen as $\alpha = 1$, $\beta = 2$ and $\kappa = 0$. The number of iterations of the filter is chosen to be 1. The training error was found to be 0.0767 and the test error was found to be 0.1113. The significant difference between training and test error suggests that the updated model obtained through this process may not be optimal.
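As a worked illustration of these parameter choices, the scaling constants can be evaluated for the four-coefficient model ($n = 4$). The weight expressions below are the standard scaled unscented transform weights and are assumed here rather than quoted from the paper.

```latex
\begin{align*}
\lambda &= \alpha^2 (n + \kappa) - n = 1^2 (4 + 0) - 4 = 0, \\
\gamma  &= \sqrt{n + \lambda} = \sqrt{4} = 2, \\
W_0^{(m)} &= \frac{\lambda}{n + \lambda} = 0, \qquad
W_0^{(c)} = \frac{\lambda}{n + \lambda} + (1 - \alpha^2 + \beta) = 2, \\
W_i^{(m)} &= W_i^{(c)} = \frac{1}{2(n + \lambda)} = \frac{1}{8},
\qquad i = 1, \ldots, 8 .
\end{align*}
```

With these values there are $2n + 1 = 9$ sigma points, the mean weights sum to one, and the central point contributes nothing to the predicted mean.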
The plot of the average error, at each update step, is depicted in Figure 5. The average error, at an update step, is calculated by predicting on all data with the updated model obtained at that step. Though the final error of 0.0767 is lower than the initial model error of 0.1649, it can be observed that the error does not decrease monotonically as more data is utilized for the update. This is due to the fact that, at each update step, the model is being modified so as to reduce the error between prediction and actual value at that particular data point. Thus, an updated model need not necessarily be more accurate than the previous model across all data.
Further, the significant increases in error, at update step 11 and between update steps 24 and 27, suggest that the standard UKF based model update is highly sensitive to noisy data. A few outliers in the data can have a significant impact on the filter estimates. In fact, it is not inconceivable that the final updated model may actually be worse than the initial model. Finally, the coefficients of the updated model along with the "true" values are presented in Table 4. It can be observed that while the update did improve the prediction accuracy, the updated model coefficients are quite different from the "true" values. The average absolute error between the "true" values and the updated model coefficients was found to be 0.0885.

Updating with Standard UKF -Multiple Runs
It is possible to improve the results obtained with a single run of the standard UKF by shuffling the training data, generating multiple models and then selecting the model that performs best on the test data. To illustrate this, M = 5 shuffles were performed on the training data and an updated model was generated with each of the M permuted training data sets.
Figure 6 shows the average errors obtained at each update step for all the runs.
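The test errors for each of the runs are given in Table 5. The lowest test error of 0.0326, obtained from run 2, is smaller than the value of 0.1113 obtained earlier. The training error for this run was 0.0329. The coefficients of the best model, obtained from run 2, are presented in Table 6.

Next, we present the results of updating the nonlinear model with the methodology described in Section 3. The Unscented Kalman Filter parameters, namely $\alpha$, $\beta$ and $\kappa$, the number of iterations, the process covariance $Q$ and the measurement variance $R$ are the same as those used in Section 4.1.2. The training and test data used are also the same as those used in the case of the standard UKF.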
To illustrate the efficacy of the proposed methodology, only a single run (i.e., M = 1) is performed on the training data. The predictions of the updated model are shown in Figure 7.
It can be observed from Figure 7 that the updated model predictions are significantly more accurate than those of the initial model. Only 5/28 training data predictions and 1/7 test data predictions are outside the prediction bounds. The training error was found to be 0.0297 and the test error was found to be 0.0309, which are lower than the respective values reported in the previous sections.
Figure 7. Updated model: actual vs. prediction
The plot of the average error, at each update step, is depicted in Figure 8. It can be observed that the error is non-increasing as more data is utilized for the update. This is due to the fact that in the proposed approach, at each update step, the model is modified if and only if the update leads to an overall decrease in error across all data. Thus, the updated model, at any given update step, is guaranteed to be no worse than the initial model when compared with all data. Finally, the coefficients of the updated model along with the "true" values are presented in Table 7. It can be observed that the updated model coefficients are quite similar to the "true" values. The average absolute error between the "true" values and the updated model coefficients was found to be 0.0056, much less than the error of 0.0885 obtained in the case of a standard UKF and 0.0282 obtained with multiple runs of the standard UKF. It can be observed that while the performance of the standard UKF with multiple runs and the proposed approach (single run) are similar in terms of prediction error on test data, the proposed methodology is superior in estimating the coefficients of the underlying model. While, in this example, superior results were obtained with a single run of the proposed model update methodology, it is suggested to use an M value greater than 1 in practical applications.
Even though the data for the update came from a single source (i.e., simulation with a set of model parameter values), this example illustrated the robustness of the proposed approach with respect to excessively noisy measurements and outliers. Thus, the proposed update methodology can be more effective than the standard approach for updating models. Metrics for all three approaches discussed so far are summarized in Table 8.

Updating a Damage Model
A model that tracks the damage in a part was built to predict the fraction of the overall fielded components expected to be declared as scrap at the time of inspection.A component is deemed "scrap" if there exists a defect of size greater than a certain limit.
Blades, nozzles and shrouds in high pressure turbine sections of a gas turbine are prime examples of parts that require accurate damage models. This is because these parts experience the harshest conditions in the gas turbine in terms of high temperatures and pressures. Damage to these parts is thus a function of the operating conditions of the gas turbine as well as the variations in design parameters such as material properties and manufacturing variability.
The damage model is a hierarchical model that takes the measured operational inputs, such as airflow pressures and temperatures, and computes the damage as a function of cycles and stresses in specific locations. The stresses are in turn computed as a function of thermal and mechanical loads on the part. The thermal and mechanical loads are in turn functions of the operational inputs. Thus, the hierarchy of a damage model can get very complex depending on the part and the operational characteristics. Since details of all the constituent models and parts are proprietary, we have represented the model mathematically as a collection of coefficients ($\Theta$) and operational parameters. Although the proprietary restrictions prevent us from sharing the underlying models and data sets, we hope that the details presented below throw light on the applicability of the techniques developed here for real-world non-linear applications.
The scrap rate $z$ is modeled as a function of $T_h$ and $T_c$, the hot and cold temperatures experienced by the component, respectively, $H$, the hours of operation, $N$, the number of cycles, and $\Gamma$, which represents the material properties. A Bayesian estimation was performed to compute the initial coefficients of the model. The model predicts the distribution of crack lengths as a function of operating cycles. The model predictions are distributions because the coefficients are computed through Bayesian estimation, accounting for variability and uncertainty in the system. This predicted damage distribution was then used with a damage threshold to compute the predicted scrap rate. All results presented in this section are provided in terms of a scaled version of scrap rate. The median values of the posterior distribution are used as the initial values of the model coefficients in the UKF.
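The step from a predicted damage distribution to a scrap rate can be illustrated with a minimal sketch. It assumes that the predicted distribution is available as a set of crack-length samples and that the scrap rate is the fraction of samples exceeding the threshold; the function name and the sample-based representation are illustrative assumptions, and no scaling is applied.

```python
import numpy as np

def scrap_rate(crack_length_samples, damage_threshold):
    """Fraction of predicted crack lengths that exceed the threshold.

    Illustrative only: a component is counted as scrap when its predicted
    defect size is greater than `damage_threshold`.
    """
    samples = np.asarray(crack_length_samples, dtype=float)
    return float(np.mean(samples > damage_threshold))
```

For example, `scrap_rate([0.1, 0.5, 0.9, 1.2], 0.8)` counts two of the four samples above the threshold.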
The parameter vector $\Theta$ represents the coefficients of the model. An initial model ($\Theta_0$) was developed with data obtained from 14 industrial gas turbines. The model coefficients are given in Table 9.

The model predictions versus the actual scrap, for the data set used for model building, are depicted in Figure 9. The prediction bounds are derived from acceptable error limits for scrap prediction. These limits are proprietary. The normalized scrap rate and the model predictions are always greater than zero. The parallel prediction bounds are shown for the purpose of easier interpretation. It can be observed that over 85% of the predictions are within the prediction bounds. Only 2 out of 14 points have an error beyond the prediction limits.
The average absolute prediction error, for the scaled scrap rate, was found to be 0.0080. Predictions were then obtained with this model for a new data set comprising 13 industrial gas turbines. Figure 10 shows the plot of actual scaled scrap vs. model predictions. It can be observed that almost 50% (6/13) of the predictions have an error beyond the prediction limits. The average absolute error was found to be 0.0229.
The initial model is then updated with the new data to improve its prediction accuracy. The model parameters are modeled as a random walk process as in Eq. (12). To illustrate the efficacy of the procedure described in Section 3, the model is first updated with a standard Unscented Kalman Filter. The updated model predictions are provided in Figure 11.
It can be seen from Figure 11 that the performance, in terms of average error, of the updated model is worse than that of the original model shown in Figure 10. This is due to the fact that in a standard UKF based update of "fleet" models, there is no guarantee that an update with a measurement leads to an overall improvement in performance. The overall average absolute error, calculated at each update step, is plotted in Figure 12. It can be observed that an update can lead to an increase in error, across all points, even if it reduces the error for that particular measurement.
When the model is instead updated with the methodology described in Section 3, it can be observed that the updated model is more accurate than the original model in predicting scrap rate. The number of predictions with error beyond the prediction limits has been reduced from 6 to 4 and the average absolute error has been reduced from 0.0229 to 0.0185. The updated model coefficients are provided in Table 10.

CONCLUSION
This paper presented a Kalman Filter based methodology to update prognostic models. Unlike prior approaches, where unit-specific models (applicable to a single reservoir or an aircraft engine) were updated with sensor measurements, the problem of updating "fleet" models (applicable to all entities of a fleet) was considered in this paper. The various challenges involved in updating such "fleet" models were discussed and a methodology for mitigating the challenges was proposed. In this approach, the standard Kalman Filter based model updating framework is adapted so that a portion of the data is randomly chosen for training the filter while the remainder is used for testing purposes. Further, the methodology guarantees non-decreasing accuracy by choosing to update the model coefficients with new data only if the updated model outperforms the current model in terms of reducing the error across all measurements. Also, in an effort to ensure robustness, the training data is permuted several times and an updated model is obtained with each permuted set. Finally, the best among these updated models is chosen through validation with the test data. The efficacy of the proposed update methodology was demonstrated through its application to a simulated non-linear model and to a real-world problem of scrap rate prediction in an industrial gas turbine. While the train-test approach was employed for selecting the best updated model in this paper, a k-fold cross-validation approach could potentially yield better results.

Figure 1. Model update algorithm

Figure 5. Error at each update step - standard UKF

Figure 8. Error at each update step - proposed methodology

Figure 9. Initial model: actual vs. prediction, training data

Figure 10. Initial model: actual vs. prediction, new data

Figure 12. Error at each update step - standard UKF

Figure 13. Updated model: actual vs. prediction, new data

Table 1. Initial model: $\Theta_0$

Table 3. Data for model update

Figure 6. Error at each step - std UKF, multiple runs

Table 8. Summary of metrics