An Inference-based Prognostic Framework for Health Management of Automotive Systems

This paper presents a unified data-driven prognostic framework that combines failure time data, static parameter data and dynamic time-series data. The framework employs a proportional hazards model and a soft dynamic multiple fault diagnosis algorithm to infer the degraded state trajectories of components and to estimate their remaining useful life. The framework takes into account cross-subsystem fault propagation, a case prevalent in any networked and embedded system. The key idea is to use the Cox proportional hazards model to estimate the survival functions of error codes and symptoms (probabilistic test outcomes/prognostic indicators) from failure time data and static parameter data, and to use them to infer the survival functions of components via the soft dynamic multiple fault diagnosis algorithm. The average remaining useful life and its higher-order central moments (e.g., variance, skewness, kurtosis) can be estimated from these component survival functions. The framework is demonstrated on datasets derived from two automotive systems, namely a hybrid electric vehicle regenerative braking system and an electronic throttle control subsystem simulator. Although the proposed framework is validated on automotive systems, it has the potential to be applicable to a wide variety of systems, ranging from aerospace systems to buildings to power grids.


INTRODUCTION
Conventional maintenance strategies, such as reactive and preventive maintenance, are inadequate for meeting the high-availability needs of complex automotive systems. What is needed is a continuous monitoring and early warning capability that detects, isolates and estimates component degradations (fault diagnosis and prognosis) over time, thereby minimizing downtime and operational costs via condition-based maintenance. Failure prognosis, an add-on capability to diagnosis, involves forecasting system degradation and time-to-failure (remaining useful life) based on "state awareness" gained from monitored data, for example, parameters collected from various sensors, such as vehicle speed, individual wheel speeds, yaw rate, and master brake cylinder pressure.
The existing time-series based approaches to prognostic health management are component-centric and do not make use of widely available data in archived databases of vehicle equipment, such as historical usage patterns, error codes (i.e., codes that are recorded by onboard software units in case of a fault or malfunction of the component), observed failure modes, repair and inspection intervals, environmental factors, skill levels of personnel, and status parameters collected periodically or at the onset of error codes. Examples of status parameters include operating parameter identifier data collected periodically or when an error code is recorded. Consequently, the time-series based prognostic health management approaches are both incomplete and inaccurate for coupled systems with cross-subsystem fault propagation. In coupled systems, cross-subsystem fault propagation arises because, when multiple components interact with each other, a fault in one component propagates to other components, i.e., degradation in one component affects the others and new faults emerge as a result of interactions among multiple components. For example, interactions with a degraded battery would eventually damage the starter. Therefore, individual component-based prognostic algorithms are inadequate, and a system-level prognostic algorithm that takes into account the fault propagation effects due to coupling/interactions among the components is needed.
On the other hand, the classical survival theory-based approaches and reliability-based methods (Jardine & Tsang, 2005; Murthy, Xie, & Jiang, 2004) use only failure time data to estimate the time-to-failure distribution. These methods rely on Weibull and other nonlinear regression models to infer time-to-failure, and these estimates are used to optimize the time-to-maintain or time-to-repair/replace. These techniques do not consider the actual condition of a specific component and therefore result in large variability in the time-to-failure estimates. Evidently, the two disparate methodologies, viz., prognostic health management techniques based on dynamic time-series data and survival theory-based techniques using archived data, need to be reconciled and unified under a common modeling framework that can work with all three types of data (see Fig. 1) listed below: (a) archived failure data (Type I data): age (or a surrogate function such as the mileage or operational time) of the vehicle at the time of failure, i.e., age when an error code or symptom is observed, or a component is replaced; (b) static environmental and status parameter data (Type II data); and (c) dynamic data (Type III data): time-series data and periodic status data.
In this paper, an integrated approach that seamlessly combines all three types of data to infer the component degradations and to estimate their remaining useful life (RUL) is presented. The framework employs two key techniques: (i) the Cox proportional hazards model (Cox PHM) (Kalbfleisch & Prentice, 2002), and (ii) the soft dynamic multiple fault diagnosis (soft DMFD) inference algorithm (Singh, Kodali, & Pattipati, 2009). The Cox PHM computes the survival functions of tests (or error codes), whereas the soft DMFD algorithm is used to infer failing components in coupled systems. The soft DMFD algorithm determines the most likely evolution of component states that best explains the observed soft test failure outcomes (i.e., complementary test survival probabilities). Here, a soft Viterbi algorithm is employed to decode the most likely probabilistic evolution of the fault sequence. The prognostic framework is discussed in detail in the subsequent sections, and the capabilities of the framework are illustrated on two datasets: (i) a dataset derived from an automotive electronic throttle control (ETC) system simulator with failure time data, static parameter data, and simulated test outcomes; and (ii) a dataset derived from an automotive regenerative braking system (RBS) with failure time data, and static as well as dynamic parameter data obtained from simulation-based fault injection experiments conducted in MATLAB®/Simulink®. The proposed framework is modular, leading to a flexible and evolvable software architecture for prognostic health management.
Figure 1. Categories of Data Available for Prognostics
The paper is organized as follows. An overview of the existing prognostic methods is given in Section 2, and Section 3 presents the proposed prognostic framework to estimate the component degradations. A brief discussion of the soft DMFD inference algorithm is presented in Section 4, and the experimental results from two automotive applications are discussed in Section 5. Finally, Section 6 concludes the paper with a summary.

PREVIOUS WORK
Prognosis is a salient component of condition-based maintenance (CBM) of systems. Prognostic methods can be broadly classified into two approaches: model-based and data-driven (Chiang, Russel, & Braatz, 2001). These methods are discussed in detail in the following subsections.

Model-based methods
The model-based methods are applicable to systems where accurate mathematical models of the system and an adequate number of sensors to observe the state of the system are available. The model-based methods are also termed physics-based or physics-of-failure-based models in the literature. They employ statistical estimation techniques to track residuals generated using observers (e.g., Kalman filters, reduced-order unknown input observers, interacting multiple models, particle filters) and parity relations (dynamic consistency checks among measured variables) in order to provide an estimate of the accumulated damage and assess the remaining life (Luo, Namburu, Pattipati, Qiao, Kawamoto, & Chigusa, 2003).
Several methods have been proposed based on damage accumulation models. Ray and Tangirala (1996) presented a nonlinear stochastic model of fatigue crack dynamics for damage rate computation in mechanical structures. Adams (2002) proposed to model damage accumulation in a structural dynamic system as first/second order nonlinear differential equations. Serrao, Onori, Rizzoni, and Guezennec (2009) proposed a damage accumulation model for automotive batteries by taking into account the deterioration of physical or functional parameters, for instance, battery state-of-charge, operating temperature, as well as load profile and its impact on the battery aging process. Chelidze, Cusumano, and Chatterjee (2002) modeled degradation as a "slow-time" process, which is coupled with a "fast-time" observable subsystem. Luo et al. (2003) developed a prognostic process based on data collected from model-based simulations under normal and degraded conditions; interacting multiple models were used to track the hidden damage. Daigle, Matthew, and Goebel (2011), Celaya, Saxena, and Goebel (2012), and Byington, Watson, Edwards, and Stoelting (2004) are some examples of degradation modeling and physics-based approaches.
The main advantage of a model-based approach is its ability to incorporate a physical understanding of the degradation process into the process monitoring scheme. However, it is difficult to apply the model-based approach to large-scale systems because it requires detailed analytical models in order to be effective.

Data-driven methods
The data-driven approaches to prognostics are derived directly from routinely monitored system operating data (e.g., vibration and acoustic signals, temperature, pressure, currents, voltages). The data-driven approaches are based on statistical and pattern classification techniques, ranging from multivariate statistical methods, linear and quadratic discriminants, partial least squares and canonical variate analysis, support vector machine and relevance vector machine regression, Gaussian processes, and graphical models (Bayesian networks, hidden Markov models) to black-box methods based on neural networks (e.g., multi-layer perceptrons, probabilistic neural networks, learning vector quantization), self-organizing feature maps, signal analysis (filters, auto-regressive models, etc.), and fuzzy rule-based systems.
Depending on the type of information used, the prognostic techniques may also be categorized into three types: (i) time-to-failure data-based, (ii) stressor-based, and (iii) degradation-based. Time-to-failure data-based methods use failure time data to estimate the lifetime of a component (e.g., Weibull analysis). Stressor-based methods consider the operating conditions, such as temperature, humidity, vibrations, load, input current and voltage. Degradation-based methods estimate and track the degradation parameters and predict when the total degradation (damage) exceeds a predefined threshold of functional failure. These degradation parameters are either measured directly from the system or obtained via a fusion of multiple parameters (Coble, 2010).
A survey of data-driven prognostics is provided by Schwabacher (2005). Si, Wang, Hu, and Zhou (2011) presented a detailed review of the statistical data-driven approaches. Gorjian, Ma, Mittinty, Yarlagadda, and Sun (2009a; 2009b) discussed the existing state-of-the-art literature based on covariate-based models, and the commonly used degradation models in reliability analysis. Wang and Vachtsevanos (1999) employed a dynamic wavelet neural network trained on vibration signals of defective bearings with varying crack depth and width to predict the crack evolution and to estimate their remaining useful life. Gebraeel and Lawley (2008) discussed a feedforward neural network based method for predicting the remaining useful life of ball bearings. Swanson (2001) proposed to use a Kalman filter to track the dynamics of the mode frequency of vibration signals in a tensioned steel band. Garga, Mcclintic, Campbell, Yang, and Lebold (2000) presented a signal analysis approach for prognostics of an industrial gearbox. The main features used include the root mean square value, kurtosis and wavelet magnitude of vibration data. Cox and Oakes (1984) developed proportional hazard models that merge both failure time data and stress data (vibration signals) to estimate the remaining useful life. Kumar, Torres, Chan, and Pecht (2008) described a hybrid prognostic framework utilizing both data-driven and physics-of-failure models to estimate the remaining useful life in electronic systems. The monitored parameters included fan speed, temperature and percentage of CPU utilization.
Most of the prognostic approaches in the literature consider one or two categories of data delineated in Fig. 1. For instance, the reliability-based methods use historical failure times (Type I data) to generate lifetime models. These methods can only provide an average estimate for component degradations based on historical failure time data. It is evident that a prognostic algorithm built solely on the historical data cannot provide an accurate estimate for component degradations because, typically, components operating in a harsh environment will fail more quickly than those operating in a mild environment. Hence, a prognostic algorithm should consider operating parameter data and condition monitoring data (periodic or time-series data) along with the historical data to provide an accurate estimate for component degradations so that an appropriate predictive maintenance action can be taken on an as-needed basis. Coble (2010) compared the performance of prognostic models based on reliability data and condition monitoring data and demonstrated that reliability data alone does not produce satisfactory results for prognostics. Pecht, Das, and Ramakrishnan (2002), Lall, Pecht, and Harkim (1997), and Vichare, Rodgers, Eveloy, and Pecht (2004) are a few other examples where it is emphasized that methods based on Type I data are insufficient for prognostics; these methods tend to either underestimate or overestimate the remaining useful life. Therefore, it is vital to take into consideration the current operating status of the components (i.e., Type III data) as well as the historical data (Type I and II data) to estimate component degradations with a good degree of accuracy.
In the literature, there are many methods developed exclusively for prognostics of a single component. These methods usually employ feature data collected over a period of time until the component fails to develop a prognostic model. More significantly, they are component-centric and primarily focus on predicting the remaining useful life of one particular component in isolation. The approach presented in this paper overcomes these two fundamental limitations by developing a prognostic framework that is capable of tracking degradations of multiple interacting components in a system. Also, the framework has the ability to incorporate data generated via a model-based approach, a sensor monitoring approach, or both.
The novel contributions of this paper are: (i) a unified framework combining failure time data, static parameter data and dynamic parameter data, (ii) the use of the Cox proportional hazards model to estimate the survival functions of error codes (tests) using failure time data and static parameter data, (iii) estimation of multiple component degradations that are coupled via observations (test outcomes) using a novel inference algorithm, and (iv) simulation results on degradation estimation of multiple components in an electronic throttle control (ETC) subsystem simulator and in a regenerative braking system in hybrid electric vehicles. Since coupled systems are common in aircraft, automobiles, power systems, nuclear energy systems, and indeed any networked embedded cyber-physical system, the proposed prognostic framework is readily applicable to these systems.

PROGNOSTIC FRAMEWORK
As shown in Fig. 2, the proposed prognostic framework consists of two phases: an offline training and validation ("model learning") phase, and an online testing ("deployment") phase.

Training and Validation Phases (offline module)
The training phase consists of two steps (see Fig. 2). In Step 1, Type I and Type II data are used to compute static data-modulated survival functions for components, error codes, symptoms and any observable test outcomes via the Cox proportional hazards model. Here, a mutual information algorithm (Duda, Hart, & Stork, 2000; Bishop, 1997) is used to select the most informative static parameters. The Cox proportional hazards model expresses the hazard function as
$$\phi_i(t, z) = \phi_0(t)\,\exp\left(a^{T} z\right) \tag{1}$$
where i denotes a component, diagnostic error code, symptom or any failure event of interest, z is a vector of covariates (Type II static data such as freeze frame data), a is a vector of regression parameters, and ϕ0(t) is the failure rate without any covariates (i.e., z = 0), i.e., it is the baseline hazard function. The baseline hazard function can be from any of the standard failure time distributions (e.g., exponential, Weibull, normal, log normal, Gamma, etc.) or it can be nonparametric. The baseline hazard function and the regression parameters are estimated via the maximum likelihood method (Klein & Moeschberger, 2003).
The survival functions of components, Ri(t,z), and the associated failure time density functions, fi(t,z), can be computed from the hazard function as
$$R_i(t,z) = \exp\left(-\int_0^t \phi_i(u,z)\,du\right), \qquad f_i(t,z) = \phi_i(t,z)\,\exp\left(-\int_0^t \phi_i(u,z)\,du\right) = \phi_i(t,z)\,R_i(t,z) \tag{2}$$
In Step 2, the survival functions corresponding to an event of interest i are grouped via clustering techniques, such as k-means, learning vector quantization (LVQ), Gaussian mixture models (GMM), or hierarchical clustering (Duda et al., 2000; Bishop, 1997). These clusters represent different usage profiles of the components, depending on the usage conditions, environmental factors, etc.
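The relations between the hazard, survival and density functions can be illustrated with a minimal numerical sketch. It assumes a Weibull baseline hazard (one of the parametric options mentioned above); the shape and scale values, the regression vector `a`, the covariate vectors, and all function names are illustrative assumptions, not values or code from the paper.

```python
import math

def cox_survival(t, z, a, shape=1.5, scale=100.0):
    """R(t, z) = exp(-H0(t) * exp(a . z)), with Weibull H0(t) = (t / scale) ** shape."""
    risk = math.exp(sum(ai * zi for ai, zi in zip(a, z)))
    return math.exp(-((t / scale) ** shape) * risk)

def cox_hazard(t, z, a, shape=1.5, scale=100.0):
    """phi(t, z) = phi0(t) * exp(a . z), the proportional hazards form."""
    phi0 = (shape / scale) * (t / scale) ** (shape - 1.0)
    return phi0 * math.exp(sum(ai * zi for ai, zi in zip(a, z)))

def cox_density(t, z, a, shape=1.5, scale=100.0):
    """f(t, z) = phi(t, z) * R(t, z)."""
    return cox_hazard(t, z, a, shape, scale) * cox_survival(t, z, a, shape, scale)

a = [0.8, -0.3]                               # illustrative regression parameters
times = [float(k) for k in range(0, 301, 50)]
mild = [cox_survival(t, [0.0, 0.0], a) for t in times]    # covariates at baseline
harsh = [cox_survival(t, [1.0, 0.0], a) for t in times]   # one stressor raised

for t, rm, rh in zip(times, mild, harsh):
    print(f"t={t:5.0f}  R_mild={rm:.3f}  R_harsh={rh:.3f}")
```

With a positive regression coefficient, the "harsh" covariate vector scales the hazard by a constant factor at every age, so its survival curve lies below the baseline curve throughout.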
Also, the probabilistic dependencies between the error codes (and other observables, i.e., symptoms) and the component failure modes are evaluated using a maximum likelihood (ML) estimation procedure. This dependency matrix captures the relationships between failure modes and error codes (or symptoms) and thus can be either binary (hard, 0 or 1) or probabilistic (soft). The probabilistic dependencies are in the form of a matrix of likelihoods of observing an error code or other observable given a failure mode, and are given by
$$P(o_j \mid FM_i) = \frac{n_{ij}}{n_i} \tag{3}$$
where oj is the error code j, nij is the number of times oj is associated with failure mode i (FMi) and ni is the total number of observed cases with FMi. In order to avoid the problem with ML estimates, viz., the possibility of having a zero probability because of an unseen combination of (oj, FMi) in the training data (the so-called "black swan" problem), Laplacian smoothing (Metzler, Lavrenko, & Croft, 2004) is used as follows:
$$P(o_j \mid FM_i) = \frac{n_{ij} + 1}{n_i + |FM|} \tag{4}$$
where |FM| is the number of system failure modes.
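The counting procedure for the soft dependency matrix can be sketched as follows. This is a minimal illustration: the training cases are made up, the function name `estimate_d_matrix` is hypothetical, and the smoothed form (n_ij + 1) / (n_i + |FM|) is an assumption based on the standard add-one rule and the definition of |FM| in the text.

```python
from collections import Counter

def estimate_d_matrix(cases, failure_modes, error_codes, smooth=True):
    """Estimate P(o_j | FM_i) from (failure_mode, observed_codes) training cases.

    smooth=True  -> (n_ij + 1) / (n_i + |FM|)   (assumed Laplacian form)
    smooth=False -> n_ij / n_i                  (plain ML estimate)
    """
    n_i = Counter(fm for fm, _ in cases)
    n_ij = Counter((fm, code) for fm, codes in cases for code in set(codes))
    num_fm = len(failure_modes)
    d = {}
    for fm in failure_modes:
        for code in error_codes:
            if smooth:
                d[fm, code] = (n_ij[fm, code] + 1) / (n_i[fm] + num_fm)
            else:
                d[fm, code] = n_ij[fm, code] / n_i[fm] if n_i[fm] else 0.0
    return d

# Made-up training cases; "DTC_X" is a placeholder code name.
cases = [
    ("RC2", {"P0121"}),
    ("RC2", {"P0121", "DTC_X"}),
    ("RC4", {"P0121"}),
    ("RC4", {"P0121"}),
]
failure_modes = ["RC2", "RC4"]
error_codes = ["P0121", "DTC_X"]

d_ml = estimate_d_matrix(cases, failure_modes, error_codes, smooth=False)
d_smooth = estimate_d_matrix(cases, failure_modes, error_codes, smooth=True)
print(d_ml["RC4", "DTC_X"], d_smooth["RC4", "DTC_X"])
```

The unseen pair (RC4, DTC_X) receives probability zero under the plain ML estimate and a small positive probability after smoothing, which is the behaviour the smoothing step is meant to provide.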
Another [...] 1986), such as the generalized likelihood ratio test, cumulative sum test, sequential probability ratio test, etc.) can be used to define thresholds to detect the presence of faults.
The tests that can detect a fault are represented as "1" in the corresponding row of the D-matrix; thus, each row corresponds to a "signature" for the fault associated with the row. Refer to Singh, Holland, and Bandyopadhyay (2010) and Kodali, Ponizovskaya-Devine, Robinson, Luchinsky, Bajwa, Khasin, Perotti, and Brown (2015) for various methods of generating dependency matrices.

Testing (Deployment) Phase (online module)
In the testing phase, when new feature data y(t) (Type III dynamic data) is obtained via online data acquisition systems, the survival probabilities of error codes (see Eq. (5)) are estimated using the Cox PHM as well as the baseline hazard functions obtained from the offline module (from Type I and Type II data). Once the component survival functions are obtained, the first and second order moments of the time-to-failure T of each component can be computed from the survival functions Ri(t,z) via Eqs. (6) and (7), respectively (Ma & Krings, 2008).
Alternately, the remaining useful life (RUL) of a component at any time t can be computed from the survival function by defining a threshold ε0 on the survival probability, where ε0 denotes the threshold for a functional failure.
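Both ways of summarizing a survival function can be sketched numerically. The sketch below relies on the standard identities E[T] = ∫ R(t) dt and E[T²] = 2 ∫ t R(t) dt for a nonnegative time-to-failure; the threshold-based RUL is taken here to be the time until the survival probability first falls to ε0 or below, which is an assumed reading of the threshold definition. The exponential survival curve and all function names are illustrative.

```python
import math

def trapezoid(ys, xs):
    """Trapezoidal-rule integral of samples ys over the grid xs."""
    return sum((ys[k] + ys[k + 1]) * (xs[k + 1] - xs[k]) / 2.0
               for k in range(len(xs) - 1))

def ttf_moments(times, surv):
    """Mean and variance of time-to-failure T from a sampled survival function."""
    mean = trapezoid(surv, times)
    second = 2.0 * trapezoid([t * r for t, r in zip(times, surv)], times)
    return mean, second - mean ** 2

def rul_from_threshold(times, surv, t_now, eps0):
    """Time from t_now until the survival probability first drops to eps0 or below."""
    for t, r in zip(times, surv):
        if t >= t_now and r <= eps0:
            return t - t_now
    return None  # threshold not reached within the sampled horizon

# Illustrative exponential survival with rate 0.01 (mean 100, variance 10000).
times = [0.5 * k for k in range(4001)]            # grid from 0 to 2000
surv = [math.exp(-0.01 * t) for t in times]

mean, var = ttf_moments(times, surv)
rul = rul_from_threshold(times, surv, t_now=20.0, eps0=0.5)
print(round(mean, 2), round(var, 1), rul)
```

The integration horizon must extend far enough for the survival function to be close to zero; otherwise the moment estimates are biased low.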

SOFT DYNAMIC MULTIPLE FAULT DIAGNOSIS
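The inference problem can be formulated as one of finding the maximum a posteriori (MAP) configuration:
$$\hat{X}^{K} = \arg\max_{X^{K}} \Pr\left(X^{K} \mid O^{K}, x(0)\right)$$
where X^K = {x(1), ..., x(K)} is the sequence of component fault states over K sampling epochs, O^K = {o(1), ..., o(K)} is the corresponding sequence of observed test outcomes, and x(0) is the initial state. The solution is a primal-dual optimization framework that employs a Lagrangian relaxation method to decompose the original soft DMFD problem into parallel decoupled subproblems, one for each fault.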
Each subproblem corresponds to finding the optimal fault-state sequence, which is solved using a soft Viterbi decoding algorithm. The subproblems are coordinated by updating the Lagrange multipliers using a subgradient method. The inputs to the algorithm are the test outcomes o(k) at each sampling time k, their reliabilities, and a fault diagnostic matrix (D-matrix) that captures the relationships between failure sources and tests (see Eq. (4)). The test outcomes could be statistical test decisions derived from sensor data. These test outcomes, together with the fault-dependency matrix, are processed through a primal-dual optimization method that exploits temporal correlations of test outcomes over time for inference. A detailed description of soft DMFD algorithms can be found in our previous work (Sankavaram, Kodali, Pattipati, Wang, Azam, & Singh, 2011).
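To give a feel for the per-fault subproblem, the following is a sketch of ordinary log-domain Viterbi decoding for a single two-state (healthy/faulty) chain. It is not the authors' soft Viterbi algorithm and omits the Lagrangian coordination across faults; the transition probabilities, the way a soft test outcome q is turned into state likelihoods, and all names are assumptions made for illustration.

```python
import math

def viterbi_two_state(log_lik, p_appear=0.05, p_vanish=0.01, p_init_fault=0.01):
    """Most likely healthy(0)/faulty(1) sequence for one fault.

    log_lik[k][s] is the log-likelihood of the observations at epoch k given state s.
    """
    log_trans = [
        [math.log(1.0 - p_appear), math.log(p_appear)],    # from healthy
        [math.log(p_vanish), math.log(1.0 - p_vanish)],    # from faulty
    ]
    score = [math.log(1.0 - p_init_fault) + log_lik[0][0],
             math.log(p_init_fault) + log_lik[0][1]]
    back = []
    for k in range(1, len(log_lik)):
        new_score, pointers = [], []
        for s in (0, 1):
            best_prev = max((0, 1), key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + log_lik[k][s])
        score = new_score
        back.append(pointers)
    # Backtrack from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Soft outcomes q(k): probability that the test has failed at epoch k (made-up values).
pd, pf = 0.9, 0.05                      # assumed detection / false-alarm probabilities
soft = [0.05, 0.1, 0.2, 0.7, 0.9, 0.95]
log_lik = [[math.log(q * pf + (1 - q) * (1 - pf)),      # given healthy
            math.log(q * pd + (1 - q) * (1 - pd))]      # given faulty
           for q in soft]
path = viterbi_two_state(log_lik)
print(path)
```

The decoder trades off the cost of switching state against the evidence from each epoch, so the fault is declared only once the soft outcomes consistently favour the faulty state.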

EXPERIMENTAL RESULTS
The prognostic framework is applied to datasets derived from two automotive systems, namely, an electronic throttle control (ETC) subsystem simulator and a regenerative braking system (RBS). The experimental results are briefly discussed in the following subsections.

Application to Electronic Throttle Control System
In the first application, the prognostic approach is applied to a dataset derived from an automotive ETC subsystem simulator. The function of an ETC subsystem is to determine the necessary throttle opening using sensors (such as the accelerator pedal position, the engine RPM, and the vehicle speed) and drive the actuator to obtain the required throttle position via a closed-loop control algorithm in the engine control module (ECM). The ECM also monitors the health of the engine subsystem by processing parameter identifier data (PIDs) collected from various sensors and generates diagnostic trouble codes (DTCs or error codes) when a failure occurs in any component. Refer to the Appendix for a detailed description of DTCs and repair codes (RCs) in an ETC subsystem.
The dataset derived from the ETC simulator consisted of 11 error codes (DTCs), 479 status parameters (PIDs) collected at the time of DTC firing, accelerated age of the vehicle and the repair/replacement actions (i.e., repair codes (RCs)) performed on the system.A total of five different repair codes (replaceable components) are present in our training data.Here, the age of the vehicle at the time of repair/replacement action (i.e., the failure time of the replaced component) is the Type I data and the set of 479 PIDs collected at the time of DTC occurrence forms the Type II data.
On the available Type II data, the information gain (IG, or mutual information) algorithm is employed to rank-order the PIDs, and an optimal number of PIDs is then selected for the analysis (Duda et al., 2000; Bishop, 1997). The idea of the IG algorithm is to evaluate the amount of information contributed by each feature to a particular class; the subset of features with high information content is used for analysis. In the current application, the IG algorithm is employed on the 479 status parameters, and the top 16 PIDs are selected based on the failure probability estimation accuracy on the test data. Fig. 3 shows the value of mutual information for each of the status parameters; the top features are circled in red. The Type I and Type II data are of dimension β × 1 and β × 16, respectively, where β is the total number of observed failure cases.
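The ranking step can be sketched for discrete-valued features as follows. This is a minimal illustration with made-up data and hypothetical names (`pid_a`, `pid_b`, `pid_c`); continuous PIDs would first need to be discretized (binned), which is not shown.

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """I(X; Y) in bits for two equally long sequences of discrete values."""
    n = len(labels)
    p_x, p_y = Counter(feature), Counter(labels)
    p_xy = Counter(zip(feature, labels))
    return sum(
        (c / n) * math.log2((c / n) / ((p_x[x] / n) * (p_y[y] / n)))
        for (x, y), c in p_xy.items()
    )

def rank_features(columns, labels, top_k):
    """Return the top_k feature names by mutual information, plus all scores."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores

# Made-up example: four failure cases, three binary status parameters.
labels = ["RC2", "RC2", "RC4", "RC4"]
columns = {
    "pid_a": [0, 0, 1, 1],   # perfectly separates the two repair codes
    "pid_b": [0, 1, 0, 1],   # independent of the repair code
    "pid_c": [0, 0, 0, 1],   # partially informative
}
top, scores = rank_features(columns, labels, top_k=2)
print(top, {k: round(v, 3) for k, v in scores.items()})
```

In practice the cut-off (16 PIDs in the application above) is chosen by checking estimation accuracy on held-out data rather than from the scores alone.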
The fault-test dependency matrix (D-matrix) between DTCs and RCs is learned via maximum likelihood estimation of probabilities (see Eqs. (3) and (4)), and is later used in the inference algorithm for inferring the failing components. Table 1 shows the diagnostic matrix with 5 RCs and 11 DTCs. The diagnostic matrix represents fault signatures for component faults; these components are coupled via observations. For instance, when any of the components RC2, RC4 and RC5 is faulty, the fault will generate a set of observations that will fail the test P0121 (i.e., the test outcome is set to 1, a failed test outcome). It is evident that RC1 and RC3 are ambiguous, and hence they are grouped into a single repair code throughout our analysis (RC1/RC3). Also, the fault signatures of components RC1/RC3, RC2, and RC5 are hidden within the fault signature of RC4 and hence are termed hidden faults.
As mentioned in Section 3, the survival functions for components and tests are initially learned using the Cox PHM. Then, a k-means clustering technique is employed to group the survival functions of RCs as well as DTCs. Here, there appear to be 3 clusters of survival functions [...] and 6, respectively. When the number of clusters was increased from 3 to 5 and 6, there was no significant improvement in terms of failure probability estimation. Hence, for this work, the number of clusters for the DTCs and the RCs was selected to be 3.
In order to validate the prognostic framework, the detection and false alarm probabilities of tests are initially learned from the averaged DTC survival function and the averaged RC survival function by minimizing the objective function derived from a noisy OR model (Singh et al., 2009b) given in Eq. (10).
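For readers unfamiliar with the noisy OR model, one common way to combine component failure probabilities into a soft test outcome is sketched below. This is an assumed form for illustration only and may differ from the expressions used in Eqs. (10) and (11); the function name and the numerical values are hypothetical.

```python
def soft_test_outcome(p_fail, pd_col, pf_col):
    """Failure probability of one test under a noisy-OR combination (assumed form).

    p_fail[i] : failure probability of component i (1 - survival probability)
    pd_col[i] : detection probability of the test for component i
    pf_col[i] : false-alarm probability of the test attributed to component i
    """
    pass_prob = 1.0
    for p, pd, pf in zip(p_fail, pd_col, pf_col):
        # Component i triggers the test with probability pd if failed, pf if healthy.
        pass_prob *= 1.0 - (pd * p + pf * (1.0 - p))
    return 1.0 - pass_prob

pd_col = [0.9, 0.8]      # made-up detection probabilities
pf_col = [0.01, 0.02]    # made-up false-alarm probabilities

all_healthy = soft_test_outcome([0.0, 0.0], pd_col, pf_col)
first_failed = soft_test_outcome([1.0, 0.0], pd_col, pf_col)
print(round(all_healthy, 4), round(first_failed, 4))
```

With hard (0 or 1) component states this reduces to the usual noisy OR: the test passes only if every component independently fails to trigger it.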
A nonlinear least squares estimation technique is implemented to solve the problem in Eq. (10) using MATLAB's fmincon function from the Optimization Toolbox to determine the optimal parameters {pdij, pfij}. Once the parameters are learned, continuous test outcomes are generated from each of the RC survival function clusters using Eq. (11). These soft test outcomes (equivalent to the test outcomes obtained from the Type III data in the online phase) are used as input to the soft dynamic multiple fault diagnosis algorithm [...] (Cluster 2), and RC4 (Cluster 1), respectively. In the first two cases, i.e., Figs. 7 and 8, the R-square fit is 94%, whereas in Fig. 9 the R-square fit is only 78%. Although the R-square fit is low in Fig. 9, at around 400 on the (accelerated) age axis, the estimated failure probability is higher than the actual component fault probability. This suggests that the algorithm can predict the failing component before the actual component reaches the failing threshold, which is expected of an effective prognostic algorithm. In addition to RC4 in Fig. 9, the failure probability of RC2 is also significant compared to other repair codes; this is because RC2 is a hidden fault of RC4 and the algorithm inferred RC2 as a failing component with a failure probability below 0.5. Fig. 10 shows the mean square error (MSE) in the estimation of failure probabilities. To demonstrate the validity of the proposed prognostic framework, the component survival functions prior to clustering are used for further validation. Fig. 11 shows the component degradation curves prior to clustering. Using the learned parameters {pdij, pfij}, the continuous test outcomes are generated for each of the RC survival functions using Eq. (11). These soft test outcomes are used as input to the inference algorithm to infer the component degradations. The component degradations are estimated with an R-square fit of about 90%, except for RC4 (see Fig. 12).
The performance of the soft DMFD algorithm is also compared with some state-of-the-art data-driven prognostic techniques, including support vector machine regression (SVMR), relevance vector machine (RVM), and Gaussian process (GP) regression (Goebel, Saha, & Saxena, 2008). To train these regression techniques, 5x2 cross validation is employed, i.e., 50% of the data for each of the repair codes is randomly chosen for training the prognostic algorithm, which is validated using the remaining 50% of the data, and vice versa. The process is repeated five times, and the average performance in terms of R-square and MSE is computed. Figures 12 and 13 show the performance statistics of all the prognostic techniques in comparison. The soft DMFD inference technique performed better than the other inference techniques. The average R-square with RVM was 85%, whereas the other techniques, SVMR and GP, had R-square fits of about 78% and 75%, respectively. The better performance of the soft DMFD algorithm could be attributed to the Bayesian inference framework maximizing the MAP objective function and to the soft Viterbi algorithm for tracking the evolution of coupled component fault states that best explains the observed test outcomes over time. Similar results were observed in terms of MSE performance (Fig. 13).
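The evaluation protocol can be sketched as follows: R-square and MSE between actual and estimated failure probabilities, and a 5x2 cross-validation index generator. Function names and the seed are hypothetical, and the sketch splits a single index range rather than stratifying by repair code as described above.

```python
import random

def r_square(actual, estimated):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def mse(actual, estimated):
    """Mean square error between two equally long sequences."""
    return sum((a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual)

def five_by_two_splits(n, seed=0):
    """Yield 10 (train, test) index pairs: 5 random halvings, each used both ways."""
    rng = random.Random(seed)
    for _ in range(5):
        idx = list(range(n))
        rng.shuffle(idx)
        first, second = idx[: n // 2], idx[n // 2:]
        yield first, second
        yield second, first

actual = [0.0, 0.1, 0.4, 0.8, 1.0]
estimated = [0.05, 0.1, 0.35, 0.85, 0.95]
print(round(r_square(actual, estimated), 3), round(mse(actual, estimated), 4))
print(len(list(five_by_two_splits(10))))
```

Averaging the two metrics over the ten folds gives the kind of summary statistics reported for each technique.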

Application to Regenerative Braking System
In this application, the prognostic framework described in Section 3 is applied to estimate the sensor and parameter degradations in a regenerative braking system (RBS). The RBS model with a series-parallel drivetrain configuration (Ehsani, Gao, & Emadi, 2009) [...] supervisory controller making the high-level decisions that affect the general state of the powertrain (e.g., engine on/off) and the operating mode of the vehicle (e.g., propelling, regenerative braking, etc.), and accordingly delivering the torque requests to the component controller. Subsequently, the component controller converts these torque requests into component commands. These commands are, in turn, treated as the actuator commands by the individual components in the powertrain model to achieve the requested torque and, consequently, report the system status (e.g., engine speed, battery state of charge) to the supervisory controller. The powertrain model comprises all the components that mimic the behavior of hardware components, such as the engine, the battery, and the motor. Fig. 15 shows the individual component blocks in the powertrain model (with a series-parallel drivetrain configuration). The detailed model description and mathematical details are presented in Sankavaram, Pattipati, Pattipati, Zhang, and Howell (2014).
To demonstrate the framework, two faults are considered: a motor speed sensor fault and a wheel inertia fault. These faults are injected into the model as additive biases on the measured signals. For instance, a 10% deviation in the motor speed is used to model the motor speed sensor fault. Similarly, the parametric fault, i.e., the wheel inertia fault, is simulated as a 10% deviation from its nominal value. Mathematically, the fault scenarios are simulated using the following equation:
$$X_{faulty} = X_{nominal}\,(1 + \Delta) \tag{12}$$
In Eq. (12), Xfaulty is the parameter value under the faulty condition, Xnominal is the nominal parameter value and Δ is the fractional deviation from the nominal value (the fault severity level). The component degradation (fault evolution) is assumed to follow a sigmoid (S-shaped) curve, as shown in Fig. 16, with degradation progressing gradually from a low severity level (2%) to a high severity level (10%) and eventually leading to a complete failure. Fig. 16 illustrates the fault progression from nominal to failure. Here, the 10% severity level is treated as the component failure, i.e., the component has failed and is not considered to be operational at this point.
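The fault injection scheme can be sketched as follows, assuming that Eq. (12) has the multiplicative form X_faulty = X_nominal (1 + Δ) and that Δ follows a logistic curve saturating at the 10% failure level. The midpoint, rate, nominal value and function names are illustrative assumptions, not parameters from the simulation model.

```python
import math

def sigmoid_severity(t, t_mid, rate, max_severity=0.10):
    """S-shaped severity profile rising from near 0 toward max_severity."""
    return max_severity / (1.0 + math.exp(-rate * (t - t_mid)))

def inject_fault(x_nominal, delta):
    """Assumed form of Eq. (12): X_faulty = X_nominal * (1 + delta)."""
    return x_nominal * (1.0 + delta)

# Illustrative fault progression over an arbitrary age axis.
ages = list(range(0, 101, 10))
severity = [sigmoid_severity(t, t_mid=50.0, rate=0.12) for t in ages]
nominal_speed = 1500.0                       # made-up nominal motor speed reading
faulty = [inject_fault(nominal_speed, d) for d in severity]

for t, d, x in zip(ages, severity, faulty):
    print(f"age={t:3d}  severity={100 * d:5.2f}%  reading={x:8.2f}")
```

Sampling the severity curve at 2%, 4%, ..., 10% then yields the discrete severity levels at which feature data are collected.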
As a first step, the simulated failure time data (Type I data) for sensor and parametric faults is considered, as shown in Table 3. Since the component failure is assumed to occur at the 10% severity level, the corresponding feature data, i.e., the monitored variables, is treated as the static parameter data [...]
To establish the validity of the proposed approach, Gaussian measurement noise with a random seed is added to the monitored signals to generate additional patterns for the two faults considered in this application. The variance of the added noise is 0.6% of the square of the magnitude of the signals; this corresponds to a signal-to-noise ratio (SNR) of 22.2 dB. A total of 10 patterns are generated for each fault. Fig. 24 shows the evolution of component degradations for the motor speed sensor and wheel inertia faults. Here, the evolution of the faults is assumed to be of sigmoid shape (monotonically increasing); other forms of fault evolution will be studied as part of our future research. The dynamic feature data, i.e., the feature data corresponding to the 2%, 4%, …, 10% fault severity levels, are used along with the learned baseline survival functions to generate test probabilities (see Eq. (2)). When these test probabilities are fed to the soft DMFD algorithm, the component degradations are estimated with an R-square fit of about 94%. Table 6 shows the average R-square and MSE in estimating the failure probabilities of the motor speed sensor and wheel inertia faults.
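The quoted SNR follows directly from the noise variance: a noise power of 0.6% of the signal power gives 10 log10(1/0.006) ≈ 22.2 dB. The sketch below checks this and generates noisy patterns; scaling the noise per sample by the signal magnitude is an assumed interpretation, and the function names, seed values and signal are hypothetical.

```python
import math
import random

def snr_db(var_fraction):
    """SNR in dB when the noise variance is var_fraction times the signal power."""
    return 10.0 * math.log10(1.0 / var_fraction)

def add_measurement_noise(signal, var_fraction=0.006, seed=0):
    """Add zero-mean Gaussian noise with variance var_fraction * x**2 per sample."""
    rng = random.Random(seed)
    std_scale = math.sqrt(var_fraction)
    return [x + rng.gauss(0.0, std_scale * abs(x)) for x in signal]

signal = [100.0 + 5.0 * math.sin(0.1 * k) for k in range(200)]   # made-up signal
patterns = [add_measurement_noise(signal, seed=s) for s in range(10)]

print(round(snr_db(0.006), 1))
```

Using a different seed for each pattern yields the ten noisy realizations per fault while keeping the experiment reproducible.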

CONCLUSIONS
The paper presented a novel approach to the fault prognosis problem in coupled systems by combining three types of data: failure time data, static environmental and status parameter data, and dynamic data. The framework employed the Cox PHM to infer the survival functions of components and subsequently estimated the component degradations via the soft dynamic multiple fault diagnosis algorithm. The framework was applied to two different automotive applications to infer the component degradations (complementary survival functions), and the inference algorithm estimated the component failure probabilities with a good R-square fit.
Future work should include the application of this approach to continuous parameter identifier data and should account for the uncertainty in RUL estimation. Another extension of the Cox PHM framework for prognosis is to model coupled survival dynamics as monotone positive linear systems or monotone Markov processes (Zaslavski, 1984; Zaslavski, 1987). In monotone positive linear systems, the state matrix is a Metzler matrix (i.e., it has nonnegative off-diagonal elements), which ensures that the state variables (in our case, survival functions) remain nonnegative. In monotone Markov processes, the state generator matrix is upper triangular. Investigating these concepts in the context of prognosis to estimate survival functions would be a novel extension of the proposed prognosis framework.
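To make the Metzler property concrete, the sketch below checks whether a matrix has nonnegative off-diagonal elements and propagates a linear system dx/dt = A x with forward-Euler steps, confirming that the state stays nonnegative. The two-component matrix, the step size, and the function names are hypothetical and are not taken from the paper; the Euler scheme is used only for illustration and preserves nonnegativity here because the step is small relative to the diagonal entries.

```python
def is_metzler(a):
    """True if every off-diagonal entry of the square matrix `a` is nonnegative."""
    n = len(a)
    return all(a[i][j] >= 0.0 for i in range(n) for j in range(n) if i != j)

def euler_step(a, x, dt):
    """One forward-Euler step of the linear system dx/dt = A x."""
    n = len(x)
    return [x[i] + dt * sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]

# Hypothetical coupled two-component system, both states starting at 1.
A = [[-0.02, 0.0],
     [0.01, -0.05]]
x = [1.0, 1.0]
trajectory = [x]
for _ in range(500):
    x = euler_step(A, x, dt=0.1)
    trajectory.append(x)
```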

Figure 2. Cox PHM-based Approach to Prognosis of Coupled Systems

condition monitoring data (periodic or time-series data) along with the historical data to provide an accurate estimate of component degradations so that an appropriate predictive maintenance action can be taken on an as-needed basis. Coble (2010) compared the performance of prognostic models based on reliability data and condition monitoring data and demonstrated that reliability data alone do not yield satisfactory results for prognostics. Pecht, Das, and Ramakrishnan (2002), Lall, Pecht, and Harkim (1997), and Vichare, Rodgers, Eveloy, and Pecht (2004) are a few other examples where it is emphasized that methods based on Type I data are insufficient for prognostics; these methods tend to either underestimate or overestimate the remaining useful life. Therefore, it is vital to take into consideration the current operating status of the components (i.e., Type III data) as well as the historical data (Type I and II data) to estimate component degradations with a good degree of accuracy.
to error codes (tests, symptoms). These soft error code outcomes are used to infer the failing components via the soft DMFD based on the D-matrix. The soft DMFD determines the evolution of fault states (complementary survival functions) given the soft error code outcomes at the observed time t. A brief explanation of the soft DMFD algorithm is presented in Section 4.
Soft dynamic multiple fault diagnosis (DMFD) is a factorial HMM-based inference algorithm that determines the evolution of component fault states from the observed soft evidence on the test states. Formally, the soft DMFD problem is defined over a set of components (failure sources) associated with the system. The state of component s_i is denoted by x_i, where x_i(k) = 1 if s_i is faulty and x_i(k) = 0 otherwise. The status of all components at epoch k forms the system state at that epoch. The set of available tests is defined such that γ_j(k) = 1 implies that test γ_j is in a failed state at time epoch k, and γ_j(k) = 0 otherwise. Here, {0, 1, …, K} is the set of discretized observation epochs. The observations O = {o_1, o_2, …, o_n} constitute the soft evidence on the test states. For each component state, e.g., for component s_i at epoch k, A = (Pa_i(k), Pv_i(k)) denotes the pair of fault appearance probability and fault disappearance probability. P = (Pd_ij, Pf_ij) represents the probability of detection and the probability of false alarm associated with test outcome j and fault class i.
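The roles of the appearance/disappearance probabilities and the detection/false-alarm probabilities can be illustrated with a much simplified filter for a single component and a single test. This is not the soft DMFD algorithm itself, which performs joint inference over multiple coupled components; it is a two-state forward recursion written under the assumption that soft evidence enters as the probability that the test is in a failed state. All names and numerical values are hypothetical.

```python
def predict(p_faulty, pa, pv):
    """Propagate the fault probability one epoch using the appearance
    probability `pa` and the disappearance probability `pv`."""
    return p_faulty * (1.0 - pv) + (1.0 - p_faulty) * pa

def update(p_faulty, soft_fail, p_detect, p_false):
    """Bayes update with soft evidence `soft_fail` (probability that the
    test has failed), detection probability and false-alarm probability."""
    like_faulty = soft_fail * p_detect + (1.0 - soft_fail) * (1.0 - p_detect)
    like_healthy = soft_fail * p_false + (1.0 - soft_fail) * (1.0 - p_false)
    num = like_faulty * p_faulty
    den = num + like_healthy * (1.0 - p_faulty)
    return num / den if den > 0.0 else p_faulty

def filter_fault_probability(soft_evidence, p0, pa, pv, p_detect, p_false):
    """Run the predict/update recursion over a sequence of soft test outcomes."""
    p, trajectory = p0, []
    for e in soft_evidence:
        p = update(predict(p, pa, pv), e, p_detect, p_false)
        trajectory.append(p)
    return trajectory

# Hypothetical soft test outcomes that strengthen over three epochs.
trajectory = filter_fault_probability(
    [0.1, 0.5, 0.9], p0=0.01, pa=0.05, pv=0.0, p_detect=0.9, p_false=0.05)
```

Setting the disappearance probability to zero, as in this example, models a permanent fault that cannot heal, which is consistent with a monotonically increasing degradation.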
Fig. 4 shows the original DTC survival functions. The different colors represent different clusters. The averaged survival function clusters for DTC P1682 and RC2 are shown in Figs. 5

Figure 3. Plot of Mutual Information for 479 Status Parameters

Figure 11. Component Degradation Curves

Figure 14. Functional Flow Diagram of Regenerative Braking System

There are 25 signals monitored in the RBS, including (a) sensor signals, such as temperature, speed, and current measurements from the hardware components in the powertrain model; (b) motor, wheel, and engine torque demands sent from the powertrain controller to the component controllers; and (c) component commands sent from the individual ECUs to the hardware components in the powertrain model. The list of monitored signals is provided in Table 2.

S1: Battery State of Charge; S2: Motor2 Torque Demand; S3: Wheel Torque Demand; S4: Motor1 Torque Demand; S5: Engine Torque Demand; S6: Battery Temperature; S7: Battery Current; S8: Driver Torque Demand; S9: Motor1 Command; S10: Gearbox Speed; S11: Wheel Input Speed; S12: Wheel Output Speed; S13: Wheel Torque; S14: Vehicle Linear Speed; S15: Motor1 Speed; S16: Motor1 Current; S17: Clutch Input Speed; S18: Engine Command; S19: Motor2 Command; S20: Motor2 Speed; S21: Motor2 Current; S22: Engine Speed; S23: Clutch Output Speed; S24: Mechanical Accessory Torque; S25: Wheel Command

Figure 15. Vehicle Powertrain Model with Series-Parallel Configuration

fractional change in the parameter value (fault severity). The simulation data (monitored signals) thus obtained are used in the prognostic framework for estimating the component degradations.
(features at the time of failure), i.e., Type II data. This feature data, corresponding to the 10% fault severity level, is obtained via simulation-based fault injection experiments. During the offline training phase, the failure time data and static parameter data are used to learn the component survival functions using the Cox PHM (see Eq. (1)). Figures 17 and 18 show the baseline survival function and the complementary survival functions (component degradation curves) for the motor speed sensor (F2) and wheel inertia (F3) faults.
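The relationship between the baseline survival function and a component degradation curve can be sketched with the standard Cox proportional hazards form, S(t | z) = S0(t)^exp(β·z), where the degradation curve is the complementary survival function 1 − S(t | z). The code below is a minimal illustration of that formula only; the baseline curve, the coefficients β, and the covariate vector z are hypothetical placeholders rather than values learned from the paper's data.

```python
import math

def cox_survival(baseline_survival, beta, z):
    """Survival curve under the Cox PHM: S(t|z) = S0(t) ** exp(beta . z)."""
    risk = math.exp(sum(b * zi for b, zi in zip(beta, z)))
    return [s0 ** risk for s0 in baseline_survival]

def degradation_curve(survival):
    """Complementary survival function (failure probability over time)."""
    return [1.0 - s for s in survival]

# Hypothetical baseline survival sampled on a coarse time grid.
times = list(range(0, 201, 20))
baseline = [math.exp(-((t / 100.0) ** 2)) for t in times]

beta = [0.8, -0.3]   # hypothetical regression coefficients
z = [1.0, 0.5]       # hypothetical static parameter values (Type II data)
survival = cox_survival(baseline, beta, z)
degradation = degradation_curve(survival)
```

With a positive risk score β·z, the survival curve lies below the baseline, so the component degrades faster than the baseline population.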

Figure 16. An Illustration of Failure Progression from Nominal to Faulty

Figure 17. Baseline Survival Function of Components

Figure 21. Steps Involved in Testing Phase (Online Phase)

Figure 22. Degradation Curve Estimation for Motor Speed Sensor Fault

Figures 22 and 23 show the estimated component failure probabilities for the motor speed sensor and wheel inertia faults, respectively. The estimated component degradations are in good agreement with the truth, with an R-square fit of about 96%. An estimate of the RUL at any time t can be obtained by defining a threshold on the failure probability.
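The threshold-based RUL estimate mentioned above can be sketched as follows: find the first time at which the estimated failure probability reaches the threshold and subtract the current time. The linear interpolation between samples, the function names, and the example curve are assumptions made for illustration; the paper does not specify how the crossing time is computed.

```python
def time_to_threshold(times, failure_prob, threshold):
    """First time the failure probability reaches `threshold`, using linear
    interpolation between samples; None if the threshold is never reached."""
    for k in range(len(times)):
        if failure_prob[k] >= threshold:
            if k == 0:
                return times[0]
            t0, t1 = times[k - 1], times[k]
            p0, p1 = failure_prob[k - 1], failure_prob[k]
            return t0 + (threshold - p0) * (t1 - t0) / (p1 - p0)
    return None

def remaining_useful_life(t_now, times, failure_prob, threshold):
    """RUL at `t_now`: time left until the threshold crossing (never negative)."""
    t_fail = time_to_threshold(times, failure_prob, threshold)
    if t_fail is None:
        return None
    return max(0.0, t_fail - t_now)

# Hypothetical estimated failure-probability curve.
times = [0.0, 10.0, 20.0, 30.0]
failure_prob = [0.0, 0.2, 0.6, 0.9]
rul = remaining_useful_life(5.0, times, failure_prob, threshold=0.5)
```

Returning None when the curve never reaches the threshold makes it explicit that no RUL estimate is available within the observed horizon.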

Figure 23. Degradation Curve Estimation for Wheel Inertia Fault

) is employed to select a minimum number of Type II static parameters for the Cox proportional hazards model.
way to generate the fault-test dependency matrix is via a model-based approach. The model-based methods employ residuals as features, where the residuals are the deviations of the actual sensor measurements from the expected ones. The residuals can be generated, for instance, via parameter estimation, observers, parity relations, or simulation-based fault injection experiments (Ciccarella, Dalla Mora, & Germani, 1993; Bar-Shalom, Li, & Kirubarajan, 2001; Gertler, 1997). The residuals thus generated are used to devise tests and to extract fault-test relationships. Statistical hypothesis testing techniques (e.g., change detection techniques (Basseville & Benveniste

Table 2. List of Monitored Signals

Table 3. Simulated Failure Time Data for Components

International Journal of Prognostics and Health Management, ISSN 2153-2648, 2016 009

Table 6. Performance Results

Table 6 also shows a comparison of the soft-DMFD performance with other data-driven prognostic techniques. The soft-DMFD inference technique performed better than the other techniques, with a 94% R-square fit. RVM was the next best technique, with an average R-square value of 83%, whereas SVMR and GP had R-square fits of about 75% and 68%, respectively. Similar results were observed in terms of MSE.
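For reference, the two performance measures used in the comparison can be computed as sketched below. The paper does not state its exact definitions, so this assumes the usual coefficient of determination, R² = 1 − SS_res/SS_tot, and the mean squared error; the example vectors are hypothetical and are not results from the paper.

```python
def mse(y_true, y_pred):
    """Mean squared error between true and estimated failure probabilities."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def r_square(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical true and estimated failure probabilities.
truth = [0.0, 0.5, 1.0]
estimate = [0.1, 0.5, 0.9]
fit = r_square(truth, estimate)
error = mse(truth, estimate)
```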