A New Adaptive Prognostics Approach Based on Hybrid Feature Selection with Application to Point Machine Monitoring

This paper proposes a new adaptive prognostics approach consisting of hybrid feature selection and remaining-useful-life (RUL) estimation steps for railway point machines. In step-1, different time-domain features are extracted and the best ones are selected by the hybrid feature selection method. Then, a degradation model is fitted to each of the selected features and the parameters are estimated. In step-2, the RUL of the component is predicted by using the proposed adaptive prognostics approach. The adaptive prognostics is based on the weighted likelihood combination of the estimated model parameters. The model parameters, each of which is estimated by curve fitting, are used in the calculation of the likelihood probability weights. Then, an adaptive degradation model is built by using the weighted combination of the model parameter estimates and the component RUL is estimated. The proposed approach is validated on in-field point machine sliding-chair degradation data and the results are discussed.

A railway turnout system, which consists of sliding-chair plates, a point machine, stock rails and locking systems, is used to control train turnouts at a distance (Eker et al. 2011). In the literature, point machine failure diagnostics (Atamuradov, Medjaher, Lamoureux, Dersin, & Zerhouni, 2017; García Márquez, Roberts, & Tobias, 2010) and prognostics (Letot et al., 2015) have been studied extensively. However, many problems still need to be studied to increase the accuracy while minimizing the uncertainty in RUL prediction. One of the key steps in the development of robust and accurate fault prognostics is the selection of good prognostics features.
In the literature, feature evaluation and selection techniques are classified as a) inherent: which use ranking metrics to filter out the least interesting features (e.g. trendability, monotonicity (J.B. Coble, 2010) and separability (Camci, Medjaher, Zerhouni, & Nectoux, 2013), etc.), b) consistent: which filter out the least correlated features from the given feature population, and c) hybrid: which combine inherent and/or consistent techniques (Lei et al., 2018). The authors in (Javed, Gouriveau, Zerhouni, & Nectoux, 2015) proposed an inherent feature selection technique to increase the prognostics accuracy. In (Liao, 2014), an inherent feature evaluation metric was integrated with a genetic algorithm (GA) to discover good prognostics features for RUL prediction. The authors in (J. Coble & Hines, 2009) developed a hybrid feature selection technique for prognostics based on the linearly weighted combination of the inherent and consistent techniques. In this technique, the weights were optimized by using the GA. However, despite their good optimization performance, feature selection techniques based on heuristic algorithms, e.g. GA, might be computationally expensive, particularly if there is a large number of feature samples. Hence, the development of computationally efficient feature selection techniques is necessary to improve the failure prognostics accuracy. Then, the selected prognostics features can be used to train the prognostics tools for RUL prediction.
Failure prognostics can be defined as the process of predicting the remaining time (RUL) until a component will no longer perform a particular function. The authors in (Omer F Eker & Camci, 2013; Omer Faruk Eker et al., 2011) developed a state-duration-based prognostics approach for point machine monitoring. The developed approach gave better RUL prediction results when compared with different prognostics tools. A data-driven failure prognostics model based on power signals was proposed by (Letot et al., 2015) to predict the RUL in point machine monitoring. A similar data-driven prognostics approach based on a Bayesian parameter update was also proposed in (Ashasi-Sorkhabi et al., 2017) for train gearbox monitoring using vibration signals. A failure of a train braking system was studied in (Lee, 2017), where the authors developed an air leakage detection and prediction approach based on density-based clustering and a logistic function. Since prognostics approaches deal with the prediction of the future component health states, the uncertainties in system parameters, nominal system model, degradation model, RUL prediction, and failure threshold should be well quantified in component health assessment (Atamuradov, Medjaher, Dersin, et al. 2017; Sankararaman & Goebel, 2015).
To fill the aforementioned gaps in the literature, this paper proposes a new adaptive prognostics approach based on hybrid feature selection for railway point machine sliding-chair degradation. The proposed approach is composed of two steps.
In step-1, a hybrid feature selection method is developed. It is based on the affinity matrix and inherent feature evaluation. The affinity matrix is built to calculate the features' relative importance weights (RIWs). The inherent feature evaluation deals with the calculation of the monotonicity, correlation and robustness metrics of each feature. Then, a hybrid fitness function is constructed by combining the weighted (with RIWs) inherent feature metrics and the features are ranked accordingly. The features with the highest hybrid ranking value are selected and used in prognostics.
In step-2, a degradation model is defined for each of the selected features and the model parameters are estimated. Then, a likelihood probability of each parameter is calculated by using the estimated model parameters of each feature. Afterward, an adaptive degradation model is constructed by using the weighted combination of the estimated model parameters with the likelihood probabilities. The adaptive degradation model parameters are estimated and updated at each prediction time, iteratively, to estimate the RUL.
The paper contains four sections. After the introduction, Section 2 describes the main steps of the proposed prognostics approach. Section 3 presents the experimental rig setup, data collection and the results of the proposed approach. Section 4 concludes the paper.

PROPOSED APPROACH
In this section, the hybrid feature selection and the adaptive prognostics steps are explained in detail. The overall scheme of the proposed approach is depicted in Figure 1.
Figure 1. Overall scheme of the proposed approach: physical system, dry sliding-chair failure mode modelling, extracted features, Step-1: hybrid feature selection, Step-2: prognostics (model definition, RUL prediction).

Hybrid feature selection
First, time-domain features such as skewness, root mean square (rms), kurtosis, mean, standard deviation (stdev), variance (var), crest factor (crfactor) and peak-to-peak (p2p) are extracted from the raw measurements. The features in different scales are normalized before selection by using equations (1) and (2).
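The extraction step can be illustrated with a minimal sketch. It computes the eight named features for one raw signal with NumPy. The function names, the synthetic signal and, in particular, the min-max scaling are assumptions made for illustration, since equations (1) and (2) are not reproduced in this text.

```python
import numpy as np

def time_domain_features(x):
    """Return the eight time-domain features named in the text for one raw signal."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)              # sample standard deviation
    centered = x - mean
    m2 = np.mean(centered ** 2)      # second central moment
    rms = np.sqrt(np.mean(x ** 2))
    return {
        "skewness": np.mean(centered ** 3) / m2 ** 1.5,
        "rms": rms,
        "kurtosis": np.mean(centered ** 4) / m2 ** 2,
        "mean": mean,
        "stdev": std,
        "var": std ** 2,
        "crfactor": np.max(np.abs(x)) / rms,
        "p2p": x.max() - x.min(),
    }

def min_max_normalize(feature):
    """ASSUMPTION: min-max scaling to [0, 1]; the paper's equations (1)-(2) may differ."""
    f = np.asarray(feature, dtype=float)
    span = f.max() - f.min()
    return (f - f.min()) / span if span > 0 else np.zeros_like(f)

# Synthetic stand-in for one force measurement per point machine movement
rng = np.random.default_rng(0)
cycles = [np.sin(np.linspace(0, 6, 200)) * (1 + 0.02 * k)
          + 0.05 * rng.standard_normal(200) for k in range(30)]

# One feature value per cycle gives a degradation trend for that feature
rms_trend = np.array([time_domain_features(c)["rms"] for c in cycles])
rms_norm = min_max_normalize(rms_trend)
```

Each feature then forms one time series over the cycles, which is the input expected by the selection step.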
The hybrid feature selection is carried out in two steps. In step-1, the affinity matrix (4) is built using the Euclidean distance (3). In equation (3), $n$ is the length of the given features $f_i$ and $f_j$. In the affinity matrix (4), $d(f_i, f_j)$ is the Euclidean distance between the features $f_i$ and $f_j$ from the feature population with a size of $N$. The relative importance weight $w_i$ of the $i$-th ($\forall i = 1 \ldots N$) feature is then derived by using the exponential membership function (5).
Next, the inherent metrics are calculated for each feature. In the monotonicity metric (6), $Mon_i$ is the monotonicity value for the $i$-th feature ($f_i$) with length $n$, based on the absolute value of the difference between the number of positive and the number of negative derivatives of the feature. In the correlation metric (7), $cov$ is the covariance of the $i$-th feature ($f_i$) with the time vector $t$, and $\sigma$ is the standard deviation. The robustness metric stands for the feature's resistance to measurement noise and is calculated by decomposing the feature into trend ($smooth_{f_i}$) and residual ($res_{f_i}$) components by using equations (8) and (9).
$res_{f_i} = f_i - smooth_{f_i}$ (8)
where $smooth_{f_i}$ is the smoothed feature and $n$ is the length of the $i$-th feature ($f_i$). Then, the hybrid ranking function is built by using equation (10), which is the combination of the inherent metrics weighted by the corresponding relative importance weights.
Finally, the hybrid ranking vector is sorted in descending order, from the most relevant feature to the least relevant feature. Once the feature ranking step is completed, the top-ranked ($N_s$) features are selected and used in prognostics.
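The selection procedure can be sketched end to end as follows. This is an illustration under stated assumptions, not the authors' implementation: the exponential membership form, the moving-average smoother, the robustness formula and the way the weighted metrics are combined are all hypothetical choices, because equations (5), (9) and (10) are not reproduced in this text. Only the Euclidean distance, the sign-count monotonicity idea, the covariance-based correlation and the residual definition (8) follow the description above.

```python
import numpy as np

def affinity_matrix(F):
    """Pairwise Euclidean distances between the rows (features) of F."""
    diff = F[:, None, :] - F[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def relative_importance_weights(D):
    """ASSUMED exponential membership: features close to the others weigh more."""
    mean_dist = D.sum(axis=1) / (D.shape[0] - 1)
    w = np.exp(-mean_dist)
    return w / w.sum()

def monotonicity(f):
    """|#positive differences - #negative differences| / (n - 1)."""
    d = np.diff(f)
    return abs((d > 0).sum() - (d < 0).sum()) / (len(f) - 1)

def correlation(f):
    """Absolute correlation of the feature with the time vector."""
    t = np.arange(len(f))
    return abs(np.corrcoef(f, t)[0, 1])

def robustness(f, window=5):
    """ASSUMED moving-average trend; residual = feature - smoothed feature (eq. 8)."""
    kernel = np.ones(window) / window
    padded = np.pad(f, window // 2, mode="edge")
    smooth = np.convolve(padded, kernel, mode="valid")
    res = f - smooth
    return np.mean(np.exp(-np.abs(res / (np.abs(f) + 1e-12))))

def hybrid_rank(F):
    """ASSUMED combination: weight times the sum of the three inherent metrics."""
    w = relative_importance_weights(affinity_matrix(F))
    metrics = np.array([[monotonicity(f), correlation(f), robustness(f)] for f in F])
    return w * metrics.sum(axis=1)

# Three synthetic normalized features: two trending, one pure noise
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)
F = np.vstack([t ** 2, t ** 3, rng.uniform(0.0, 1.0, 50)])

scores = hybrid_rank(F)
ranking = np.argsort(scores)[::-1]   # descending: most relevant feature first
selected = ranking[:2]               # keep the top-ranked features
```

In this toy case the noise feature is ranked last, since it is far from the other two features and scores poorly on monotonicity and correlation.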

Adaptive prognostics approach
In this study, a polynomial function of degree 3 is used to model the sliding-chair degradation due to its good degradation representability. This model is given in equation (11). The steps of the adaptive prognostics approach are illustrated in Figure 2.
$y(t) = a \times t^3 + b \times t^2 + c \times t + d$ (11)
where $y(t)$ is the model output at time $t$ and $a$, $b$, $c$, $d$ are the model parameters to be estimated. The model parameters of each of the selected features are estimated by using the MATLAB curve fitting toolbox. Then, the estimated parameters are used to build an adaptive degradation model for RUL prediction.
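As a rough illustration of the fitting step, the sketch below estimates the four parameters of a degree-3 polynomial by least squares and reports R². It uses `numpy.polyfit` as a stand-in for the MATLAB curve fitting toolbox named in the text; the function name `fit_cubic`, the synthetic feature and the coefficient values are hypothetical.

```python
import numpy as np

def fit_cubic(t, y):
    """Least-squares fit of y(t) = a*t^3 + b*t^2 + c*t + d.

    Returns the coefficients ordered (a, b, c, d) and the R^2 of the fit.
    """
    coeffs = np.polyfit(t, y, deg=3)          # highest degree first
    y_hat = np.polyval(coeffs, t)
    ss_res = np.sum((y - y_hat) ** 2)         # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)    # total sum of squares
    return coeffs, 1.0 - ss_res / ss_tot

# Synthetic degradation feature: cubic trend plus measurement noise
rng = np.random.default_rng(0)
t = np.arange(100, dtype=float)
true_params = np.array([2e-6, -1e-4, 5e-3, 0.1])   # hypothetical a, b, c, d
y = np.polyval(true_params, t) + 0.01 * rng.standard_normal(t.size)

params, r2 = fit_cubic(t, y)
a, b, c, d = params
```

Confidence intervals on the parameters, which the paper reports alongside the estimates, are not computed in this sketch.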
A similar work, based on the Dempster-Shafer evidence theory, was proposed in (He, Williard, Osterman, & Pecht, 2011) to build a prior model for battery degradation. A belief measure was assigned to each of the estimated parameters of the corresponding feature by comparing the parameter confidence intervals. The basic idea was to assign more belief weight to the parameter interval that includes the other parameter intervals when combining the parameters. However, the disadvantage of this evidence-theory-based approach is that if there are no such interval subsets, it combines the parameters with equal weights, resulting in a simple (equally weighted) arithmetic mean combination.
The difference between our approach and the work in (He et al., 2011) is that the calculation of the parameter likelihood weights is not limited to the confidence interval length. Instead, in our work, the estimated parameters get varying likelihood weights, as follows:
• If there is no parameter confidence interval that includes the other parameter intervals, then each parameter gets a likelihood weight proportional to its value.
• If one of the parameter intervals includes the other parameter interval(s) or has a wider interval length, then more likelihood weight is assigned to this parameter.
Note that our approach does not compare the parameter intervals, but only the likelihoods of the estimated parameters. If one of the estimated parameters is bigger than the others, then, theoretically, it should have the wider confidence interval (i.e. the estimated parameter is the mean of the confidence interval estimates).
Figure 2. The adaptive prognostics approach steps.
Let us assume that there are $N_s$ selected features and that $\theta_{j,p} = \{a_j, b_j, c_j, d_j;\ p = 1, \ldots, 4\}$ are the estimated initial parameters from each feature's degradation model. Then, the likelihood probability weight for $\theta_{j,p=1}$ (i.e. the 1st estimated parameter of model $j$) is calculated by using equation (12).
The same equation (12) is used to build the likelihood probability weights for the other parameters ($b_j$, $c_j$, $d_j$).
After the calculation of the $\ell_{j,p}$ values, the adaptive degradation model parameters $\theta_1$, $\theta_2$, $\theta_3$, $\theta_4$ can be estimated by the weighted arithmetic mean function, which is given in (13).
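The combination step can be sketched as below. Equation (12) is not reproduced in this text, so the weight definition here is an assumption: following the statement that each parameter gets a likelihood weight proportional to its value, the weight of a parameter is taken as its magnitude divided by the sum of magnitudes of the same parameter across the selected features. The combined parameter is then the weighted arithmetic mean. Function names and the numeric values are hypothetical.

```python
import numpy as np

def likelihood_weights(theta):
    """ASSUMED form of eq. (12): per-parameter weights proportional to |value|.

    theta has shape (n_features, 4), one row [a, b, c, d] per selected feature.
    Each column of the returned array sums to 1.
    """
    mag = np.abs(np.asarray(theta, dtype=float))
    total = mag.sum(axis=0, keepdims=True)
    n = mag.shape[0]
    # Fall back to equal weights if a parameter is zero for every feature
    return np.where(total > 0, mag / np.where(total > 0, total, 1.0), 1.0 / n)

def combine_parameters(theta):
    """Weighted arithmetic mean of each parameter across features (eq. 13)."""
    theta = np.asarray(theta, dtype=float)
    return (likelihood_weights(theta) * theta).sum(axis=0)

# Hypothetical initial estimates for three selected features
theta = np.array([
    [2.0e-6, -1.0e-4, 5.0e-3, 0.10],
    [3.0e-6, -1.5e-4, 4.0e-3, 0.12],
    [1.0e-6, -0.5e-4, 6.0e-3, 0.08],
])
weights = likelihood_weights(theta)
theta_adaptive = combine_parameters(theta)   # [a, b, c, d] of the adaptive model
```

Because the weights in each column sum to one, every combined parameter lies between the smallest and largest estimate of that parameter.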
The adaptive degradation parameters are updated at each time stamp, and then the adaptive RUL is estimated. The RUL prediction accuracy ($Acc$) is calculated by using equation (14) (Tobon-Mejia, Medjaher, & Zerhouni, 2012).
In equation (14), $N$ is the number of data points used in RUL prediction.
$Acc$ equals 1 for the best prediction performance and 0 for the worst.
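A small sketch of an accuracy measure with these properties is given below. Equation (14) is not reproduced in this text, so the exponential relative-error form used here is an assumption about what the cited reference defines; it does match the stated behaviour of approaching 1 for perfect predictions and 0 for very poor ones. The function name and the sample values are hypothetical.

```python
import numpy as np

def rul_accuracy(rul_true, rul_pred):
    """ASSUMED form: mean over N points of exp(-|RUL_true - RUL_pred| / RUL_true).

    Requires strictly positive true RUL values.
    """
    rul_true = np.asarray(rul_true, dtype=float)
    rul_pred = np.asarray(rul_pred, dtype=float)
    rel_err = np.abs(rul_true - rul_pred) / rul_true
    return float(np.mean(np.exp(-rel_err)))

# True RUL counts down linearly; predictions carry some error
rul_true = np.array([30.0, 25.0, 20.0, 15.0, 10.0, 5.0])
rul_pred = np.array([36.0, 27.0, 21.0, 14.0, 10.0, 6.0])
acc = rul_accuracy(rul_true, rul_pred)
```

Note that the same absolute error is penalized more heavily close to the end of life, where the true RUL is small.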

APPLICATION AND RESULTS
This section explains the experimental rig setup and data collection procedures for the point machine and presents the results of the proposed approach.

System description and data collection
In this study, we investigated the dry sliding-chair failure mode of the point machine, which is generated by an accelerated aging procedure (i.e. a manual contamination process such as soiling or scratching out the grease) of the sliding-chair plates. Sliding-chair plates are the metal assets of the turnout system that assist the point machine drive rods in moving the rail blades easily. The dry sliding-chair degradation data were generated on a real turnout system with 12 sliding-chair plates in total. At first, all 12 plates were individually lubricated and the point machine was run 10 times in each movement to get the first healthy (fault-free) measurements.
Afterward, the accelerated aging procedure took place by contaminating the three plates farthest from the point machine (the 10th, 11th, and 12th) to get an initial faulty state. The second faulty state was generated by contaminating the 9th plate after the first process. After each step of the contamination process, the point machine was run 10 times from normal-to-reverse (forth) and reverse-to-normal (back) positions to collect the measurements. The contamination on the sliding-chair plates results in variation of the performance measurement signals (e.g. force, current, voltage, etc.) due to the increasing friction force against the turnout driving rod force applied to move the blades. The accelerated aging procedure was repeated until a final and complete sliding-chair failure state was reached. Note that no trains went through the turnout system during the data acquisition operation. It was temporarily reserved for experimentation purposes only. The force and current sensor measurements are the most commonly used data in the literature for point machine diagnostics and prognostics (García Márquez & Schmid, 2007). In this study, the force measurements are used to validate the proposed approach.

Results and Discussions
Figure 4 shows the extracted features and normalized features from the raw measurements (equations ( 1) and ( 2)).
Table 3 shows the model goodness-of-fit statistics (R²) and the estimated parameters for each of the features, obtained by using the MATLAB curve fitting toolbox. The R² statistics indicate that the polynomial model is suitable to represent the degradation of the sliding-chair plate. Before triggering the prognostics tool, a faulty state should first be detected from the degradation data. In this paper, the faulty state was obtained by projecting the F5-F3-F8 feature combination in the representation space (Soualhi, Medjaher, & Zerhouni, 2015), as depicted in Figure 5. The representation space of the feature combination allows identifying the health state transitions of the sliding-chair degradation. Since the training features F5, F3 and F8 have correlated degradation patterns, it was assumed that the incipient fault occurs at the same cycle number for all of them. After the detection of the incipient fault, which is at cycle 69, the three feature degradation models are trained and the initial parameters are estimated as shown in Table 4.
From the initial parameter estimates (see Table 5), the combined parameters can be estimated by using equation (13). The four combined parameters of the adaptive degradation model are given in Table 6. Figure 6 shows that the combined parameters and their confidence intervals (C.I.s) adapt to the changes in the initial model parameter estimates and their C.I.s. The parameter updating is repeated iteratively as new data points become available until the end-of-life (EoL) threshold value is reached (see Figure 2). Figure 7 shows the RUL prediction results for the three individual feature models and the adaptive degradation model, whereas Table 7 presents the corresponding RUL prediction accuracies. As can be seen from Table 7, the proposed adaptive prognostics approach improved the RUL prediction accuracy, which supports its applicability to railway point machine monitoring.
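The iterative update and threshold crossing described above can be illustrated with a simplified sketch. For brevity it refits a single cubic model to one synthetic series at each prediction time rather than recombining several feature models, and it assumes an increasing degradation trend. The function name, the threshold value, the search horizon and the data are all hypothetical; only the starting cycle 69 is taken from the text.

```python
import numpy as np

def predict_rul(t_obs, y_obs, threshold, horizon=500):
    """Refit a cubic model and return the number of cycles until it reaches the threshold.

    Returns None if the extrapolated model stays below the threshold within the horizon.
    ASSUMPTION: degradation increases towards the end-of-life threshold.
    """
    coeffs = np.polyfit(t_obs, y_obs, deg=3)
    t_now = t_obs[-1]
    future = np.arange(t_now, t_now + horizon + 1)
    crossed = np.nonzero(np.polyval(coeffs, future) >= threshold)[0]
    if crossed.size == 0:
        return None
    return int(future[crossed[0]] - t_now)

# Synthetic degradation series with measurement noise
rng = np.random.default_rng(0)
t_all = np.arange(100)
y_all = 1e-6 * t_all.astype(float) ** 3 + 0.005 * rng.standard_normal(100)
threshold = 0.9   # hypothetical end-of-life level

# Update the model as new data points arrive, starting after fault detection
for t_now in range(69, 100, 10):
    rul = predict_rul(t_all[: t_now + 1], y_all[: t_now + 1], threshold)
    print(f"cycle {t_now}: predicted RUL = {rul}")
```

Searching a grid of future cycles avoids having to pick the right root of the cubic, at the cost of a fixed look-ahead horizon.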

CONCLUSION
In this paper, a new adaptive prognostics approach based on a hybrid feature selection method was proposed for point machine sliding-chair monitoring. A polynomial model was defined for each of the selected features and the model parameters were estimated. Then, the adaptive degradation model was built based on the likelihood probability weights calculated by using the initial model parameter estimates of the selected prognostics features. The model parameters were updated iteratively and the RULs were estimated. The results showed that the proposed prognostics approach improved the RUL prediction accuracy for the sliding-chair degradation.
As a future work, we plan to extend the proposed approach and to develop an adaptive system-level prognostics approach based on the extracted features from different components for condition monitoring and predictive maintenance.
Figure 3 a) shows the in-field experimental test-rig setup, Figure 3 b) the turnout system and Figure 3 c) the installed sensors for data acquisition.

Figure 3. a) Experimental setup, b) railway turnout system and c) installed sensors.
Figure 4. a) Raw measurements, b) extracted features and c) normalized features.

Figure 5. State detection by representation space projection.
Using equation (12), the likelihood probability weights of the training model parameters were calculated and the results are given in Table 5.

Table 3. Estimated parameters including the 95% confidence interval bounds with R-statistics.

Table 4. Initial parameter estimates after fault detection.

Table 6. Combined adaptive degradation model parameters.