Anomaly Detection and Severity Prediction of Air Leakage in Train Braking Pipes

Air leakage in braking pipes is a commonly encountered mechanical defect on trains. A severe air leakage leads to braking issues, which decreases reliability and causes train delays or stranding. However, air leakage is difficult to detect by visual inspection, and most air leakage defects are therefore run to failure. In this research we present a framework that not only detects air leakages but also predicts their severity, so that action plans can be determined based on the severity level. The proposed contextual anomaly detection method detects air leakages based on the on/off logs of a compressor. Air leakage causes failure in the context where the compressor idle time is shorter than the compressor run time, that is, where air is consumed faster than it is generated. In our method, a logistic regression classifier is adopted to model the two classes of compressor behavior for each train separately. The classifier defines the boundary separating the two classes under normal conditions and models the distributions of the compressor idle time and run time using logistic functions. An air leakage anomaly is then detected in the context where a compressor idle time is erroneously classified as a compressor run time. To distinguish anomalies from outliers and to detect anomalies according to their severity, a density-based clustering method with a dynamic density threshold is developed. The results demonstrate that most air leakages can be detected one to four weeks before the braking failure and can therefore be prevented in time. Most importantly, the contextual anomaly detection method pre-filters anomaly candidates and therefore avoids generating false alarms. To facilitate the decision-making process, the logistic function built on the compressor run time is further used together with the duration of an air leakage to model the severity of the air leakage.


INTRODUCTION
Dutch Railways, the principal railway operator in the Netherlands, operates 178 VIRM (lengthened interregional rolling stock) trains, a series of electric multiple unit (EMU) double-deck trains. These trains were built between 1994 and 2009 and carry on-board train management systems that continuously log particular events to a local disk or to a remote disk using wireless data communications. To ensure that all trains are reliable and safe to operate at the lowest cost, Dutch Railways continuously optimizes the maintenance schedule to plan when and what to maintain (de Vos & van Dongen, 2015). Condition-based monitoring data collected from the on-board train management systems enable us to move towards prognostics and health management (Coble, Ramuhalli, Bond, Hines, & Upadhyaya, 2015; Schenkendorf & Groos, 2015) of rolling stock. Condition monitoring (Bartram & Mahadevan, 2015; Eker, Camci, & Jennions, 2014; Prakash, Narasimhan, & Pandey, 2017) is more than detecting train failures or malfunctions. Continuous gathering of data allows for trend analysis over the entire fleet and for data-driven performance improvements, for instance actual state-dependent maintenance (Poot-Geertman, Huisman, & van Rijn, 2015; O'Donovan, Bruton, & O'Sullivan, 2016; Qiao & Weiss, 2016). With the wealth of condition information, static and reactive preventive and corrective maintenance scheduling will be replaced by dynamic and proactive predictive maintenance in the future (Eker et al., 2014; Schenkendorf & Groos, 2015).
Like the heart of a human body, which pumps blood around the body, the compressor of a train pumps air into the braking pipe. The air in the braking pipe is then consumed by operations such as braking, door opening and closing, and bio-reactor usage. Air leakage in braking pipes is a commonly encountered mechanical defect on trains, which occurs roughly 1.5 times per train per year on average. A severe air leakage will lead to braking issues and cause train delays or stranding. On a dense network operating 1,000 trains in the Netherlands, this poses a high risk of disturbing the passenger transportation service. In short, air leakage defects are critical both in business impact and in service reliability. Therefore, checking for air leakage is one of the daily maintenance tasks carried out by technicians in the service depot. Nevertheless, air leakage is one of the most difficult defects to detect by the visual or audio inspection carried out in the workshops. According to historical maintenance records from Dutch Railways, only a very low percentage of air leakages is discovered by daily inspections. The vast majority of air leakages were reported by train drivers during operation, when they were facing braking or door issues that resulted in a train delay or stranding.

Figure 1. Proposed prognostic health management methodology for air leakage in braking pipe.
In this work, we report the new finding that the switch on/off logs of a compressor can be used to detect air leakages in braking pipes. Due to the high variation and noisy nature of the compressor behavior data, clustering techniques (Upadhyaya & Singh, 2012; Behera & Rani, 2016; Khan, Awad, & Thuraisingham, 2007) are considered the most applicable to anomaly detection in our application. Anomaly detection (Chandola, Banerjee, & Kumar, 2009; Behera & Rani, 2016; Khan et al., 2007; Biswas et al., 2016) is widely applied in applications where continuous monitoring is available. The goal is to find variants that differ from normal behaviors. In applications where false positives are very expensive, post-processing or human interaction is often required to eliminate false positives. Contextual anomaly detection (Mahapatra, Srivastava, & Srivastava, 2012; Hayes & Capretz, 2015) is a newly emerging field of study that aims to detect anomalies that occur within the context of other meta-information, such as spatial or temporal information. For instance, a sensor value of 0 might be normal during work hours while it is abnormal during off-work hours. In this study the logistic regression classifier (Mitchell, 1997; Ng & Jordan, 2002) is adopted to build the context of "Compressor Run Time" and "Compressor Idle Time" separately for each train. The context is used to define a threshold that filters out non-targeted regions, because air leakage most likely occurs in regions where "Compressor Idle Time" overlaps with "Compressor Run Time". This threshold differs per train due to differences in configuration, age and usage. The role of the logistic regression classifier is therefore to model the distribution of these two classes for each train separately, identify the decision boundary between them, and use it as the threshold.
However, during normal service, air consumption in the braking pipes can be triggered by activities such as braking, door opening and closing, and bio-reactor usage. The number of carriers of the train also has an impact on the duration of air consumption. Therefore there might be sudden and singular occurrences of fast air consumption due to a sudden increase in demand. This kind of sudden increase in usage demand is defined as an outlier. The most intuitive way to distinguish an anomaly from an outlier is based on density, since a leakage is a mechanical failure that occurs constantly over a certain period of time, whereas a sudden increase in demand is often a single and random event. In this work, we have developed a density-based clustering approach, inspired by DBSCAN (Ester, Kriegel, Sander, & Xu, 1996; Birant & Kut, 2007), to detect regions of high density. These regions indicate the existence of air leakage. To take the severity of an air leakage into account in anomaly detection, we have defined a dynamic density threshold based on the logistic model describing the context. The results of contextual anomaly detection based on the density-based clustering approach suggest that air leakages in braking pipes can be detected one to four weeks before the braking failure.
To facilitate the decision-making process and decide the right time for operations and maintenance actions, the logistic function built on the compressor run time is further used together with the duration of an air leakage to model the severity of the air leakage. The severity level is modeled in such a way that it is always between 0 and 1, and a higher value means a more severe air leakage. By building a prediction model on the severity, the remaining useful life of the air braking pipe until it reaches a certain level of severity can be estimated.
To summarize, the main contributions of this work include:
• enabling the data-driven detection of air leakage in train braking pipes,
• development of a robust contextual anomaly detection method to detect air leakages among different data distributions,
• development of a severity index that indicates the severity level of a detected air leakage to facilitate the decision-making process for maintenance planning.
Our proposed prognostic health management methodology for air leakage in braking pipes is shown in Fig. 1. The lower part consists of the procedures and the upper part contains the proposed methods and models. In addition to the conventional procedures (Coble et al., 2015), we have added a feedback loop to the methodology to enable both validation and continuous improvement of the models. In this paper, the application and development of the logistic regression classifier in the Monitor and Detect procedure and of density-based clustering in the Diagnose procedure are described in detail in Section 2. The development of severity modeling by the logistic function and severity prediction using Linear Regression models is introduced in Section 3. Experimental results on individual trains and over the fleet are presented and discussed in Section 4.

CONTEXTUAL AIR LEAKAGE DETECTION IN BRAKING PIPES
Fig. 2 sketches the mechanism of the air circulation system. The air pressure of the main reservoir on VIRM trains should be kept between 8.5 and 10 bar at all times. When it drops below 8.5 bar, the compressor is switched on to pump air into the main reservoir until the air pressure reaches 10 bar again. After 10 bar is reached, the compressor is switched off. Air in the main reservoir is consumed by the braking pipe during service. The time it takes for a compressor to pump air into the main reservoir is defined as the "Compressor Run Time" in this work. The time it takes for the braking pipe to consume air in the main reservoir while the compressor is switched off is defined as the "Compressor Idle Time". When air is generated more slowly than it is consumed, the air supply to the braking pipes becomes insufficient and braking issues occur. One of the most probable causes of this phenomenon is air leakage in the braking pipe, which is a commonly found mechanical defect on trains. To the best of our knowledge, this is the first work to show that the switch on/off logs of a compressor can be used to detect air leakage in braking pipes. Such a finding is extremely valuable, since air leakage is one of the most difficult defects to detect by the visual or audio inspection carried out in the workshops. By converting the switch on/off logs of a compressor into durations of compressor run time and idle time, air leakage can finally be continuously monitored and detected from data.
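The conversion from switch logs to durations can be sketched as follows. This is a minimal illustration, assuming a hypothetical log format of time-ordered `(timestamp_seconds, state)` pairs with `state` equal to `"on"` or `"off"`; the actual format of the on-board logs is not described here, and the function name `durations_from_log` is invented for this example.

```python
def durations_from_log(events):
    """Split a time-ordered compressor switch log into run and idle durations.

    `events` is a list of (timestamp_seconds, state) pairs with state in
    {"on", "off"}. This log format is a hypothetical stand-in for the real logs.
    """
    run_times, idle_times = [], []
    for (t_prev, s_prev), (t_next, s_next) in zip(events, events[1:]):
        if s_prev == s_next:
            continue  # duplicated state entry: ignored in this sketch
        duration = t_next - t_prev
        if s_prev == "on":
            run_times.append(duration)    # compressor pumped until switched off
        else:
            idle_times.append(duration)   # reservoir drained until switched on
    return run_times, idle_times


# Toy log: on at 0 s, off at 120 s, on again at 720 s, and so on.
log = [(0, "on"), (120, "off"), (720, "on"), (845, "off"), (1100, "on")]
run_times, idle_times = durations_from_log(log)
```

In this toy log the second idle period (255 s) is much shorter than the first (600 s), which is the kind of shortening that the detection method looks for.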
Due to differences in configuration and operational use, the range and distribution of "Compressor Idle Time" and "Compressor Run Time" differ per train. However, the physical observation that an air leakage might exist when air is consumed faster than it is generated applies to all trains. To find the region of interest for air leakage detection for each train, the logistic regression classifier is adopted to build the context of "Compressor Run Time" and "Compressor Idle Time" separately for each train as a two-class problem, where "Compressor Run Time" is the positive class and "Compressor Idle Time" is the negative class. By building the context with logistic models, a threshold for pre-filtering non-targeted regions can be defined at the intersecting point where the probabilities of the positive class and the negative class are both 0.5. Since air leakages most likely occur in regions where "Compressor Idle Time" overlaps with "Compressor Run Time", only "Compressor Idle Time" values for which the logistic model of the positive class gives a probability higher than 0.5 are considered in the clustering procedure.

Learning Context with Logistic Regression Classifier
A logistic regression classifier is a linear model for learning $P(Y|X)$ in the case where $Y$ is the class label and $X = [x_1, x_2, \ldots, x_m]$ is an input data vector. In our application we only consider the case where $Y$ is a boolean variable (two-class problem) and $m$ equals 1, which means the data vector is one-dimensional. The parametric model assumed by logistic regression in the two-class setting is:

$$P(Y=1|X) = \frac{1}{1+\exp\left(w_0+\sum_{i=1}^{m} w_i x_i\right)} \tag{1}$$

and

$$P(Y=0|X) = \frac{\exp\left(w_0+\sum_{i=1}^{m} w_i x_i\right)}{1+\exp\left(w_0+\sum_{i=1}^{m} w_i x_i\right)}. \tag{2}$$

The goal is to learn the parameters $w_j$ for all $j$ from training data. Since the two probabilities in Eq. (1) and Eq. (2) must sum to 1, Eq. (2) can be derived directly from Eq. (1).
In our application, we use the point $\hat{X}$ where $P(Y=0|\hat{X})$ equals $P(Y=1|\hat{X})$ as the threshold. That is the point where

$$\frac{P(Y=0|\hat{X})}{P(Y=1|\hat{X})} = \exp\left(w_0+\sum_{i=1}^{m} w_i \hat{x}_i\right) = 1 \tag{3}$$

and $\hat{X} = [\hat{x}_1]$, since the data vector is one-dimensional in our application. By taking the natural logarithm, this becomes

$$w_0 + w_1 \hat{x}_1 = 0. \tag{4}$$

By transforming Eq. (4), one can derive

$$\hat{x}_1 = -\frac{w_0}{w_1}. \tag{5}$$

Therefore, after learning the parameters $w_j$ for all $j$, the point $\hat{X}$ can also be derived. That is, of all data points in the negative class ("Compressor Idle Time"), only those less than or equal to $\hat{X}$ are included in the clustering procedure for anomaly detection.
Fig. 3 gives an example that illustrates the role of the logistic regression classifier in our application. The overlapping area of the two classes can be observed in their distributions in Fig. 3(a). The logistic regression classifier models these two classes with logistic functions, as shown in Fig. 3(b), to find the most significant point that distinguishes them. In this example, the intersecting point of the two classes is at 497.4, and it is used as the threshold for filtering out any "Compressor Idle Time" with a duration larger than 497.4, since such values are very unlikely to be generated by air leakages.
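The threshold computation and the pre-filtering step can be sketched as below. The sketch assumes a one-dimensional model with the sign convention $P(Y=1|x) = 1/(1+\exp(w_0 + w_1 x))$; the weight values and idle times are invented for illustration and are not those of train 8608, and the function names are hypothetical.

```python
import math


def p_run(x, w0, w1):
    """P(Y=1 | x): probability that a duration x belongs to the run-time class."""
    return 1.0 / (1.0 + math.exp(w0 + w1 * x))


def threshold(w0, w1):
    """Duration at which both classes have probability 0.5 (w0 + w1 * x = 0)."""
    return -w0 / w1


def prefilter(idle_times, w0, w1):
    """Keep only idle times at or below the threshold (leakage candidates)."""
    x_hat = threshold(w0, w1)
    return [x for x in idle_times if x <= x_hat]


# Illustrative weights only (not fitted to any real train).
w0, w1 = -5.0, 0.01
x_hat = threshold(w0, w1)   # about 500 for these weights
candidates = prefilter([250.0, 480.0, 900.0, 1500.0], w0, w1)
```

With a positive $w_1$, the run-time probability decreases as the duration grows, so long idle times fall on the "normal" side of the threshold and are dropped before clustering.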

Training Logistic Regression Classifier
One common approach to training a logistic regression model is to choose parameter values that maximize the probability of the observed $Y$ values in the training data, conditioned on their corresponding $X$ values. That is, to choose parameters $W$ satisfying

$$W \leftarrow \arg\max_{W} \sum_{k} \ln P(Y^k | X^k, W) \tag{6}$$

where $W = [w_0, w_1, \ldots, w_m]$ is the vector of parameters to be estimated, $Y^k$ denotes the observed value of $Y$ in the $k$th training example, and $X^k$ denotes the observed value of $X$ in the $k$th training example. The expression to the right of the argmax is the logarithm of the conditional likelihood.
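By substituting Eq. (1) and Eq. (2), the log of the conditional likelihood $l(W)$ can then be expressed as

$$l(W) = \sum_{k} Y^k \ln P(Y^k=1|X^k,W) + (1-Y^k)\ln P(Y^k=0|X^k,W) = \sum_{k} (1-Y^k)\left(w_0+\sum_{i=1}^{m} w_i x_i^k\right) - \ln\left(1+\exp\left(w_0+\sum_{i=1}^{m} w_i x_i^k\right)\right) \tag{7}$$

where $x_i^k$ denotes the value of $x_i$ for the $k$th training example. However, there is no closed-form solution to maximizing $l(W)$ with respect to $W$, and one common approach is to use gradient ascent. With the sign convention of Eq. (1), the $i$th component of the gradient vector has the form

$$\frac{\partial l(W)}{\partial w_i} = \sum_{k} x_i^k \left(P(Y^k=1|X^k,W) - Y^k\right) \tag{8}$$

where $P(Y^k=1|X^k,W)$ is the prediction result of the logistic regression classifier and $x_0^k = 1$ for the intercept $w_0$. Since the conditional log likelihood is a concave function, this gradient ascent procedure converges to a global maximum. Beginning with initial weights of zero, the weights are iteratively updated with

$$w_i \leftarrow w_i + \eta \sum_{k} x_i^k \left(P(Y^k=1|X^k,W) - Y^k\right) \tag{9}$$

where $\eta$ is the step size, which is often a small constant.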
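The training procedure can be sketched for the one-dimensional case as follows. This is a minimal illustration under stated assumptions: the model follows the convention $P(Y=1|x) = 1/(1+\exp(w_0 + w_1 x))$, under which the ascent direction for $w_i$ is $\sum_k x_i^k (P(Y^k=1|x^k) - Y^k)$; the toy durations, the scaling to units of 100 s, the step size and the iteration count are all invented for this example.

```python
import math


def p_pos(x, w):
    """P(Y=1 | x) = 1 / (1 + exp(w0 + w1 * x))."""
    return 1.0 / (1.0 + math.exp(w[0] + w[1] * x))


def train(xs, ys, eta=0.1, iterations=5000):
    """Batch gradient ascent on the conditional log likelihood, from zero weights."""
    w = [0.0, 0.0]
    n = len(xs)
    for _ in range(iterations):
        # Under this sign convention: dl/dw_i = sum_k x_i^k * (P(Y=1|x^k) - y^k), x_0 = 1.
        errors = [p_pos(x, w) - y for x, y in zip(xs, ys)]
        g0 = sum(errors)
        g1 = sum(e * x for e, x in zip(errors, xs))
        w[0] += eta * g0 / n   # dividing by n only rescales the step size
        w[1] += eta * g1 / n
    return w


# Toy durations in units of 100 s (scaled so that a constant step size behaves well).
run = [1.0, 1.2, 1.5, 2.0, 2.4, 5.2]      # label 1: "Compressor Run Time"
idle = [4.5, 5.5, 6.0, 7.5, 9.0, 12.0]    # label 0: "Compressor Idle Time"
xs = run + idle
ys = [1] * len(run) + [0] * len(idle)
w = train(xs, ys)
x_hat = -w[0] / w[1]   # decision boundary, in the same scaled units
```

The two toy classes overlap slightly (a run time of 5.2 against an idle time of 4.5), so the maximum-likelihood weights stay finite and the boundary settles between the bulk of the two classes.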

A Density-Based Clustering Approach for Anomaly Detection with A Dynamic Density Threshold
For clustering in a noisy dataset, density-based approaches are most commonly adopted. Among them, DBSCAN (Ester et al., 1996; Birant & Kut, 2007) is one of the most well-known approaches. It requires two parameters: $\varepsilon$ and $minPts$. The parameter $\varepsilon$ defines the neighborhood of a considered point, and $minPts$ is the minimum number of points required to form a dense region. DBSCAN starts with an arbitrary starting point that has not been visited. This point's $\varepsilon$-neighborhood is retrieved, and if it contains a sufficient number of points, a cluster is started. Otherwise, the point is considered noise. However, this point might later be found in the $\varepsilon$-neighborhood of a different point that contains a sufficient number of points, and hence be made part of a cluster. If a point is found to be a dense part of a cluster, its $\varepsilon$-neighborhood is also part of that cluster. Hence, all points found within the $\varepsilon$-neighborhood are added, as is their own $\varepsilon$-neighborhood when they are also dense. This procedure iterates until all points are visited.
In the case of air leakage detection, the detection capability in a severe region needs to be higher than that in a less severe region. Therefore we have defined a dynamic density threshold based on the degree of severity. The procedure of our density-based clustering approach for anomaly detection with a dynamic density threshold is as follows:
• Step 1: Use the threshold defined in Section 2.1 to limit the search range of anomalies.
• Step 2: In the region of interest, calculate the neighborhood density of each data point. The neighborhood density of a data point is the number of data points located in its $\varepsilon$-neighborhood region. The $\varepsilon$-neighborhood region of a data point $x_i$ is defined by
$$N_{\varepsilon}(x_i) = \{x_j \mid \mathrm{dist}(x_i, x_j) \le \varepsilon\} \tag{10}$$
where $\varepsilon$ is a user-defined constant.
• Step 3: Classify a data point and all other points located in its $\varepsilon$-neighborhood as anomalies if its neighborhood density is higher than the density threshold. The density threshold should be dynamic and vary with the degree of severity. That is, a more severe air leakage (shorter idle duration) should be more easily detected by giving it a lower density threshold, and vice versa. Given a user-defined density limit $minPts$, the dynamic density threshold $\beta$ is defined in Eq. (11) from $minPts$ and the term $\frac{1}{1+\exp\left(w_0+\sum_{i=1}^{m} w_i x_i\right)}$, which is adopted from Eq. (1).
• Step 4: For a data point, if its neighborhood density is greater than or equal to its dynamic density threshold $\beta$, this data point and all other data points within its neighborhood are labeled as anomalies.
• Step 5: Data points that share equivalence relations form a connected component, which is an irregularly shaped cluster $C_k$.
In detecting mechanical defects, occurrence frequency and severity are two of the factors of greatest concern. The proposed density-based clustering captures both by setting dynamic density thresholds.
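Steps 1 to 5 can be sketched as below. This is a hedged illustration, not the exact procedure: the expression for the dynamic threshold $\beta$ is not reproduced in this text, so the sketch takes it as a caller-supplied function and uses `min_pts * (1 - P(Y=1|x))` purely as a placeholder that has the required behavior (a lower threshold for shorter idle times). The box-shaped two-dimensional neighborhood, the weights, the toy points and all function names are likewise assumptions.

```python
import math


def p_run(x, w0, w1):
    """P(Y=1 | x) with the convention P = 1 / (1 + exp(w0 + w1 * x))."""
    return 1.0 / (1.0 + math.exp(w0 + w1 * x))


def placeholder_beta(x, w0, w1, min_pts):
    """ASSUMED threshold form: lower for shorter (more severe) idle times."""
    return min_pts * (1.0 - p_run(x, w0, w1))


def neighbors(points, i, eps_time, eps_day):
    """Indices inside an axis-aligned box around point i (box shape is an assumption)."""
    xi, di = points[i]
    return [j for j, (xj, dj) in enumerate(points)
            if abs(xj - xi) <= eps_time and abs(dj - di) <= eps_day]


def detect(points, w0, w1, eps_time, eps_day, min_pts, beta_fn=placeholder_beta):
    """Return clusters (sorted lists of indices into `points`) flagged as anomalies."""
    x_hat = -w0 / w1
    # Step 1: only idle times at or below the threshold are candidates.
    cand = [i for i, (x, _) in enumerate(points) if x <= x_hat]
    sub = [points[i] for i in cand]
    adj = {}
    for a in range(len(sub)):
        nb = neighbors(sub, a, eps_time, eps_day)    # Step 2: density, point itself included
        if len(nb) >= beta_fn(sub[a][0], w0, w1, min_pts):   # Steps 3 and 4
            for b in nb:
                adj.setdefault(a, set()).add(b)
                adj.setdefault(b, set()).add(a)
    # Step 5: connected components over the flagged points.
    clusters, seen = [], set()
    for start in sorted(adj):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            a = stack.pop()
            if a not in comp:
                comp.add(a)
                stack.extend(adj[a] - comp)
        seen |= comp
        clusters.append(sorted(cand[a] for a in comp))
    return clusters


# Toy (idle_time_seconds, day_number) points: one dense group, two isolated outliers.
points = [(900.0, 10), (380.0, 10), (390.0, 10), (370.0, 11), (1200.0, 11),
          (385.0, 12), (400.0, 12), (450.0, 40), (480.0, 25)]
clusters = detect(points, w0=-5.0, w1=0.01, eps_time=100.0, eps_day=2, min_pts=6)
```

In this toy run the five short idle times on days 10 to 12 form one cluster, while the isolated short idle times on days 25 and 40 are left out as outliers. Note that the placeholder threshold approaches zero for very short idle times, so a different form of $\beta$ may behave differently at the severe end.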

SEVERITY MODELING AND PREDICTION
With the diagnostic procedures described in the previous section, air leakages can often be detected one to four weeks before failure. However, for operations and maintenance planning, an exact indication for alarm triggering and action is usually desired. Therefore, in order to define the moment for action, a severity index $S$ is proposed in this work. It is designed by combining the logistic function and the duration of an air leakage, as shown in Eq. (12).
In Eq. (12), $t_p$ is the current time stamp of incoming data points, $t_0$ is the oldest time stamp of all data points $x_i$ in $C_k$, and $D$ is the pre-defined duration of an observation period. That is, $S^{t_p}_{C_k}$ is the severity index of cluster $C_k$ at time stamp $t_p$. Since $S^{t_p}_{C_k}$ is always between 0 and 1, a threshold $\theta$ can easily be chosen for decision making. The purpose of the observation period is to reduce the possibility of unnecessary alarms. The necessity of the observation period $D$ may differ per application, and its duration depends on the operational and maintenance planning requirements of the train operator.
Given the current severity index, a prediction model $f(\cdot)$ can be designed to estimate the severity index in the upcoming hours as

$$\hat{S}^{t_{p+1}}_{C_k} = f\left(S^{t_p}_{C_k}\right) \tag{13}$$

where $\hat{S}^{t_{p+1}}_{C_k}$ is the estimated severity index of cluster $C_k$ for time stamp $t_{p+1}$. In this application, the Linear Regression model is adopted for predicting the severity indexes. The time horizon in which the severity index is predicted to reach the action threshold $\theta$ is considered the remaining useful life of the air circulation system.

EXPERIMENTAL RESULTS
From 178 VIRM trains, 632,683 data points were collected in the period from May 2015 to October 2016, of which 6,957 are labeled as "Air Leakage" and 625,726 are labeled as "Normal". To be precise, each data point is the median "Compressor Run Time" or median "Compressor Idle Time" of a specific compressor within one hour. The number of compressor switch on/off logs per day differs between trains (ranging from 30 to 120 counts a day), depending on the number of carriers, the compressor type, the bio-reactor and so on. In order to build a generic detection algorithm over the entire fleet, a more universal unit is required to construct a stable system, and therefore time (in this case, the hour) is used as the unit rather than the individual occurrence. The data labels are derived from maintenance records of air leakages in the same period. Of these 178 trains, 55 are equipped with real-time monitoring systems, and their data were sent via a 4G network directly to the data center. For the remaining 123 trains, the data were read out physically with laptops in the maintenance depot. Due to these manual operations, there were sometimes gaps of weeks or months between data records.
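The hourly aggregation can be sketched as follows. This is a minimal illustration, assuming hypothetical `(timestamp_seconds, duration_seconds)` samples and assuming that a duration is assigned to the hour in which it was logged; neither detail is specified in the text, and the function name is invented.

```python
from collections import defaultdict
from statistics import median


def hourly_medians(samples):
    """Map each hour index to the median duration observed in that hour.

    `samples` is a list of (timestamp_seconds, duration_seconds) pairs.
    Assigning a duration to the hour of its timestamp is an assumption.
    """
    buckets = defaultdict(list)
    for t, duration in samples:
        buckets[int(t // 3600)].append(duration)
    return {hour: median(values) for hour, values in sorted(buckets.items())}


# Toy idle-time samples: three in the first hour, two in the second.
idle_samples = [(100, 600), (1500, 580), (3000, 640), (4000, 300), (6500, 320)]
points = hourly_medians(idle_samples)
```

Using the median rather than the mean keeps a single unusually short or long duration within an hour from dominating that hour's data point.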

Diagnostics
In the experiments, a logistic regression classifier was built for each train to derive the filtering threshold on "Compressor Idle Time". For anomaly detection, our density-based clustering approach finds clusters in a two-dimensional dataset consisting of two features, i.e., "Compressor Idle Time" and the date converted into the number of days from January 0, 0000. The $\varepsilon$-neighborhood of a data point for the clustering procedure is therefore two-dimensional, with $\varepsilon_1$ for "Compressor Idle Time" being 0.2 times the filtering threshold $\hat{X}$ and $\varepsilon_2$ for the number of days being 2 days. The density limit $minPts$ is set to 20 for computing the dynamic density threshold $\beta$. The selection of these $\varepsilon$ and $minPts$ values is based on both engineering principles and optimization against the confusion matrix (detection capability against false alarms). The considered ranges of $\varepsilon$ and $minPts$ need to be physically sensible, and different combinations of these parameters sampled within the ranges should then be tested in order to derive the optimal combination.
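Testing parameter combinations sampled within chosen ranges can be sketched as a small grid search. Everything below is illustrative: the candidate ranges are hypothetical, and `toy_score` is only a placeholder for a real score computed from the confusion matrix (for example detection rate traded off against false alarms), which is not specified in this text.

```python
from itertools import product


def grid_search(eps_factors, eps_days, min_pts_values, score):
    """Return the (eps_factor, eps_day, min_pts) combination with the highest score."""
    return max(product(eps_factors, eps_days, min_pts_values),
               key=lambda params: score(*params))


def toy_score(eps_factor, eps_day, min_pts):
    """Placeholder score; a real one would be derived from the confusion matrix."""
    return -abs(eps_factor - 0.2) - abs(eps_day - 2) - abs(min_pts - 20) / 10


# Hypothetical candidate ranges around physically sensible values.
best = grid_search([0.1, 0.2, 0.3], [1, 2, 3], [10, 20, 30], toy_score)

# The idle-time radius in seconds for a threshold of 497.4 (the train 8608 example).
eps_time = best[0] * 497.4
```

The placeholder score is constructed to peak at the reported settings, so the result only demonstrates the mechanics of the search, not an independent confirmation of those settings.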
The original "Compressor Run Time" and "Compressor Idle Time" data, the thresholds derived from the logistic regression classifiers, and the results of contextual anomaly detection for four trains with numbers 8608, 9580, 8652 and 8640 are presented in Fig. 4, Fig. 5, Fig. 6 and Fig. 7, respectively. From the figures, it can be observed that the logistic regression classifier identifies a proper boundary separating the "Compressor Run Time" and "Compressor Idle Time". Note that the percentage of air leakage data points is relatively small, and therefore it generally does not have a large impact on the logistic regression classifier.
In order to verify the effectiveness of pre-filtering using the logistic regression classifier, a baseline anomaly detection procedure is compared with the proposed procedure. The baseline procedure first applies the DBSCAN clustering approach to find the dense regions among all "Compressor Idle Time" data points, as shown in Fig. 8(a). The logistic regression classifier is then used for post-filtering to remove detected points above the threshold, as shown in Fig. 8(b). It can be observed in Fig. 8(b) that several points are still wrongly detected as anomalies after post-filtering, due to the high density in normal regions.
The confusion matrix of the results of our proposed contextual anomaly detection is given in Table 1. The values in brackets are percentages. The confusion matrix shows that our method for contextual air leakage detection in train braking pipes not only has a high detection rate of 84% but also a very low false alarm ratio. Without the context modeled by the logistic regression classifier, a density-based clustering approach produces a large number of false alarms. Even with a post-filtering threshold applied, as in the baseline anomaly detection procedure, the number of false alarms remains significant, as shown in Table 2. Moreover, as observed in Fig. 9, the proposed anomaly detection procedure is computationally much more efficient than the baseline procedure. The proposed procedure pre-filters the "Compressor Idle Time", which results in a small subset of data points being considered for the density-based clustering, whereas the baseline procedure uses all "Compressor Idle Time" data points in density-based clustering.
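The 84% figure can be reproduced from the counts reported in the text, as the short check below shows. It uses only two numbers that appear in the paper (6,957 data points labeled "Air Leakage" and 5,844 correctly detected air leakage data points); the variable names are invented, and the false alarm counts of Table 1 are not available here, so they are not computed.

```python
labeled_leakage = 6957    # data points labeled "Air Leakage"
detected_leakage = 5844   # correctly detected air leakage data points

missed_leakage = labeled_leakage - detected_leakage
detection_rate = detected_leakage / labeled_leakage

print(f"missed: {missed_leakage}, detection rate: {detection_rate:.2%}")
```

The remaining 1,113 labeled data points are the air leakage points that the procedure did not flag.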

Prognostics
To incorporate the operational and maintenance planning requirements of Dutch Railways, the observation period $D$ is set to 7 days in the experiments. The resulting severity indexes derived from Eq. (12) for various air leakage cases are shown in Fig. 10.
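To be able to estimate the remaining useful life, the severity indexes of all 5,844 correctly detected air leakage data points listed in Table 1 are used to train the Linear Regression model, which is evaluated on Root-Mean-Square Error (RMSE) by using the data of the previous 5 hours to estimate the severity index of the upcoming hour. That is,

$$\hat{S}^{t_{p+1}}_{C_k} = f\left(S^{t_p}_{C_k}, S^{t_{p-1}}_{C_k}, \ldots, S^{t_{p-4}}_{C_k}\right). \tag{14}$$

This procedure can be iterated by using the previously estimated output as input, such as

$$\hat{S}^{t_{p+2}}_{C_k} = f\left(\hat{S}^{t_{p+1}}_{C_k}, S^{t_p}_{C_k}, \ldots, S^{t_{p-3}}_{C_k}\right), \tag{15}$$

up to a certain time horizon or until the pre-defined severity threshold $\theta$ is reached.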
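The iterated one-step prediction can be sketched as follows. This is a minimal illustration, assuming a hypothetical linear one-step model whose coefficients are chosen by hand rather than fitted, and toy severity values; the names `forecast`, `linear_model` and `COEFFS` are invented for this example.

```python
def forecast(history, f, horizon, theta=None, lags=5):
    """Iterate a one-step model f, feeding each estimate back in as input.

    Returns the list of estimates; stops early once an estimate reaches theta.
    """
    window = list(history[-lags:])
    estimates = []
    for _ in range(horizon):
        s_next = f(window)
        estimates.append(s_next)
        if theta is not None and s_next >= theta:
            break
        window = window[1:] + [s_next]   # drop the oldest value, append the estimate
    return estimates


# Hypothetical linear one-step model (hand-picked coefficients, not fitted):
# it extrapolates the average slope of the last five hourly values.
COEFFS = [-0.25, 0.0, 0.0, 0.0, 1.25]


def linear_model(window):
    return sum(c * s for c, s in zip(COEFFS, window))


observed = [0.25, 0.28125, 0.3125, 0.34375, 0.375]   # toy severity indexes, oldest first
estimates = forecast(observed, linear_model, horizon=320, theta=0.8)
hours_to_threshold = len(estimates)   # 14 operational hours in this toy case
```

The number of iterations needed to reach $\theta$ plays the role of the predicted remaining useful life in operational hours. If the threshold is not reached within the horizon, the list simply has `horizon` entries, so a caller would need to check the last estimate against $\theta$.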
To check for collinearity issues in the input features, the experimental results of Ridge Regression are provided in Fig. 11 in addition to those of Linear Regression. No collinearity issue is observed for severity prediction, since the RMSE values of Linear Regression and Ridge Regression are very similar. In Fig. 11, the 10-fold RMSE of the Linear Regression model in predicting the severity index from the first to the fifteenth upcoming (operational) hour increases proportionally from 0.01 to 0.09. This indicates that the certainty of the severity prediction decreases as the prediction time horizon increases. However, the error remains reasonable within 15 operational hours, which is about 2 calendar days, since a train in the Netherlands is in operation for about 8 hours each day.
The methodology of remaining useful life prediction for an air leakage is illustrated in Fig. 12. After an air leakage had been detected for 5 hours, the severity indexes of these 5 hours were used to predict the upcoming 320 operational hours. The predicted severity indexes are indicated by the blue solid line in the figure, while the actual severity indexes are indicated by the red solid line. Assuming that the severity threshold $\theta$ is set to 0.8, the severity index is predicted by our method to reach 0.8 after 156 operational hours. Therefore the predicted remaining useful life is 156 operational hours, while the actual remaining useful life is 190 hours.
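The size of this prediction error can be worked out directly from the two reported values. The sketch below uses only numbers given in the text (156 and 190 operational hours, and roughly 8 operational hours per day); the variable names are invented.

```python
predicted_rul_hours = 156   # predicted operational hours until theta = 0.8
actual_rul_hours = 190      # actual operational hours until theta = 0.8
hours_per_day = 8           # approximate daily operation time quoted in the text

error_hours = actual_rul_hours - predicted_rul_hours
relative_error = error_hours / actual_rul_hours
predicted_days = predicted_rul_hours / hours_per_day
actual_days = actual_rul_hours / hours_per_day

print(f"error: {error_hours} h ({relative_error:.1%}), "
      f"{predicted_days} vs {actual_days} calendar days")
```

In this example the prediction is about 18% short of the actual value, which errs on the early side: the threshold is predicted to be reached roughly four calendar days before it actually is.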

DISCUSSION
In this paper, we presented a framework that not only detects air leakages but also predicts their severity, so that action plans can be determined based on the severity level. Air leakage in train braking pipes is detected based on compressor behavior data. In order to avoid false alarms, the logistic regression classifier is adopted to model the context of "Compressor Run Time" and "Compressor Idle Time", and the boundary separating these two classes is used as the threshold for pre-filtering anomaly candidates. In order to detect anomalies according to their severity in a noisy dataset, a density-based clustering approach with a dynamic density threshold is developed. The experimental results demonstrate that our method for contextual air leakage detection detects air leakages effectively with a very low false alarm ratio. To facilitate the decision-making process and define the moment for action in operations and maintenance planning, the logistic function built on the compressor run time is further used together with the duration of an air leakage to model the severity of the air leakage. By building a prediction model on the severity, the remaining useful life of the air braking pipe until it reaches a certain level of severity can be estimated.

ACKNOWLEDGEMENT
The author would like to thank Mark Aalbers and Inge Kalsbeek from the System Engineering Department of NS Techniek for their valuable contributions in explaining the mechanism of the air circulation system in trains.

Figure 2. Illustration of air circulation system.

Figure 3. The (a) distribution in histograms and (b) trained logistic models of "Compressor Run Time" and "Compressor Idle Time" data of train 8608.

Figure 4. The (a) original compressor duration data and (b) result of the proposed air leakage detection of train 8608.
Figure 5. The (a) original compressor duration data and (b) result of the proposed air leakage detection of train 9580.
Figure 6. The (a) original compressor duration data and (b) result of the proposed air leakage detection of train 8652.
Figure 7. The (a) original compressor duration data and (b) result of the proposed air leakage detection of train 8640.

Figure 8. The results of (a) density-based clustering and (b) baseline anomaly detection for train 8652.

Figure 9. Comparison of computational cost between the baseline and our proposed anomaly detection method for air leakage.
Figure 11. Comparison of RMSE in predicting time horizons from 1 to 15 hours using trained Linear Regression and Ridge Regression models.

Table 1. Confusion matrix of the experimental results of the proposed air leakage detection procedure.

Table 2. Confusion matrix of the experimental results of the baseline anomaly detection procedure.
Figure 12. Illustration of remaining useful life prediction for an air leakage.