An application-based comparison of statistical versus deep learning approaches to reciprocating compressor valve condition monitoring

This paper presents a vibration-based condition monitoring approach for early assessment of valve wear in an industrial reciprocating compressor. Valve seat wear is a common fault mode that is caused by repeated impact and accelerated by chatter. Seeded faults consistent with valve seat wear are installed on the head-side discharge valves of a Dresser-Rand ESH-1 industrial reciprocating compressor. Because the vibration of these units is cyclostationary, a time-frequency analysis is employed in which targeted crank angle positions are used to isolate portions of the externally mounted, non-invasive vibration measurements. A region-of-interest (ROI) is then extracted from the time-frequency analysis and used to train a suitably sized convolutional neural network (CNN). The proposed deep learning method is compared against a similarly trained discriminant classifier that uses the same ROIs, with features extracted using texture and shape image statistics. Both methods achieve greater than 90% classification success, with the CNN classification strategy nearing a perfect result.


INTRODUCTION
Current reciprocating compression technology is the culmination of over 100 years of design and manufacturing experience, and reciprocating compressors are among the most widely deployed compressors in industry. They operate reliably at a wide range of pressures, can compress a large variety of gases, and are highly adaptable thanks to multi-stage capabilities. However, reciprocating compressors suffer from relatively high maintenance costs, certainly more so than their centrifugal counterparts. The majority of reciprocating compressor downtime and maintenance costs can be attributed to the compressor valves, which account for 36% of shutdowns and 50% of total repair costs (Schirmer, Fernandes, & Caux, 2004). Condition monitoring of valves and related components can provide a significant reduction in overall maintenance costs and a basis for condition-based maintenance programs.
The most common approach employed in industry for condition monitoring in reciprocating compressors is through the use of the pressure-volume (P-V) curve. When measurement of the P-V diagram deviates from theory certain failure modes are likely, such as chatter or leakage. While these diagrams have proven successful they do require the use of real-time, in-cylinder, pressure measurements that add expense and additional maintenance.
Another common monitoring approach is vibration analysis, which looks for deviations in a machine's typical vibration signature due to a fault condition. The vibration of reciprocating machines is characterized by a series of periodic events (such as piston slap, valve opening and closing, etc.), all of which produce a highly cyclic vibration signature (Randall, 2011). This type of vibration signal is described as cyclostationary: it exhibits periodicity in its energy profile, with key characteristics that can be used to identify statistically significant variation due to changes in operating condition (Antoni, 2009). Due to the cyclostationary nature of the measurement, time-frequency transforms are computed in the cycle domain rather than the time domain. The first half of this research is based on the compressor's cyclostationary nature and the subsequent time-frequency analysis.
A variety of research has investigated valve fault detection in reciprocating compressors. (Liang, Gu, Ball, & Henry, 1996) developed a procedure to detect valve faults using the smoothed-pseudo Wigner-Ville Distribution, which revealed characteristic patterns due to impact response vibration. (Elhaj, Gu, Ball, Shi, & Wright, 2001) investigated early detection of valve leakage through the extraction of detection features using Continuous Wavelet Transforms of both vibration and acoustic measurements. They later combined the monitoring of dynamic cylinder pressure and instantaneous angular speed to develop a reliable means of detecting valve leakage (Elhaj, Almrabet, Rgeai, & Ehtiwesh, 2010). (Zouari et al., 2007; Antoni, 2009) have shown the use of cyclostationary modeling for the purposes of reciprocating machine condition monitoring. In regard to valve faults, they identified simple fault indicators through the use of the Wigner-Ville Spectrum. (Yih-Hwang, Liu, & Wu, 2006, 2009) examined the use of time-frequency analysis for reciprocating compressor vibration signals with a neural network for automated condition classification and later applied this to valve fault classification using seeded faults. Deep learning has recently been implemented in condition monitoring of reciprocating compressor valves. (Liu, Duan, Yuan, Wang, & Zhao, 2019) created a method of fault classification by combining local mean decomposition for processing the vibration signal with a stacked denoising autoencoder for feature extraction. This method was used to detect spring failure, valve fracture, and valve wear and had a classification accuracy of 92.7%. (Guo et al., 2020) utilized a one-dimensional convolutional neural network (1DCNN) with pressure and temperature signals as inputs to classify leakage in a 6-stage reciprocating compressor. The 1DCNN output is fed into the softmax function to identify the stage of the compressor that is leaking, with the results showing 100% accuracy.
In 2013, using the compressor in this work at the RIT Compression Test Cell, (Guerra & Kolodziej, 2014) developed a mechanical-thermodynamic model of the compressor and investigated health monitoring of discharge valves using P-V diagrams, dynamic pressure measurements, and frequency domain analysis. Later, (Kolodziej & Trout, 2018) extended this work into the time-frequency domain using image processing methods, and most recently (Scott & Kolodziej, 2020) investigated fault isolation of valve health across all manifolds, inlet and outlet, using a novelty detection SVM.
The presented work advances previous health monitoring research in the RIT Compression Test Cell (Fig. 1) by incorporating time-frequency analysis of vibration measurements into the detection of valve-related faults. Using time-frequency analysis, two machine learning methods are compared as effective vibration-based methods for early detection of valve wear within industrial reciprocating compressors. Valve seat wear is one of the more common valve-related fault conditions and is investigated at various degrees of severity on the head-side discharge valves of a Dresser-Rand ESH-1 compressor. Using common operational data including vibration, cylinder pressure, and crank shaft position, two condition monitoring methods are developed to classify fault severity. Nominal (healthy) and two levels of degraded (non-healthy) valves are seeded in the compressor, and the operating data are analyzed using time-frequency analysis. The resulting diagrams are processed as images and then used to train two machine learning methods: a statistical Bayesian classifier and a deep learning convolutional neural network (CNN).

COMPRESSOR TEST CELL AT RIT
The experimental test platform used in this work is a Dresser-Rand (now a Siemens business) ESH-1 reciprocating compressor located at the Rochester Institute of Technology's (RIT) Compression Test Cell, shown in Fig. 1. The single-stage, dual-acting compressor, commonly used in the petrochemical industry, was donated by Dresser-Rand and installed at RIT in 2010. One of their smaller industrial compressors, the ESH-1 is driven by a 10-hp electric motor and has a 6-inch piston with a 5-inch stroke that operates nominally at 360 RPM.
The ESH-1 is an intermittent flow, positive displacement air compressor with a single piston which pressurizes cylinders on both sides of the piston head, denoted as the crank-side cylinder and the head-side cylinder. The compressor can be operated under full load (both cylinders), half-load (only crank-side), or no load. Each cylinder has a set of inlet suction valves that allow air to be drawn in at atmospheric pressure, and a set of discharge valves that allow compressed air to be discharged into an anti-pulsation tank. Each valve assembly (Fig. 2) includes 16 individual poppet valves that are spring loaded to keep the valves closed until a pressure differential is achieved. Condition monitoring of these valve-spring assemblies is the focus of this research.
The compression test cell is outfitted with a comprehensive NI-based CompactDAQ system to record measurements during compressor operation. The specific sensors utilized for this work are a triaxial accelerometer (PCB 356A16) mounted magnetically to the head-side discharge valve manifold (Fig. 3), an angular encoder (Photocraft HS20.5QZ) mounted on the main crankshaft, and two in-cylinder pressure transducers (Omega PX309-100AI) to measure both cylinder pressures. Four single-axis, stud-mounted industrial accelerometers (PCB 622B01) are permanently affixed to the valve manifolds but are only used to verify the performance of the magnetically mounted sensor. Measurements are collected at 25.6 kHz using a custom NI LabVIEW interface to view sensor readings and export data for post-processing.

TIME-FREQUENCY ANALYSIS
Traditional spectral analysis techniques, such as the Fourier transform, estimate the frequency content of a signal over its entire length and are ideal for analyzing stationary, or non-time-varying, systems. However, when considering nonstationary signals, such as those produced by reciprocating compressors, it is often valuable to know how the frequency spectrum of a signal varies with respect to time. Time-frequency analysis techniques such as the Short-Time Fourier Transform (STFT), the Wigner-Ville Distribution, and the Continuous Wavelet Transform have been developed for these types of signals. For this work the STFT is applied because of its computational ease, well-established acceptance, and overall success in the proposed method.
Simply stated, an STFT is performed by dividing a signal into short time segments and applying the Fourier transform to each segment. The resulting spectrum segments are combined, with the third dimension showing amplitude, to illustrate how the signal's spectrum varies with each time window. In general, the magnitude scale (linear vs. log), window size, window shape, and overlap chosen all affect the visual properties of the STFT and as such are "tunable" knobs available to the condition monitoring application.
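For reference, one common discrete form of the STFT is written out below. The notation is assumed here for illustration and is not taken from the paper: x[n] is the sampled signal, w[n] is a window of length L, H is the hop size in samples between successive segments, N is the number of FFT points, m is the segment index, and k is the frequency bin index.

```latex
X[m, k] = \sum_{n=0}^{L-1} x[n + mH]\, w[n]\, e^{-j 2\pi k n / N},
\qquad k = 0, 1, \ldots, N-1
```

Under this notation the fractional window overlap is 1 - H/L, and the time-frequency diagram displays the magnitude |X[m, k]| on either a linear or a logarithmic scale.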

FAULT SEEDING & METHODOLOGY
Compressor valves experience several different fault conditions, such as spring fatigue and leakage, but the one chosen for this study is valve seat wear because of its common occurrence in field data. The valve seat, as shown in Fig. 2 (top), can experience a gradual loss of thickness due to poppet impact and torsional rubbing from the helical spring. This slowly increases the poppet's travel distance during opening, thereby increasing gas flow and valve impact force. To avoid  , is 0" (healthy), -1/32" (degraded 1), and -1/16" (degraded 2) removed from the poppet. There is an assumption that all sixteen poppets degrade gradually and uniformly within the valve assembly. The fault condition is only introduced into the head-side discharge valve, while the crank-side discharge valve and both suction valves remain in their original, healthy state. To maintain consistency during data collection, the compressor is operated only at full load with constant discharge tank pressure.

Signal Processing
An overview of the proposed compressor valve condition monitoring methodology is shown in Fig. 4. Signal processing after data collection involves decomposing the raw vibration signals into individual compression cycles using the crank shaft position measurement and then performing a time-frequency analysis. A 15-second data file nominally contains 100 compression cycles. Figure 5 shows one cycle of vibration data along with crank-side and head-side cylinder pressures collected during compressor operation.
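The cycle decomposition step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the encoder provides a once-per-revolution index pulse sampled alongside the vibration channel, and the names `split_into_cycles`, `crank_angle`, and `index_pulse` are hypothetical.

```python
import numpy as np

def split_into_cycles(vibration, index_pulse, threshold=0.5):
    """Split a vibration record into one segment per crankshaft revolution.

    index_pulse is ASSUMED to be a once-per-revolution encoder channel
    sampled at the same rate as the vibration signal.
    """
    vibration = np.asarray(vibration, dtype=float)
    high = np.asarray(index_pulse) > threshold
    # Rising edges: the sample is high and the previous sample was low.
    starts = np.flatnonzero(high[1:] & ~high[:-1]) + 1
    # Each cycle runs from one rising edge to the next; the partial cycles
    # before the first edge and after the last edge are discarded.
    return [vibration[a:b] for a, b in zip(starts[:-1], starts[1:])]

def crank_angle(n_samples):
    """Crank angle in degrees for each sample of one cycle,
    assuming constant shaft speed within the cycle."""
    return 360.0 * np.arange(n_samples) / n_samples

# Synthetic check: 360 RPM is 6 rev/s, so about 4266 samples per
# revolution at a 25.6 kHz sample rate.
fs = 25_600
rev_samples = fs // 6
n = 5 * rev_samples + 100
pulse = np.zeros(n)
pulse[50::rev_samples] = 1.0          # one-sample index pulse per revolution
signal = np.random.default_rng(0).standard_normal(n)
cycles = split_into_cycles(signal, pulse)
```

Each returned segment can then be paired with `crank_angle(len(segment))` so that later processing is indexed by shaft position rather than time.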
As expected, the highest intensity vibration occurs during the opening and closing of the crank-side and head-side discharge valves as each cylinder reaches discharge pressure. Less intense vibration, in between discharge valve activity, is related to the opening and closing of the inlet suction valves. An STFT is generated for each cycle, resulting in multiple experimental observations for each fault case. It is important to point out that Fig. 5 shows all three health conditions. From a cylinder pressure standpoint the effect of the fault is negligible, and raw vibration alone is equally inconclusive, illustrating the need for a more comprehensive analysis. As seen in Fig. 5, the vibration signal for a complete cycle is non-stationary and appears to be a function of shaft position. Time-frequency diagrams, created via the short-time Fourier transform, are used to investigate the frequency content as a function of shaft position. For this work the STFT of the vibration measurement is computed for each decomposed cycle using a 51-sample window length, 75% window overlap, and a half-sine window shape with no averaging applied. These parameters are selected in an ad hoc manner; they provide reasonable angular resolution at the cost of some frequency smearing and represent further design options for future work.
Two STFTs, for the healthy and 1/16" degraded cases, are shown in Fig. 6. Both diagrams show distinct frequency activity occurring from 120° to 220° and from 320° to 20°. These shaft positions coincide with discharge valve activity from the head-side and crank-side cylinders, respectively. It now becomes more apparent that there is a visual difference between the two health states.
The STFTs show a wide range of frequency content in the vibration signal during the time of valve operation. However, the goal is to target frequency regions that show cross-case variation with within-case consistency. By observing areas within the discharge valve operation window, and through a simplified modal analysis performed with the compressor off, a region-of-interest (ROI) is selected.
Poppet impact with the valve seat is a form of impulse input, so compressor structural natural frequencies can be expected to be present in the measurement. Moreover, the PCB 356A16 accelerometer used in this work has a published frequency range (5%) extending to 5 kHz. Thus the frequency range chosen is 2.5 kHz to 4.5 kHz, between shaft positions of 125° and 185°, as shown in the boxed regions in Fig. 6. The ROI is then extracted from every STFT observation from all cases, reducing the image size drastically while still focusing on the key valve opening event.
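A per-cycle STFT with the stated parameters (51-sample window, 75% overlap, half-sine window, no averaging) might be sketched as below. The exact half-sine window definition, the rounding of the hop size, and the optional zero-padding parameter `n_fft` are assumptions made for illustration; the function name `stft_magnitude` is hypothetical.

```python
import numpy as np

def stft_magnitude(x, fs, win_len=51, overlap=0.75, n_fft=None):
    """Magnitude STFT of one compression cycle.

    Returns (S, freqs, centers): S has shape (n_freqs, n_frames),
    freqs is in Hz, centers holds the centre sample of each frame.
    """
    x = np.asarray(x, dtype=float)
    n_fft = n_fft or win_len                      # no zero-padding by default (assumption)
    hop = max(1, int(round(win_len * (1.0 - overlap))))
    # Half-sine window: one half-period of a sine over the window length
    # (one common definition, assumed here).
    window = np.sin(np.pi * (np.arange(win_len) + 0.5) / win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] for i in range(n_frames)])
    spectrum = np.fft.rfft(frames * window, n=n_fft, axis=1)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    centers = np.arange(n_frames) * hop + win_len / 2.0
    return np.abs(spectrum).T, freqs, centers

# Usage on a synthetic 3.8 kHz tone lasting one revolution at 25.6 kHz.
fs = 25_600
t = np.arange(4266) / fs
S, freqs, centers = stft_magnitude(np.sin(2 * np.pi * 3800 * t), fs, n_fft=256)
angles = 360.0 * centers / len(t)                 # frame centres as crank angle, degrees
```

With a 51-sample window at 25.6 kHz, each frame spans roughly 2 ms, which is why the angular resolution is fine while neighbouring frequencies smear together.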
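Extracting the ROI amounts to masking the STFT by frequency and crank angle. The sketch below assumes the STFT is stored as a 2-D array with frequency along rows and crank angle along columns; the function name `extract_roi` and the axis vectors are hypothetical.

```python
import numpy as np

def extract_roi(S, freqs, angles, f_range=(2500.0, 4500.0), a_range=(125.0, 185.0)):
    """Return the sub-array of S inside the frequency (Hz) and crank-angle
    (degree) limits. S is assumed to be indexed as S[frequency, angle]."""
    f_mask = (freqs >= f_range[0]) & (freqs <= f_range[1])
    a_mask = (angles >= a_range[0]) & (angles <= a_range[1])
    return S[np.ix_(f_mask, a_mask)]

# Toy example: 100 Hz frequency spacing and 1 degree angular spacing.
freqs = np.arange(0, 12801, 100, dtype=float)
angles = np.arange(361, dtype=float)
S = np.add.outer(freqs, angles)       # placeholder values, not real data
roi = extract_roi(S, freqs, angles)
```

The ROI size in pixels depends on the STFT's frequency and angular spacing, so the toy grid here will not reproduce the image dimensions reported in the paper.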

Feature Extraction
These ROIs are then used to perform the two classification methods applied in this work. First, a traditional statistical approach, linear and quadratic discriminant classification (LDC/QDC), is applied and used as a baseline. This method requires a feature extraction step to process the ROIs into a feature vector for each observation. Second, a deep learning approach is implemented by directly using the ROIs to train a convolutional neural network (CNN). The benefit of the CNN method is that the feature extraction step, which can require significant effort and experience, can be omitted at the expense of computational overhead and larger training data sets.
For the statistical (LDC/QDC) classifier approach it is necessary to summarize the ROI with a series of key metrics commonly called features. Feature extraction is a data reduction technique in which a subset of properties, or features, is used to represent a larger data object, such as an image. These features are compiled into a feature vector, x, which contains k calculated features x_i, where i = 1, 2, ..., k, for a given observation as shown in Eqn. (1).

$$\mathbf{x} = [x_1, x_2, \ldots, x_k] \tag{1}$$

[Figure panels: (a) Healthy condition: 0" valve seat wear; (b) Degraded 2 condition: -1/16" valve seat]

In this work, the ROI from each STFT is represented as a gray-scale image that captures image texture and a binary image that captures shape. The right side of Fig. 4 illustrates the result. The texture of an image can be described as smooth, rough, bumpy, etc. Analysis of the spatial relationships and intensity gradients allows for quantification of such textural descriptions. The texture features extracted are divided into two groups, first-order statistics and second-order statistics. The gray-scale representation of the ROI has a range in intensity from 0 (representing black) to 1 (representing white) at discrete levels. Each ROI is treated as an M × N image, in this case 51 × 19, with each element value representing a pixel intensity value I(m, n).
The amplitude spectrum within the ROI varies from approximately 0 to 60 (for the linear scale). Various values of I_max and I_min are tested and evaluated based on visual distinction of the region and how well object boundaries are identified.
Mapping values of 8 and -8 are chosen, which appear to maximize gradients and object delineation within the region of 3.8 kHz, the nominal peak amplitude frequency. These values are maintained for every ROI to ensure visual consistency between cases and faults. I(m, n) is then normalized to a value range of 0 to 1 and binned into N_gray = 265 discrete intensity values. These parameters are yet another set of options for tuning the proposed approach to the application.
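One plausible reading of the intensity mapping is a clip to [I_min, I_max], followed by normalization to [0, 1] and quantization into discrete gray levels. The paper's exact mapping is not reproduced here, so the sketch below is an assumption, and the function name `to_gray_levels` is hypothetical.

```python
import numpy as np

def to_gray_levels(roi, i_min, i_max, n_gray):
    """Clip ROI amplitudes to [i_min, i_max], normalize to [0, 1],
    and bin into n_gray integer levels (0 .. n_gray - 1).

    ASSUMPTION: a linear clip-and-scale mapping; the source does not
    spell out the mapping equation.
    """
    clipped = np.clip(np.asarray(roi, dtype=float), i_min, i_max)
    norm = (clipped - i_min) / (i_max - i_min)
    # The top edge (norm == 1) would land in bin n_gray, so cap it.
    levels = np.minimum((norm * n_gray).astype(int), n_gray - 1)
    return norm, levels

# Small worked example with 4 gray levels.
norm, levels = to_gray_levels([[-10.0, -8.0, 0.0, 8.0, 20.0]], i_min=-8.0, i_max=8.0, n_gray=4)
```

Holding `i_min` and `i_max` fixed across all ROIs, as the text describes, keeps the gray scale comparable between fault cases.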
Next, image statistics are calculated that result in 13 metrics representing each observation. First order statistics provide information about the overall gray level distribution of the image as a whole (Theodoridis & Koutroumbas, 2008). These include the following five metrics: mean, standard deviation, skewness, and kurtosis which are the average, dispersion, asymmetry, and peakedness about mean intensity, respectively; and entropy which is the measure of histogram uniformity.
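The five first-order statistics can be computed directly from the normalized gray-scale ROI. The sketch below uses population moments, a non-excess kurtosis, and a base-2 histogram entropy; these normalization conventions are assumptions, since the paper does not state them, and the function name is hypothetical.

```python
import numpy as np

def first_order_stats(gray, n_gray):
    """Mean, standard deviation, skewness, kurtosis, and histogram entropy
    of a gray-scale image with intensities in [0, 1]."""
    g = np.asarray(gray, dtype=float).ravel()
    mean = g.mean()
    std = g.std()                                   # population standard deviation
    centered = g - mean
    skew = np.mean(centered ** 3) / std ** 3 if std > 0 else 0.0
    kurt = np.mean(centered ** 4) / std ** 4 if std > 0 else 0.0
    hist, _ = np.histogram(g, bins=n_gray, range=(0.0, 1.0))
    p = hist[hist > 0] / g.size                     # empty bins contribute nothing
    entropy = -np.sum(p * np.log2(p))
    return {"mean": mean, "std": std, "skewness": skew,
            "kurtosis": kurt, "entropy": entropy}

# Half-black, half-white test image.
stats = first_order_stats(np.array([[0.0, 0.0], [1.0, 1.0]]), n_gray=8)
```

For the two-level test image the histogram has two equally likely bins, so the entropy is exactly one bit and the skewness is zero.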
Second-order statistics provide additional information about the relative location of gray levels. To extract these features, a gray-level co-occurrence matrix (GLCM) is created from the gray-scale intensity image (Haralick, 1973). The GLCM contains information which characterizes the texture of an image, and the features extracted help describe the spatial relationship, transitional intensity, and general complexity of gray levels within the image. The resulting matrix can also be viewed as a second-order histogram in which gray levels are considered in pairs with a specified spatial relationship, unlike a first-order histogram where only single gray levels are considered. The second-order statistics are calculated for the resulting GLCM matrices, and the mean and range of each statistic across all matrices are used as features. This results in eight additional features per ROI.
To complement the gray-scale image, shape features are determined by creating a binary image of the ROI. Each pixel is set to 0 (black) or 1 (white) based on a chosen intensity threshold value. For this work, the threshold is selected for each image individually using the method described by Otsu (Otsu, 1979), which minimizes the intra-class variance of the black and white pixels (equivalently, it maximizes the inter-class variance).
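A GLCM and the mean-and-range feature construction can be sketched as follows. The paper does not name the four underlying statistics or the pixel offsets, so this example assumes contrast, correlation, energy, and homogeneity over four nearest-neighbour directions; all function names are hypothetical.

```python
import numpy as np

def glcm(levels, n_gray, offset):
    """Normalized, symmetric gray-level co-occurrence matrix for one
    (row, col) pixel offset. levels holds integers in 0 .. n_gray - 1."""
    levels = np.asarray(levels).astype(int)
    dr, dc = offset
    rows, cols = levels.shape
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    a = levels[r0:r1, c0:c1]                       # reference pixels
    b = levels[r0 + dr:r1 + dr, c0 + dc:c1 + dc]   # neighbours at the offset
    P = np.zeros((n_gray, n_gray))
    np.add.at(P, (a.ravel(), b.ravel()), 1.0)      # count each pixel pair
    P = P + P.T                                    # count pairs in both directions
    return P / P.sum()

def texture_features(levels, n_gray, offsets=((0, 1), (-1, 1), (-1, 0), (-1, -1))):
    """Mean and range, across offsets, of four GLCM statistics (8 features).
    ASSUMPTION: the choice of statistics and offsets is illustrative."""
    i, j = np.indices((n_gray, n_gray))
    rows = []
    for off in offsets:
        P = glcm(levels, n_gray, off)
        mu_i, mu_j = np.sum(i * P), np.sum(j * P)
        sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * P))
        sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * P))
        contrast = np.sum((i - j) ** 2 * P)
        denom = sd_i * sd_j
        correlation = np.sum((i - mu_i) * (j - mu_j) * P) / denom if denom > 0 else 0.0
        energy = np.sum(P ** 2)
        homogeneity = np.sum(P / (1.0 + np.abs(i - j)))
        rows.append([contrast, correlation, energy, homogeneity])
    rows = np.array(rows)
    return np.concatenate([rows.mean(axis=0), rows.max(axis=0) - rows.min(axis=0)])

# 4 x 4 test image with four gray levels.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])
P_h = glcm(image, 4, (0, 1))          # horizontal neighbours
features = texture_features(image, 4)
```

Averaging over directions makes the features less sensitive to texture orientation, while the range retains some information about how directional the texture is.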
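Otsu's threshold selection can be sketched with a histogram search. This version maximizes the between-class variance, which is equivalent to minimizing the within-class variance; the bin count and the function name `otsu_threshold` are assumptions for illustration.

```python
import numpy as np

def otsu_threshold(gray, n_bins=256):
    """Otsu threshold for an image with intensities in [0, 1]."""
    g = np.asarray(gray, dtype=float).ravel()
    hist, edges = np.histogram(g, bins=n_bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                  # weight of the dark class for each candidate cut
    w1 = 1.0 - w0                      # weight of the bright class
    cum_mean = np.cumsum(p * centers)
    total_mean = cum_mean[-1]
    # Between-class variance for every cut; cuts that leave one class
    # empty divide by zero and are set to 0 so they are never selected.
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    k = int(np.argmax(between))
    return edges[k + 1]                # upper edge of the last dark bin

# Two-level test image: the threshold must fall between the two intensities.
img = np.array([[0.2, 0.2, 0.8],
                [0.2, 0.8, 0.8]])
t = otsu_threshold(img)
binary = img > t
```

Because the threshold is recomputed per image, the binary shape reflects the relative energy distribution within each ROI rather than its absolute amplitude.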
The result is an image with a black background and white "blob" like regions, or objects. All objects are treated as a single discontinuous region for the purpose of extracting region shape properties. Any smaller unwanted objects below a certain pixel area that do not make up the bulk of the main region are treated as artifacts and removed. The pixel area threshold (in this case 40) is chosen in an ad hoc manner that met with acceptable results but remains a tunable parameter. An example of the resulting binary representation is also shown in Fig. 4.
From each observation, the following 18 shape features are extracted: Area, Centroid (x and y), Bounding Box (four corners), Major Axis Length, Minor Axis Length, Eccentricity, Orientation, Convex Area, Filled Area, Extrema, Equivalent Diameter, Solidity, Extent, and Perimeter. For this work, two shape features are then omitted, Bounding Box (y-width) and Extrema, because they were statistically insignificant between the three degradation classes.
The final result of the feature extraction step of the LDC/QDC approach is a set of 29 individual features (13 texture and 16 shape) that are then used to train and test the classifier.
One of the key benefits that CNNs provide over traditional statistical machine learning approaches is the lack of necessity of the feature extraction step. Since ROIs are essentially treated as images their application to fundamental CNNs is seamless. With an appropriately designed CNN the feature extraction step is by design interwoven into the training of the network. So once the ROI is extracted from the STFT, at the desired crank angle range, and a gray-scale representation calculated, the approach moves directly to the training of the classifier.

Classification
The two classification methods used for this work are supervised classifiers that require a set of "training" data with known class membership to predict the most likely class for unknown observations. First, two types of Bayes classifiers are investigated, a linear discriminant classifier (LDC) and a quadratic discriminant classifier (QDC). These assume that the training data are normally distributed within each class, and each observation is assigned to one of the three wear cases.
The second method is a traditional convolutional neural network that in its basic form has eight layers. These include an image input layer of size 19x51x1 to accept the gray-scale ROI, a convolution layer of size 10 (or 20), a batch normalization layer, a rectified linear unit (ReLU) layer, a 2x2 max pooling layer, a fully connected layer of size 2 or 3 for the predicted health classes, and a classification output layer. Most of these layers are typical default selections in a CNN. The fully connected layer has size two or three for the 2-class (0", 1/16") or 3-class (0", 1/32", 1/16") predictions, respectively. The primary variable selected is the size of the convolution layer; ten is chosen because of its prediction accuracy and speed of training. Because the network is relatively small, the CNN is trained on a single CPU of a 2020 laptop in under 30 seconds for all cases. Given the high degree of success, this minimal training time should be considered a notable benefit of the deep learning approach.
Finally, Tab. 1 outlines the distribution of all compressor cycle observations used in this work. Performance of the classifiers is evaluated by comparing the predicted classes of both the training set and a test set to their known classes. The overall classification accuracy on the test data sets is used to assess how well the proposed methodology, and associated time-frequency technique, produces unique fault signatures for the valve seat wear cases tested. The results presented here use 80% of the observations for training; a 70% training split was also studied, with similar results.
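Both discriminant classifiers can be sketched as Gaussian Bayes classifiers that differ only in the covariance model: the LDC pools one covariance matrix across classes, giving linear boundaries, while the QDC keeps one per class, giving quadratic boundaries. The class below is a minimal illustration with hypothetical names and a small regularization term added for numerical stability; it is not the authors' implementation.

```python
import numpy as np

class GaussianDiscriminant:
    """Gaussian Bayes classifier: pooled=True gives an LDC, pooled=False a QDC."""

    def __init__(self, pooled=True, reg=1e-6):
        self.pooled = pooled
        self.reg = reg                 # ridge term on the covariance (assumption)

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        n, k = X.shape
        self.classes_ = np.unique(y)
        self.means_, self.priors_, covs, counts = [], [], [], []
        for c in self.classes_:
            Xc = X[y == c]
            counts.append(len(Xc))
            self.means_.append(Xc.mean(axis=0))
            self.priors_.append(len(Xc) / n)
            covs.append(np.cov(Xc, rowvar=False) + self.reg * np.eye(k))
        if self.pooled:
            # Weighted average of the class covariances.
            pooled = sum((m - 1) * S for m, S in zip(counts, covs)) / (n - len(counts))
            covs = [pooled] * len(counts)
        self.covs_ = covs
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = []
        for mu, S, prior in zip(self.means_, self.covs_, self.priors_):
            d = X - mu
            maha = np.einsum("ij,ij->i", d @ np.linalg.inv(S), d)
            _, logdet = np.linalg.slogdet(S)
            # Log posterior up to a constant shared by all classes.
            scores.append(-0.5 * maha - 0.5 * logdet + np.log(prior))
        return self.classes_[np.argmax(np.stack(scores, axis=1), axis=1)]

# Synthetic three-class example standing in for the three wear cases.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
X = np.vstack([c + rng.standard_normal((50, 2)) for c in centers])
y = np.repeat([0, 1, 2], 50)
ldc_pred = GaussianDiscriminant(pooled=True).fit(X, y).predict(X)
qdc_pred = GaussianDiscriminant(pooled=False).fit(X, y).predict(X)
```

With 29 features, the QDC must estimate a full covariance matrix per class, so it needs more training observations per class than the LDC to remain well conditioned.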
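To make the layer stack concrete, the sketch below runs a single forward pass through convolution, batch normalization, ReLU, 2x2 max pooling, a fully connected layer, and softmax using plain NumPy. Several details are assumptions not given in the text: the 3x3 filter size, "valid" padding, a pooling stride of 2, the explicit softmax, and the randomly initialized (untrained) weights. All names are hypothetical.

```python
import numpy as np

def conv2d_valid(img, kernels, bias):
    """Cross-correlate a (H, W) image with (F, kh, kw) kernels, no padding."""
    kh, kw = kernels.shape[1:]
    windows = np.lib.stride_tricks.sliding_window_view(img, (kh, kw))
    return np.einsum("hwij,fij->fhw", windows, kernels) + bias[:, None, None]

def batch_norm(x, mean, var, gamma, beta, eps=1e-5):
    """Inference-mode batch normalization with per-channel statistics."""
    scale = gamma / np.sqrt(var + eps)
    return scale[:, None, None] * (x - mean[:, None, None]) + beta[:, None, None]

def max_pool_2x2(x):
    """2x2 max pooling with stride 2; odd trailing rows/columns are dropped."""
    f, h, w = x.shape
    h2, w2 = h // 2, w // 2
    return x[:, :h2 * 2, :w2 * 2].reshape(f, h2, 2, w2, 2).max(axis=(2, 4))

def softmax(z):
    e = np.exp(z - z.max())            # subtract the max for numerical stability
    return e / e.sum()

def init_params(n_filters=10, n_classes=3, in_shape=(19, 51), ksize=3, seed=0):
    """Random, UNTRAINED parameters sized for the given input shape."""
    rng = np.random.default_rng(seed)
    h = (in_shape[0] - ksize + 1) // 2
    w = (in_shape[1] - ksize + 1) // 2
    return {
        "kernels": 0.1 * rng.standard_normal((n_filters, ksize, ksize)),
        "conv_bias": np.zeros(n_filters),
        "bn_mean": np.zeros(n_filters), "bn_var": np.ones(n_filters),
        "bn_gamma": np.ones(n_filters), "bn_beta": np.zeros(n_filters),
        "fc_w": 0.01 * rng.standard_normal((n_classes, n_filters * h * w)),
        "fc_b": np.zeros(n_classes),
    }

def forward(roi, p):
    x = conv2d_valid(roi, p["kernels"], p["conv_bias"])       # (10, 17, 49)
    x = batch_norm(x, p["bn_mean"], p["bn_var"], p["bn_gamma"], p["bn_beta"])
    x = np.maximum(x, 0.0)                                    # ReLU
    x = max_pool_2x2(x)                                       # (10, 8, 24)
    return softmax(p["fc_w"] @ x.ravel() + p["fc_b"])         # class probabilities

params = init_params()
roi = np.random.default_rng(1).random((19, 51))               # stand-in gray-scale ROI
probs = forward(roi, params)
```

Under these assumptions the network has only a few thousand weights, most of them in the fully connected layer, which is consistent with the short CPU training time reported.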
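The evaluation procedure can be sketched as a train/test split followed by a confusion matrix and overall accuracy. The per-class (stratified) split is an assumption, since the paper does not say how the 80% was drawn, and all function names are hypothetical.

```python
import numpy as np

def stratified_split(y, train_frac=0.8, seed=0):
    """Return (train_idx, test_idx), sampling train_frac of each class."""
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        n_train = int(round(train_frac * len(idx)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are the known classes, columns the predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def accuracy(cm):
    """Fraction of observations on the diagonal of the confusion matrix."""
    return np.trace(cm) / cm.sum()

# Toy labels: 100 observations for each of three classes.
y = np.repeat([0, 1, 2], 100)
train_idx, test_idx = stratified_split(y, train_frac=0.8)
cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2], n_classes=3)
```

Off-diagonal entries of the confusion matrix show which wear levels are confused with one another, which the overall accuracy alone does not reveal.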

RESULTS
The two described methodologies are applied to the vibration data collected from the ESH-1 compressor for the valve seat wear fault condition for two scenarios: two classes (0", 1/16") and three classes (0", 1/32", 1/16"). Raw vibration data is transformed from the time domain to the angular position domain, and an STFT is generated for each compression cycle. ROIs in gray-scale image format are extracted from each STFT between 125° and 185° crank angle across a frequency range of 2,500-4,500 Hz, with a size of 51x19 pixels. This is where the two classification methods diverge, with the statistical LDC/QDC classifier determining 29 image-based features and the CNN using the pixel intensities directly. Detailed results for all scenarios are given as follows, first with the overall classification accuracy between the training and test sets. Table 2 shows that classification accuracy is very good, exceeding 80% for most cases. The results show that adding the shape features from the binary ROI representation improves accuracy significantly, with the two-class case nearly perfect and the three-class case in the low 90% range. The 13-feature (texture only) cases do well, exceeding 70%, without the overhead of computing shape features. However, the accuracy improvement in the 29-feature case provides strong motivation to add them, even though doing so introduces additional degrees of freedom in threshold and outlier determination. From the confusion matrices in Tab. 3 and Tab. 4, the two-class case shows very little misclassification, while the three-class case shows somewhat more uncertainty in the classification of the slightly worn (1/32") valve seat.
Next, the performance of the deep learning approach is presented, again with both the two- and three-class scenarios. A minor ad hoc investigation of the number of training epochs and nodes in the convolutional layer is shown in Tab. 5. In all cases, the CNN shows excellent accuracy on the same test data sets used for the LDC/QDC approach above, with the most challenging three-class case reaching 93.3% with 20 nodes. Table 6 shows the corresponding confusion matrices for both cases. With the high level of accuracy of the CNN, there is little misclassification.

CONCLUSION
The aim of this work is to develop and compare two vibration-based condition monitoring methods for early detection of valve seat wear in reciprocating compressors. Vibration data is collected at the head-side discharge valve manifold and processed using time-frequency analysis. Regions of interest are then extracted based on known piston position during discharge valve open events. At this point two methods of health classification are presented: a Bayesian method using manually extracted features based on gray-scale and binary image statistics, and a deep learning method based on training a relatively small convolutional neural network. In both cases the methods frequently achieve greater than 90% health classification success. It is important to point out that the primary sensor is located near where the fault is seeded, namely the head-side discharge valve. There is a wealth of other conditions common to the compression industry that require investigation, such as mixed faults (not all poppets worn), different fault modes (spring fatigue or poppet leakage), and fault isolation (inlet valves, crank-side and head-side, crank-side discharge), to name a few. However, the performance achieved is particularly encouraging when considering the wear gradients investigated. The processes developed produced promising results, with significant room for optimization of the methods presented, specifically the "tunable" parameters pointed out throughout this paper.