Predictive Diagnosis in Axial Piston Pumps: A Study for High Frequency Condition Indicators Under Variable Operating Conditions

Increasing reliability, availability and safety requirements, together with a growing number of data acquisition systems, have enabled condition-based maintenance in mobile and industrial machinery. In this paper, we present a methodology to develop a robust diagnostic approach. This includes the consideration of variable operating conditions in the data acquisition process as well as a versatile, non-domain-specific feature extraction technique. On this basis, we train anomaly detection models for different fault types and different fault intensities in variable displacement axial piston pumps. Our specific interest is the investigation of high-frequency condition indicators acquired with a sampling rate of 1 MHz. Furthermore, we compare those to industry-standard sensors, sampled at up to 20 kHz. By considering variable operating conditions, we are able to quantify the influence of the operating point. The results show that high-frequency features are a suitable condition indicator across several operating points and can be used to detect faults more easily. Although the study was set up on a test-bench, the experimental design allows conclusions to be drawn about realistic field operating conditions.


PROBLEM SETTING AND MOTIVATION
During the last decades, preventive and reactive maintenance strategies prevailed in industrial use cases due to their ease of use and simple planning. However, the former leads to discarding components that could have been operated longer, while the latter can lead to long, unplanned downtime. Therefore, there is a strong desire for cost reduction and, at the same time, for more up-to-date information from the systems, both of which can be enabled by condition-based maintenance. As a result, research and the proposed methods in the Prognostics and Health Management (PHM) domain have grown rapidly. PHM is a holistic approach that incorporates all data acquisition, fault detection, diagnosis and prognosis techniques, including health management strategies, e.g., the ordering of spare parts before a failure occurs or the change of mission profiles (Kim, An, & Choi, 2017). In contrast to failure-distribution-based condition indication (Wöhler curves, Weibull analysis, Proportional Hazard models etc.), PHM assesses the present and future health state of a single unit under test by measuring quantities that are directly linked to a fault type.
Oliver Gnepper et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
In industrial as well as in mobile machinery use cases, hydraulic drives are a major power source due to their high power density, precise control movements and stepless ratio change under variable operating conditions. In the agricultural domain an unplanned downtime of harvesting machines during beet harvesting season can cause significant financial losses (Brinkschulte & Geimer, 2017). In the mining sector, where the value-chains depend on a few excavators and conveyor systems, downtimes can even cost up to several tens of thousands of dollars per hour (Michaud, 2014). In the aerospace domain, a failure in the hydraulics circuit can even be safety-critical if the wing actuators can no longer be controlled or the landing gear cannot be operated (Gomes, Leão, Vianna, Galvão, & Yoneyama, 2012). This underlines the need to monitor the state-of-health of hydraulic assets in order to plan appropriate measures in a timely manner and to assess whether a system can still be operated despite being in a degraded state.
Since axial piston units (APU) represent the central power source in such drive systems and provide energy for the driving and working hydraulics, a special focus of investigation is on detecting their faults at an early stage. When in good condition, the function of the APU is to transform mechanical energy (characterized by shaft speed and torque) into a volumetric flow and a system pressure. The system pressure results from the hydraulic resistance in the circuit. In the mobile machinery domain, the power is usually provided by a diesel engine, while industrial applications are powered by electric drives. Explanations of the function, details and mathematical modelling of axial piston pumps can be found in (Ivantysyn & Ivantysynova, 2003) and (Mandal, Saha, & Sanyal, 2008).
However, there are some specific challenges regarding the health assessment of a pump. First, because the rotating parts are enclosed by a housing, it is practically impossible to measure the fault directly. Therefore, indicators and signal features must be built, at an economically reasonable cost, from measurable physical quantities that have a functional mapping to the extent of the fault. Second, several influencing factors make it challenging to establish an efficient and robust methodology based on experimental data. On the one hand, different applications (agricultural use, industrial use, use in construction machines) as well as different pump characteristics (size, controllers, materials used, surface treatments etc.) within one application make it hard to develop a one-size-fits-all approach. On the other hand, fault simulation is helpful only in limited situations (Bayer & Enge-Rosenblatt, 2011b) because of the complex multi-body and multi-domain mechanisms, including damping, friction, tribological and lubricating effects. Taken together, these varying influence factors and uncertainties make it difficult for a pump manufacturer to transfer fault detection algorithms between pumps.
Even for a single pump, various problems increase the difficulty of monitoring and assessment. Various subsystems and components can be affected by a fault, and different data patterns can appear depending on the fault mechanism. For example, undesired tilting, abrasive wear or plastic deformation can occur at the slipper pad (sliding shoe), which is located at the piston head. A detection algorithm, i.e., classification of the fault and quantification of its extent, can only work correctly if it was previously exposed to this specific pattern in the training data. In practice, though, mostly data for a healthy pump is available, which puts a special focus on models that learn from one class only. At the same time, the large variety of fault mechanisms makes it clear that a well-generalizing algorithm is needed, one that avoids overfitting to a specific failure mechanism.
Furthermore, the operating conditions (characterized by the shaft speed, the system pressure and the swivel angle) influence the operating behavior of the APU and mask the fault-specific signatures in the signal characteristics. In axial piston units, the swivel angle is the angular displacement of the swash plate and thus determines the flow rate of the pump. Figure 1 depicts the central part of a hydraulic axial piston unit. Here, the oil is sucked into the cylinder at the intake port. Through the rotation of the cylinder and the tilted swash plate, the pistons perform a translational stroke and hence deliver oil through the discharge port. Therefore, our goal is to design a condition indicator that is sensitive to a specific fault but insensitive to operating point changes. To address this issue, in a first step we investigate how different fault types become noticeable in vibrational frequency bands up to 500 kHz. This is done for different operating conditions of the APU, for three different sensor positions and for two different fault types: loose slipper and cavitation erosion in the port plate. In addition to the fault types examined, others (e.g. abrasion) can occur in specific applications. Since these often have a strong interdependence with oil quality, viscosity and temperature as well as oil contamination, such fault types are not part of our study. In Section 2, we present a literature review that reflects the current state of research. Section 3 describes the methodology and experimental setup we used to acquire our data. Section 4 describes the data-driven analysis, leading to the presentation of the results in Section 5. Sections 6 and 7 put the results into context and derive further steps for investigation.

LITERATURE REVIEW
As APUs are a major component in drive systems, several authors have dealt with fault detection and diagnosis in hydraulic pumps, most of them using a methodology based on vibration signals for the assessment. Physics-based approaches exist (Hast, Findeisen, & Streif, 2015), (Bayer & Enge-Rosenblatt, 2011b) and (Bayer & Enge-Rosenblatt, 2011a), but data-driven approaches are far more common due to the high complexity of the systems and non-linear effects caused by viscous friction and contacts (Lan et al., 2018). During our research, we investigated 39 papers and articles in the research area of fault detection and diagnosis in hydraulic pumps, of which 72 % use vibration signals, 33 % use pressure signals and 20 % state that the volumetric flow rate is a quantity with informative content.
Among them are (Casoli, Pastori, & Scolari, 2019), who present a diagnostic approach based on time and frequency features in combination with neural networks and support vector machines (SVM) for the fault types abrasion on valve plate, cavitation erosion, slipper wear and cylinder wear. Further, a multi-layered diagnostic approach is presented by (Du, Wang, & Zhang, 2013). Together with the vibration power spectrum, they use pressure spectrum analysis and leakage flow rate to develop a rule-based method. The fault types distinguished are slipper clearance, swash plate eccentricity, roller bearing wear, insufficient inlet pressure and abrasion of valve plate. It is important to mention that they use conventional signal analysis and statistical methods only, without any integration of self-learning models. Part of their work is also an intensive fault mechanism analysis and a study on fault occurrence, intensity and detection probability in the aerospace domain for hydraulic pumps. Regarding the slipper clearance, they use 12 different levels of fault intensity. However, a sampling rate of 4 kHz is inadequate for measuring vibration signals in hydraulic pumps, as demonstrated by (Torikka, 2011) who found that the high-frequency content of the vibration signals holds critical fault information that is not captured by this low sampling rate.
Another work by (Ding, Ma, & Tian, 2015) uses multidimensional scaling for feature extraction and softmax regression for classifying faults in a plunger pump. By doing so, they can assign probabilities to the predicted classification labels for the faults. (Helwig, Klein, & Schütze, 2015) use a working cycle from an industry application to identify and quantify faults in gear pumps, among them degradation of a directional valve, internal pump leakage, gas leakage of a diaphragm accumulator and aeration of hydraulic fluid. Using linear discriminant analysis, they reduce the dimensionality and optimize the class separation. Another approach is presented by (Torikka, 2011), who uses time-frequency domain features extracted with a wavelet transformation. The classification results, produced by Naïve Bayes, support vector machine and neural networks for single operating points, yield classification rates between 80 % and almost 100 %. One of the latest methodologies is presented by (Maradey Lázaro & Borrás Pinilla, 2020), who present a comprehensive review of 14 publications regarding fault detection in axial piston pumps. They also develop a new methodology for detecting a decrease in volumetric efficiency based on vibration signal acquisition, filtering, wavelet feature extraction, feature selection and artificial neural network training. They study how the classification performance is affected by different wavelet families and the choice of the classifier. As classifiers, they train four different neural networks: an Adaline (linear transfer function), two nonlinear mapping networks (tansig and logsig) and one multi-layer perceptron (softmax). They also obtain test results with a very high accuracy between 83.3 % and 100 %. The sampling rate in (Maradey Lázaro & Borrás Pinilla, 2020) and (Torikka, 2011) is 50 kHz.
From the analyzed sources it can be concluded that machine learning approaches, which learn by extracting information and patterns from data, are advantageous for mapping indicator data in a higher-dimensional space to fault patterns (Biggio & Kastanis, 2020). The higher-dimensional data space is usually formed by spectral analysis features of the vibration signal. The main benefit is that even incipient faults can affect the vibration signal and hence enable early diagnosis, while the sensors are relatively cheap and usually easy to attach to the housing.
However, the mentioned sources suffer from one or more drawbacks. One major limitation that hinders the transfer to field applications is that the variability of operating conditions, mainly caused by different load profiles, and its effect on the condition indicators is not considered. As current state-of-the-art sensing is still done with sampling rates of only up to 50 kHz, the indicators in these frequency bands are masked by load profile artifacts, and order tracking or filtering out frequencies from other excitation sources (e.g. the diesel engine) often needs to be implemented. This can require a lot of computing power on sensors or processing devices. Therefore, it remains unclear whether the high-frequency signal components are a better condition indicator in terms of generalization ability among different fault types and independence of the loading characteristics. Another point is the use of supervised learning algorithms, which are well suited to test-bench scenarios where all the necessary data patterns can be acquired. As mentioned earlier, unsupervised or semi-supervised algorithms, which can be used for anomaly detection, are more field-oriented, as they benefit from the significantly higher proportion of healthy pump data.
Our goal is to address these key challenges, for which we propose a methodology with the following objectives:
1. Identify broadband condition indicators for axial piston pumps.
2. Show how feature behavior for the whole operating parameter space influences the condition indicators.
3. Investigate the suitability of high-frequency condition indicators in frequency range up to 500 kHz for two different fault types.
4. Investigate how the faults' detectability is influenced by different fault intensities.
5. Exemplarily show how the position of the sensor influences the results.
6. Compare different sensor types and different frequency ranges for their suitability to develop anomaly detection models out of them.

EXPERIMENTAL DESIGN
In the following section, the experimental setup and the data acquisition process are described in detail.

Axial piston pump and operating points
In this work we characterize the operating behavior of an APU (a Bosch Rexroth A10VSO28) as a combination of the shaft speed, the swivel angle and the pressure in the hydraulic circuit. A single combination of these three parameters we define as an operating point. The pressure-flow-controller and a load valve, representing the hydraulic resistance in the circuit, enable us to set any combination within the pump's operating specification.
While in industrial applications these operating points are usually fixed to a specific cycle, in the mobile machinery domain they vary with the load and the driver's behavior. Therefore, we use a stochastic test plan to sample the combinations of operating parameters in order to capture the functional relationship to the signal characteristics. The sampling technique is Latin Hypercube Sampling (Siebertz, van Bebber, & Hochkirchen, 2017), a statistical method to maximize the minimum distance between points within predefined intervals.
By doing so, we receive 96 operating points within the intervals over the three-dimensional operating space. The limits for sampling are listed in the following table (detailed operating points can be found in Table 4 in the appendix). On the test-bench, the order of the operating points is randomized to prevent our dataset from having order-induced effects. Each operating point is held for 15 s and 10 s are given for the transition to the next operating point. From a methodological point of view, this sampling technique can be extended by more dimensions (oil temperature, different hose lengths etc.) and is especially suitable when a lot of parameters influence the data acquisition process.
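The sampling step can be illustrated with a minimal sketch. It uses plain stratified Latin Hypercube Sampling without the additional maximin distance optimization mentioned above, and the interval limits are hypothetical placeholders, not the limits of the study. The function name `latin_hypercube` is likewise an assumption made for illustration.

```python
import random


def latin_hypercube(n_points, bounds, seed=0):
    """Draw n_points samples so that, in every dimension, each of the
    n_points equal-width strata is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One random position inside each stratum of the unit interval.
        strata = [(i + rng.random()) / n_points for i in range(n_points)]
        rng.shuffle(strata)  # decouple the stratum order between dimensions
        columns.append([low + u * (high - low) for u in strata])
    # Transpose the per-dimension columns into a list of operating points.
    return [tuple(col[i] for col in columns) for i in range(n_points)]


# Hypothetical limits (NOT the values used in the study):
bounds = [
    (1000.0, 2500.0),  # shaft speed in rpm
    (20.0, 100.0),     # swivel angle in percent of maximum
    (50.0, 250.0),     # system pressure in bar
]
points = latin_hypercube(96, bounds, seed=42)

# Randomize the run order on the test-bench to avoid order-induced effects.
run_order = list(range(len(points)))
random.Random(7).shuffle(run_order)
```

Adding a further dimension such as the oil temperature only requires one more entry in `bounds`, which reflects the extensibility noted above.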

Measurement Setup
For the data acquisition step we use a variable displacement axial piston pump as the unit under test. The equipment we use for logging the data is an imc CronosFlex standard data acquisition system and an imc EOS high-speed measurement device. All experiments are conducted on an acoustic test-bench, which provides vibration decoupling from other vibrating elements. The order in which the faults were tested was randomized, and the fixture was installed using a torque wrench to minimize effects on testing. In addition, we run the cycle three times in healthy condition in order to assess the repeatability of the test-bench measurement. The piezo-ceramic patch transducers for the broadband indicators (type Invent K025) are attached with a specific glue to the housing surface of the pump and labeled with a tag for better visibility. We mount one sensor in a radial position on the housing (position 1), close to the swash plate. One has its normal vector in an axial direction close to the port plate (position 2), and one is located at the edge of the port plate, further from the fault source (position 3, normal vector pointing in radial direction). The sampling rate here is 1 MHz. In the case of a piezo-ceramic patch transducer, the electrical signal is proportional to the applied deformation of the transducer (Physik Instrumente, 2012). The deformation of the surface is due to the force excitation in the axial piston pump and the structure that conducts structure-borne sound. This is mainly caused by the sudden change of pressure (compression) when piston chambers filled with fluid at low suction pressure enter the high-pressure kidney of the valve plate.
At the other positions depicted in Figure 6, piezoelectric uniaxial accelerometers (type Endevco 7295B-25) are mounted, for which we set the sampling rate to 20 kHz. Anti-alias filtering is applied to all six vibration signals to prevent disturbances from higher frequencies. The cut-off frequency of the lowpass filter is half the sampling rate. Furthermore, and according to Figure 2, we record the pump's swivel angle (α), the pressure (p_HD) occurring directly at the discharge connector and the shaft speed (n). These three dimensions describe the operating point of the pump and are logged synchronously. During the whole data acquisition process the hydraulic oil temperature remains constant at 50 °C. Figure 2 shows the hydraulic circuit scheme, which includes the individual subsystems of the pump and the controller.

Fault types and different fault intensities
In this study, two different fault types are investigated as examples: loose slipper, and cavitation erosion on the valve plate and on the cylinder. The fault types are selected based on two aspects. The first is a survey in which field technicians were interviewed about the relative frequency of occurrence of fault types in field applications and about their economic and safety relevance. The second is the capability to reproduce faults on the test-bench with artificially generated fault intensities, since recording time-history datasets is hardly feasible because of the high test-bench costs and the difficulty of reproducing isolated and accelerated fault mechanisms. Note that all PHM stages in the analysis part make more sense for continuously developing faults, since abrupt failures cannot be predicted.
The first fault type, loose slipper, which is depicted in Figure 3, describes an increasing clearance in the slipper-piston pairing when an increased tensile load occurs in open-circuit applications, for example due to low suction pressure, critical operation or the wrong oil type. As a result, the softer material of the slipper shoe deforms more and more and, thus, the play between piston and slipper increases until the slipper is pulled off. This causes a failure of the pump immediately or within a short period of time. As additional effects, this fault type also induces higher internal leakage, lower performance and a difference in vibration signals (Du et al., 2013). The different stages (fault extents) that we use for our study are listed in Table 2. Here, the fault state EE depicts the APU in a healthy state (run-in condition of the tribological pairings), whereas AS1 (beginning clearance), AS2 (intermediate clearance) and AS3 (right before failure) are the fault states.
Figure 3. Slipper clearance sketch (labels: slipper, slipper clearance, piston).
The other fault type, cavitation erosion, occurs both on the valve plate and the cylinder and is caused by a pressure difference between the individual cylinder chambers and the system pressure port. When the cylinder enters the system pressure kidney of the valve plate, this pressure difference is compensated by a volumetric flow into the cylinder. Because of the sudden drop in pressure at the tip of the notch, the hydraulic liquid outgases, and the bubbles are sucked into the cylinder chamber with the flow and cavitate there. When this happens close to the cylinder wall, material is eroded. Details can be found in (Kleinbreuer, 1979) and (Backe & Kleinbreuer, 1981).
If this fault type occurs, it results in a geometry change of the valve plate and hence in a change and shift in the opening area of the cavitated cylinder chamber over time, i.e. over the angle of shaft rotation. Usually, the valve is also affected. To model this effect in the test-bench setup, we prolonged the noise-reducing notch at the valve plate by a factor of 1.37. All in all, this means that the temporal excitation inside the pump changes and, thus, a different structure-borne noise behavior can be observed.

DATA-DRIVEN ANALYSIS
Based on the measurements described in Section 3.1, a data-driven analysis is performed to find answers to the key challenges formulated in Section 2. To do this, it is first necessary to preprocess the data. Features are then extracted which allow conclusions to be drawn about the fault condition of the APU. Finally, artificial intelligence (AI) based models are trained which use the calculated features as a basis for the fault assessment.

Data preprocessing
As described in Section 3.2 the measurement consists of a series of operating conditions which are held for 15 s and followed by a transition phase of 10 s. Because the operating conditions change during the transition phase, the latter is removed before the further analysis. However, the sequence of transitional and steady phases induces another challenge. When either the system pressure, the shaft speed or the swivel angle is increased, the resulting system pressure overshoots its nominal value. It takes time until the system pressure reaches a steady state and its deviation from the nominal value is reduced to a minimum. This damped oscillation leads to non-comparable conditions during the oscillation phase, which makes it necessary to determine sections in the time series that are comparable across measurements. Hence, it is not possible to use the point in time where the nominal value reaches the next steady phase for detecting the beginning of the next interval with comparable conditions. Rather, it is necessary to determine the end of the oscillation phase by observing the system pressure in order to define the point in time when the axial piston pump has reached the next steady state. For this purpose, a criterion is defined according to Equation 1, which acts as a necessary boundary condition to identify the steady state. This criterion evaluates the relative deviation between the measured pressure (p_m) and the nominal pressure (p_n). Here, however, the measured pressure is additionally reduced by the median of the pressure during the last second before the end of the operating point (τ). This procedure takes into account the fact that the system pressure deviates from the nominal pressure even in steady state, with the magnitude of the deviation caused by the test rig settings and depending on the set operating point as well as the fault pattern.
In addition to this criterion it is checked if the pump reaches a steady state within the 15 s of the operating point. If there is at least one APU that did not meet this criterion, the corresponding operating point is excluded from the following analysis. Given this procedure the operating points 32, 57, 58, 70, 78 and 90 are excluded.
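Since Equation 1 is not reproduced here, the following sketch shows only one plausible reading of the criterion: the measured pressure, reduced by the median of the last second of the operating point, is related to the nominal pressure and compared with a tolerance. The tolerance `tol`, the function name and the synthetic signals are assumptions for illustration and are not taken from the study.

```python
import math
from statistics import median


def steady_state_start(pressure, nominal, fs, tol=0.02):
    """Return the first index from which |p_m - median(last second)| / p_n
    stays within tol until the end of the operating point, or None if the
    final sample already violates the tolerance (no steady state)."""
    n_last = int(fs)                          # samples in the final second
    offset = median(pressure[-n_last:])       # steady-state offset estimate
    start = None
    for i in range(len(pressure) - 1, -1, -1):  # walk backwards from the end
        if abs(pressure[i] - offset) / nominal > tol:
            break
        start = i
    return start


# Synthetic example: damped overshoot around 101 bar, nominal value 100 bar.
fs = 100  # Hz, hypothetical logging rate of the pressure signal
t = [k / fs for k in range(15 * fs)]
damped = [101.0 + 20.0 * math.exp(-2.0 * tk) * math.cos(2 * math.pi * 3.0 * tk)
          for tk in t]
start = steady_state_start(damped, nominal=100.0, fs=fs)

# A pressure that is still ramping at the end never reaches a steady state.
ramp = [200.0 * k / (len(t) - 1) for k in range(len(t))]
no_steady = steady_state_start(ramp, nominal=100.0, fs=fs)
```

An operating point for which such a function returns `None` for at least one APU would then be excluded, in line with the procedure described above.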

Feature extraction
Feature extraction is used to derive characteristic properties of the time series, which provide information about the fault condition of the APU. The influence of the broadband analysis in the frequency range up to 500 kHz must also be investigated (key-challenges 1 and 3). This requires high-resolution data in the frequency range.
For an initial identification of potential broadband features, we investigate for the K025 sensors how the power spectral density changes over the entire frequency range depending on the respective fault type and its intensity. In order to be comparable with the analysis for all operating conditions, first the minimum duration of the steady states of all operating conditions is determined. This value is 1.92 s for the present data set. For each operating condition for which a steady state was found, the power spectral density is calculated over this duration, counted back from the end of the steady state of the operating condition. Based on the measurements of the healthy APU, the expected value of the power spectral density per frequency band over all operating conditions is determined, as well as the deviation of the expected value of the power spectral density between the healthy APU and the respective faulty APUs. This shows that in some frequency bands there are differences between the expected value of the power spectral density for healthy APUs and for faulty APUs (see Figures 8, 9 and 10 in the appendix).
For the determination of the broadband damage indicators, the covered frequency range is first subdivided into subranges. Each subrange is in turn broken down into individual frequency bands with a defined bandwidth (see Table 3). Within each frequency band the total power of the signal is determined based on the power spectral density (Bauer & Puente León, 2017). With regard to the training of a machine learning model, it is necessary to split the time series into non-overlapping segments, which on the one hand contain the relevant features and on the other hand are available in a sufficiently high number. The segments must not overlap in order to prevent information leakage between training and test data. For this purpose, the time series of each operating condition is split into segments of 0.1 s, counted back from the end of the operating condition, as long as the APU is in a steady state. This splitting results in a resolution in the frequency domain of 10 Hz as well as a number of training samples for the healthy condition of 57 to 450 per operating point.
[Table 3: frequency subranges and bandwidths of the frequency bands. Values recovered from the extraction: 10 000, 1000, 10 000, 500 000, 10 000.]
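The segmentation and the band power features can be sketched as follows. A segment length of 0.1 s gives a frequency resolution of 1 / 0.1 s = 10 Hz, and the shortest steady state of 1.92 s yields 19 full segments per run, which for three healthy runs would give the 57 samples quoted above (this decomposition is our reading, not stated explicitly). The band edges, the 20 kHz example signal and the function names are illustrative assumptions; the sketch uses a simple periodogram and assumes an even number of samples per segment.

```python
import numpy as np


def split_from_end(x, fs, seg_dur=0.1):
    """Split x into non-overlapping segments of seg_dur seconds,
    aligned to the end of the steady state (leading remainder dropped)."""
    n_seg = int(round(seg_dur * fs))
    n_full = len(x) // n_seg
    return x[len(x) - n_full * n_seg:].reshape(n_full, n_seg)


def band_powers(segment, fs, band_edges):
    """Total signal power per band [f_lo, f_hi) from a one-sided periodogram."""
    n = len(segment)
    psd = np.abs(np.fft.rfft(segment)) ** 2 / (fs * n)
    psd[1:-1] *= 2.0                      # one-sided: all bins except DC/Nyquist
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    df = fs / n                           # frequency resolution (10 Hz here)
    return np.array([psd[(freqs >= lo) & (freqs < hi)].sum() * df
                     for lo, hi in band_edges])


# Synthetic example: 1.92 s of a 1.5 kHz sine with unit amplitude.
fs = 20_000
n_total = int(round(1.92 * fs))
t = np.arange(n_total) / fs
x = np.sin(2 * np.pi * 1500.0 * t)

segments = split_from_end(x, fs)                  # 19 segments of 0.1 s
bands = [(1000.0, 2000.0), (2000.0, 3000.0)]      # hypothetical band edges
features = np.array([band_powers(s, fs, bands) for s in segments])
```

Each row of `features` is one training sample. The sine has a mean power of 0.5, which ends up entirely in the first band.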

Anomaly Detection
Based on the key challenges 2, 4, 5 and 6, there is a need for a targeted comparison of the fault detection performance of the anomaly detectors with respect to the sensor, sensor position, operating condition, fault type and fault intensity. In principle, the anomaly detectors are trained separately for each sensor and sensor position, based on the data of the healthy condition. Because the operating conditions occur with different frequencies, there is a risk that an overarching model would not correctly represent infrequently occurring operating conditions, since insufficient information is available for them, and would therefore produce poorer results for these operating conditions. Hence, when training the anomaly detectors, an additional distinction is made according to the operating condition. Consequently, a separate anomaly detector is trained for each sensor, each sensor position and each operating condition. Additionally, a transformer is fitted on the training data that removes the mean of each feature and scales its variance to 1.
A wide range of methods is used for anomaly detection, such as one-class SVM (Schölkopf, Platt, Shawe-Taylor, Smola, & Williamson, 2001), Local Outlier Factor (Breunig, Kriegel, Ng, & Sander, 2000), Isolation Forest (Liu, Ting, & Zhou, 2008) and (Liu, Ting, & Zhou, 2012), Principal Component Analysis (PCA) (Hotelling, 1933) and Auto-Encoders (Kramer, 1991). The choice of the underlying model is restricted by the number of available samples in each sub-dataset. Taking this and the degree of explainability into account, a model based on PCA is chosen to represent the healthy condition of the APU. The anomaly detection with a PCA is performed as follows:
1. Determine a suitable number of principal components.
2. Project the input data onto the selected principal components.
3. Transform the data back into a space with the same dimension as the input feature space.
4. Measure the reconstruction error during the training.
5. Determine a threshold which acts as a decision boundary during the inference.
6. Compare the reconstruction error for each sample with the predefined threshold to determine if the sample is anomalous or not.
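The six steps can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the class name `PcaDetector`, the synthetic six-dimensional features and the simulated fault (an offset on one feature) are hypothetical, and the number of components is simply passed in.

```python
import numpy as np


class PcaDetector:
    """PCA reconstruction-error anomaly detector trained on healthy data only."""

    def __init__(self, n_components, percentile=100.0):
        self.n_components = n_components      # step 1 (chosen beforehand)
        self.percentile = percentile

    def fit(self, X):
        # Scaler: remove the mean and scale each feature to unit variance.
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0)
        self.std_[self.std_ == 0] = 1.0
        Z = (X - self.mean_) / self.std_
        _, _, vt = np.linalg.svd(Z, full_matrices=False)
        self.components_ = vt[: self.n_components]
        # Steps 4 and 5: training reconstruction errors define the threshold.
        self.threshold_ = np.percentile(self.errors(X), self.percentile)
        return self

    def errors(self, X):
        Z = (X - self.mean_) / self.std_
        Z_hat = Z @ self.components_.T @ self.components_   # steps 2 and 3
        return np.sqrt(np.mean((Z - Z_hat) ** 2, axis=1))   # RMSE per sample

    def predict(self, X):
        return self.errors(X) > self.threshold_             # step 6


# Synthetic data: two latent factors drive six features (illustrative only).
rng = np.random.default_rng(0)
W = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]], dtype=float)


def make_samples(n):
    return rng.normal(size=(n, 2)) @ W + 0.05 * rng.normal(size=(n, 6))


X_healthy = make_samples(300)
X_faulty = make_samples(50)
X_faulty[:, 0] += 5.0     # simulated fault: offset breaks the feature coupling

detector = PcaDetector(n_components=2).fit(X_healthy)
healthy_flags = detector.predict(X_healthy)
faulty_flags = detector.predict(X_faulty)
```

With the threshold at the 100th percentile, no healthy training sample is flagged, while the offset samples are, because they can no longer be reconstructed from the two retained components.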
The number of principal components is determined by using a scree plot (Cattell, 1966). As a result, the threshold for the added explained variance by a single principal component is set to 0.02 in the case of the K025 sensors and to 0.06 in the case of the Endevco sensors. Each operating condition leads to a different added explained variance for a given number of principal components. An example of the development of the added explained variance distributions per principal component for the Endevco sensor at position 2 is shown in Figure 4. In this case the added explained variance ratio starts to level off between the fourth and eighth principal component, depending on the operating condition. The maximum number of principal components for which the added explained variance ratio exceeds the threshold is used for the further analysis. When projecting the input data onto the selected principal components and then transforming back into a space with the same dimension as the input feature space, the input data cannot be reconstructed exactly. If the matrices for the transformations are determined based on data representing only the healthy condition of the APU, it can be assumed that the reconstruction error increases when data of the faulty conditions is processed. An anomaly exists when the reconstruction error exceeds a threshold based on the reconstruction error during training. The reconstruction error is determined by calculating the root mean squared error (RMSE) (see Equation 3), with the number of samples (N), the index (t), the measured value (x_t), the reconstruction of the measured value (x̂_t) and the end of the measurement (T).
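The selection rule for the number of principal components can be sketched as below. The function names, the hand-made variance ratios and the fallback to one component when no ratio exceeds the threshold are assumptions for illustration.

```python
import numpy as np


def explained_variance_ratio(X):
    """Added explained variance ratio per principal component of the
    standardized data, sorted from largest to smallest."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    singular_values = np.linalg.svd(Z, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def select_n_components(ratios, threshold):
    """Largest component index whose added explained variance ratio
    exceeds the threshold (falls back to 1 if none does)."""
    above = np.flatnonzero(np.asarray(ratios) > threshold)
    return int(above[-1]) + 1 if above.size else 1


# Hand-made scree values (illustrative, not measured data).
ratios = [0.55, 0.25, 0.10, 0.05, 0.03, 0.02]
k_accel = select_n_components(ratios, threshold=0.06)   # Endevco threshold
k_patch = select_n_components(ratios, threshold=0.02)   # K025 threshold

# Data-driven example: two latent factors drive six features.
rng = np.random.default_rng(1)
W = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]], dtype=float)
X = rng.normal(size=(300, 2)) @ W + 0.05 * rng.normal(size=(300, 6))
k_data = select_n_components(explained_variance_ratio(X), threshold=0.02)
```

For the hand-made values, the stricter threshold of 0.06 keeps three components, whereas 0.02 keeps five.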

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{T} \left(x_t - \hat{x}_t\right)^2} \qquad (3)$$
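The component selection rule and the reconstruction error described above can be sketched as follows. The thresholds 0.02 and 0.06 are the ones stated in the text; the explained variance ratios and the helper names `select_n_components` and `rmse` are hypothetical and serve only as an illustration.

```python
import math

def select_n_components(explained_variance_ratio, threshold):
    """Largest component index whose added explained variance exceeds the threshold."""
    n = 0
    for k, ratio in enumerate(explained_variance_ratio, start=1):
        if ratio > threshold:
            n = k
    return n

def rmse(x, x_hat):
    """Root mean squared error between a measured signal and its reconstruction."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x))

# Hypothetical added explained variance ratios per principal component.
ratios = [0.55, 0.20, 0.10, 0.07, 0.04, 0.02, 0.01]
n_k025 = select_n_components(ratios, 0.02)     # threshold used for the K025 sensors
n_endevco = select_n_components(ratios, 0.06)  # threshold used for the Endevco sensors
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

A higher threshold keeps fewer components, so the Endevco models in this sketch would retain fewer components than the K025 models for the same variance profile.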
If only data of the healthy condition is available, it is tempting to set the threshold that separates normal from anomalous samples to the maximum reconstruction error observed during training. In theory, this results in a false positive rate (FPR) of 0 %. Reducing the threshold increases the sensitivity of the anomaly detector, but also increases the number of false alerts. In the present case, the available data of the faulty APUs is used to determine an optimal threshold that separates anomalous samples from samples depicting the healthy condition. To this end, the percentile of the training reconstruction error that serves as the threshold is varied in unit steps from 100 to 90, which corresponds to a maximum false positive rate of 10 %. This yields 11 candidate thresholds, from which the most suitable one is determined. For each model, false positive rate, fault type and fault intensity, the Matthews Correlation Coefficient (MCC) (Matthews, 1975) is calculated. Here, only the data of the healthy condition and of one fault type and fault intensity is used. Given the MCC, it is determined which false positive rate leads to the maximum MCC for the analyzed combination of model, fault type and fault intensity. Finally, the false positive rate that occurs most often over all models is chosen as the overall false positive rate.
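A minimal sketch of this threshold search for one combination of model, fault type and fault intensity is given below. It is a simplification: the same healthy reconstruction errors are used to derive the percentile thresholds and to count false positives, the percentile uses linear interpolation, and the error values as well as the names `percentile`, `mcc` and `best_percentile` are hypothetical.

```python
import math

def percentile(values, q):
    """Percentile (q in [0, 100]) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; defined as 0 if the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def best_percentile(healthy_err, faulty_err):
    """Return (percentile, MCC) with the highest MCC over percentiles 100, 99, ..., 90."""
    best = None
    for q in range(100, 89, -1):  # 11 candidate thresholds
        thr = percentile(healthy_err, q)
        fp = sum(e > thr for e in healthy_err)
        tn = len(healthy_err) - fp
        tp = sum(e > thr for e in faulty_err)
        fn = len(faulty_err) - tp
        score = mcc(tp, tn, fp, fn)
        if best is None or score > best[1]:
            best = (q, score)
    return best

# Hypothetical reconstruction errors: faulty samples are clearly separated.
healthy_err = [0.10 + 0.001 * i for i in range(100)]
faulty_err = [0.5] * 50
q_best, mcc_best = best_percentile(healthy_err, faulty_err)
```

When the faulty errors are well separated from the healthy ones, the 100th percentile (the maximum training error) already maximizes the MCC; with overlapping error distributions, a lower percentile can give a higher MCC at the cost of some false positives.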

Feature importance
Based on the previously trained models, the importance of the respective features is examined. The Permutation Feature Importance (Breiman, 2001) is used for this purpose. It measures how much the performance of a model decreases when the values of a single feature are shuffled across the samples. The shuffling breaks the connection between this feature and the other values within each sample. Thus, it becomes quantifiable how strongly the model relies on a certain feature.
To evaluate all features, each feature is shuffled individually and the resulting performance decrease of the model is calculated. To reduce random effects of the shuffling, the process of shuffling and evaluating is repeated 30 times for each feature. Finally, the mean performance decrease over all 30 repetitions is determined. This analysis is performed separately for each model, always using the data of the healthy units and one faulty unit each. Although this gives seven results for each model, it opens up the comparison with the preliminary analysis of the features as well as the possibility to explain differences with respect to the MCC.
300 Hz and 1 kHz as well as a significant increase in the expected signal power between 200 and 300 kHz. Especially for the increased slipper clearance the deviation is high and clearly distinguishable from the cavitation erosion and the healthy pump.
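The permutation feature importance procedure described in this subsection can be sketched as follows. The 30 repetitions follow the text; everything else is an assumption for illustration: the stand-in model (a rule on feature 0), the use of accuracy as the performance measure, the synthetic data and the function names.

```python
import random

def permutation_importance(score_fn, rows, labels, n_repeats=30, seed=0):
    """Mean score decrease per feature when that feature is shuffled across samples."""
    rng = random.Random(seed)
    baseline = score_fn(rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # breaks the link between feature j and the rest of the sample
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - score_fn(shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

def accuracy(rows, labels):
    # Hypothetical stand-in model: flag a sample when feature 0 exceeds 0.5.
    predictions = [row[0] > 0.5 for row in rows]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical data: feature 0 determines the label, feature 1 is pure noise.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [row[0] > 0.5 for row in rows]
importances = permutation_importance(accuracy, rows, labels)
```

In this sketch, shuffling the noise feature leaves the score unchanged, while shuffling feature 0 reduces it markedly, which is the behaviour the importance measure is meant to expose.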

RESULTS
For position 2, which is the sensor closest to the valve plate and thus closest to the cavitation erosion fault, we can see an increase of spectral power across almost the whole spectrum. Furthermore, Figures 9 and 10 also show the dependency on the intensity of cavitation erosion, indicating that the power level depends not only on the frequency band but also on the fault intensity. For the fault type cavitation erosion, positions 2 and 3 show the best decision boundary in power level in the frequency ranges from 5 to 10 kHz and from 200 to 500 kHz. At positions 2 and 3, the fault type loose slipper is hardly distinguishable from the healthy pump behaviour.
The threshold used to separate an anomalous sample from a sample representing the healthy condition is determined by evaluating how often each false positive rate maximizes the MCC for each model, fault type and fault intensity. As described in Section 4.3, each model corresponds to a sensor and a sensor position. Figure 5 shows the counts of each false positive rate that maximizes the MCC for each sensor, sensor position, fault type and fault intensity. For both sensors (Endevco and K025), a false positive rate of 0 % leads to an optimal MCC in most cases. Therefore, the maximum reconstruction error during training is chosen as the threshold to separate the healthy condition from the anomalous condition in the further analysis. The figure also shows that higher false positive rates occur more often for the Endevco sensor than for the K025 sensor. This indicates that the models based on the Endevco sensor have more difficulty representing the healthy condition than the models based on the K025 sensor.
Figure 11 in the appendix depicts in detail how well the anomaly detection algorithm works depending on operating conditions, fault types, fault intensities and sensor positions. In general, two effects are particularly evident in the left heatmap column. First, positions 1 and 2 have the highest detection rates, while position 3 falls behind for both loose slipper and cavitation erosion. Figure 12 underlines this observation by depicting the average detection rates and their standard deviations. Second, the MCC is generally higher for the cavitation erosion than for the loose slipper. Regarding the fault intensities, the MCCs also increase with increasing fault intensity. Interestingly, this is not true for every operating point, as there are a few where the detection rate for the initial fault intensity (S1) is higher than for the advanced intensities (S2, S3 and S4).
We can also observe one outlier in the bar plot, for intensity S1 of cavitation erosion at position 2, that does not follow this trend. For the Endevco sensors, the choice between positions 1 and 2 has no significant effect on the MCC values, which matches the observations and conclusion of (Ramdén, 1998).
For the piezo-ceramic patch transducers, the results differ in several aspects. First, the overall detection rates are higher than for the Endevco comparison group. Second, the position makes a significant difference for the loose slipper fault type: position 1, which has the shortest transfer path to the actual fault, has higher detection rates for all intensities than the other two positions. For positions 2 and 3, an increasing trend in MCC with increasing slipper clearance is also visible. For positions 2 and 3, the standard deviation bars are also larger, indicating that the detection rate in terms of MCC also depends on the operating point.
For the cavitation erosion, detection rates are at a high level in general; only a few operating points with a high shaft speed (6, 7, 9 and 34) are out of line for the K025. Comparing both fault types for both sensors and all operating conditions, the cavitation erosion fault type is detected more reliably.
The comparison of the preliminary analysis with the permutation feature importance of the K025 sensors shows matching results at position 1 (see Figures 8 and 16 in the appendix). For positions 2 and 3, the differences between the healthy condition and the various intensities of cavitation erosion are more clearly visible in the preliminary analysis than in the permutation feature importance, especially in the frequency range above 100 kHz (see Figures 9 and 17 as well as Figures 10 and 18 in the appendix). The models for positions 2 and 3 based on the data of the K025 sensor rely only to a small extent on these frequency ranges and make their decisions primarily on the basis of the frequency ranges up to 40 kHz for position 2 and up to 2 kHz for position 3. The models based on data from the Endevco sensors primarily consider the frequency range up to 2 kHz (see Figures 13, 14 and 15 in the appendix). Frequency ranges above this value play only a subordinate role for these models.

DISCUSSION
In general, the detection rates of our study are within the range reported in the literature. We have also shown that the condition indicators depend on the operating point, while the detection rates of our anomaly detectors stay within a close range across the specified operating space. The higher detection rates for cavitation erosion can be explained by the excitation mechanism of the structure-borne noise. Here, the change in signal characteristics is not caused by a changing momentum due to two colliding bodies; instead, the pressure and thus the force profile in the cylinder change due to the cavitation-damaged geometry, resulting in a different time profile of the housing's surface vibration. Since the axial excitation due to pressure is the main contribution to the pump's vibration according to (Münch, 2021), it is plausible that a change here causes a larger effect. The loose slipper, which in theory causes a clattering effect, is masked by this excitation. Hypothetically, this effect can change with increasing size of the investigated pump, since larger pump models have greater mass (the signature of the clearance can be stronger) but are not necessarily exposed to higher pressure profiles.
While investigating the reason for the significantly lower MCCs at position 3 across both fault types and all fault intensities, we found that the obtained power spectra are strongly masked by noise, which means that the piston-induced frequency and its harmonics are not as clear as at the other positions. We assume that this is due to this specific position, meaning that the sensor is placed in a node of the structural movement. Another plausible reason might be impulsive shocks in this direction (e.g., caused by reflections of the pressure pulsation) that result in a broadband frequency excitation and mask the signal part. Since it is a one-directional sensor, it is also possible that it was pointing in a direction where less deflection happens.
The evaluation of the difference in expected signal power between the healthy and the faulty units shows that identifying and distinguishing both fault types is only possible at sensor position 1 and in the frequency range between 200 and 300 kHz. At the other positions, only the cavitation erosion can be identified. This is underlined by the results of the anomaly detection, which show that at these positions the K025 sensors detect the cavitation erosion better than the loose slipper fault. This shows that the position of a sensor can have a critical impact on the detection rate of a fault, which contrasts with (Ramdén, 1998). Moreover, this result is underlined by the similar MCC values of both sensors at position 2, since at this position both sensors are mounted close to each other. However, an analysis of the permutation feature importance of both sensors shows that the model for the K025 sensor at position 2 hardly considers the frequency ranges above 40 kHz. According to the difference in expected signal power, however, this range is informative with regard to cavitation erosion. Hence, it is reasonable to conclude that taking a higher frequency range into account would have led to better detection results. This hypothesis is supported by the nearly perfect MCC values of the K025 sensor, especially in combination with the results of the permutation feature importance for the K025 sensor at position 1, since only at this position were features above 100 kHz also rated as particularly relevant by the models.
An open question is how assembly and manufacturing tolerances affect the detection rates in this measurement setup, and how we can verify that the sensors have actually measured, and the algorithm has actually recognized, the fault rather than the state of installation or other external effects. With our experimental design, we cannot completely rule out such effects, but the increasing detection rates of the piezo-ceramic patch transducers and the order randomization strongly indicate that there is causality between the effects in the data and the fault. With more test samples and a larger dataset with more variance, we could underline this further statistically.

CONCLUSION
We have shown that the signal power in a defined frequency band changes as a function of the damage to an APU and thus represents a broadband condition indicator of an APU. Furthermore, the results show that the difference between the healthy and faulty APUs depends on both the analyzed frequency range and the operating condition. Provided that data in the frequency range above 100 kHz is available and sufficiently accounted for by the model, the influence of the operating condition on the condition indicator decreases. This shows that, with regard to the analyzed fault types cavitation erosion and loose slipper, an analysis in the frequency range up to 500 kHz is beneficial. Furthermore, we have shown that more severe fault intensities are easier to detect than less severe ones. However, the trained anomaly detectors are already able to detect the smallest intensity of both fault types, with the positioning of the sensors being a major influence. If the sensors are positioned optimally and the frequency range above 100 kHz is taken into account, even the lowest fault intensity is detected correctly in over 99 % of cases.
Finally, our results show that high-frequency condition indicators and piezo-ceramic patch transducers are suitable for PHM applications. In our case study, they show less dependence on the operating condition than low-frequency signals and higher detection rates than piezoelectric accelerometers. With the proposed method and measurement setup, we are now able to investigate multiple aspects, including fault diagnosis, fault quantification and the dependency on sensor technology and positioning. We are keen to evaluate this setting in a mobile machinery setup to verify the robustness of our methodology in its final application.