Fault Detection via Sparsity-based Blind Filtering on Experimental Vibration Signals

The detection of incipient rolling element bearing faults is a challenging task since the impulsive pattern of bearing faults often fades into the noise. Moreover, tracking the health conditions of rotating machinery generally requires the characteristic frequencies of the components of interest, which can be a cumbersome constraint for large industrial applications because of the extensive number of machine components. One recent method proposed in literature addresses these difficulties by aiming to increase the sparsity of the squared envelope spectrum of the vibration signal via blind filtering. As the name indicates, this method requires no prior knowledge about the machine. Sparsity measures of Hoyer index, l2 l1 norm, and spectral negentropy are optimized in the blind filtering approach using generalized Rayleigh quotient iteration. Even though the proposed method has demonstrated a promising performance, it has only been applied to vibration signals of an academic experimental test rig. This paper focuses on the real-world performance of the sparsitybased blind filtering approach on a complex industrial machine. One of the challenges is to ensure the numerical stability and the convergence of the generalized Rayleigh quotient optimization. Enhancements are thus made by identifying a quasi-optimal filter parameter range within which blind filtering tackles these issues. Finally, filtering is applied to certain frequency ranges in order to prevent the blind filtering optimization from getting skewed by dominant deterministic healthy signal content. The outcomes prove that sparsitybased blind filters are effective in tracking rolling element bearing faults on real-world rotating machinery without any prior knowledge of characteristic frequencies. Kayacan Kestel et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

cause of the extensive number of machine components. One recent method proposed in literature addresses these difficulties by aiming to increase the sparsity of the squared envelope spectrum of the vibration signal via blind filtering. As the name indicates, this method requires no prior knowledge about the machine. Sparsity measures of Hoyer index, l2 l1norm, and spectral negentropy are optimized in the blind filtering approach using generalized Rayleigh quotient iteration. Even though the proposed method has demonstrated a promising performance, it has only been applied to vibration signals of an academic experimental test rig. This paper focuses on the real-world performance of the sparsitybased blind filtering approach on a complex industrial machine. One of the challenges is to ensure the numerical stability and the convergence of the generalized Rayleigh quotient optimization. Enhancements are thus made by identifying a quasi-optimal filter parameter range within which blind filtering tackles these issues. Finally, filtering is applied to certain frequency ranges in order to prevent the blind filtering optimization from getting skewed by dominant deterministic healthy signal content. The outcomes prove that sparsitybased blind filters are effective in tracking rolling element bearing faults on real-world rotating machinery without any prior knowledge of characteristic frequencies.

INTRODUCTION
Early detection of the anomalous behaviour of rotating machinery has drawn significant attention, since in large industrial applications, maintenance and downtime costs can add up to substantial amounts (Lu, Li, Wu, & Yang, 2009). Furthermore, the complexity of rotating machinery has rapidly increased thanks to technological developments in the recent years. Accordingly, such machines are comprised of an immense amount of components. Hence, it might complicate keeping track of all the kinematic information of every component. However, monitoring the health conditions of rotating machinery, in general, requires the knowledge of the characteristic frequencies of dynamic components such as bearings, shafts or gears. Thus, in the case of the lack or the paucity of the kinematic information about the machine, fault detection algorithms which are capable of functioning blindly are needed.
Blind approaches that require no a-priori knowledge about the machine kinematics have already been employed for prognostic and diagnostic purposes. A basic form of blind approaches is monitoring the statistical indicators of the vibration signal in the time domain, examples include the root-mean-square or the kurtosis of the vibration amplitude. On the other hand, time-domain indicators do not provide information regarding the type of the faulty component. Furthermore, assessing the health status based on statistical indicators may be misleading as they are sensitive to operating conditions such as shaft rotation speed or load. A more advanced example of tracking the time waveform statistics is Minimum Entropy Deconvolution (MED) filtering (Wiggins, 1978), which aims to maximize the kurtosis of the time waveform. Various attempts to improve MED have been presented in literature. The higher order moments than the fourth order, which is linked to kurtosis, of the vibration signal are implemented by (Gray, 1979) with MED approach. An enhanced way of estimating kurtosis is multi-point kurtosis proposed by (McDonald & Zhao, 2017) and a new indicator based on Jarque-Bera statistics is also utilized as a detection measure on the time-domain signal via blind deconvolution (Obuchowski, Zimroz, & Wyłomańska, 2016). Recently new blind methods tend to exploit cyclostationarity rather than the time-domain statistics. Particularly the detection of incipient roller bearing faults, which constitutes the majority of the machine faults (Graney & Starry, 2012), entails scrutinizing the stochastic nature of roller bearing impacts as their impulsive pattern demonstrates a second order cyclostationary behavior. Therefore, several studies have focused on utilizing this content of the signal for blind approaches. A novel design of a blind deconvolution filter maximizes the cyclostationary content of the signal by exploiting the generalized Rayleigh quotient (Buzzoni, Antoni, & D'Elia, 2018), albeit that it requires a-priori knowledge of the machine components. While this paper does not aim to include an exhaustive literature survey, the aforementioned studies are a summary of the blind approaches that have been employed for fault detection. (Peeters, Antoni, & Helsen, 2020) proposed a blind filtering method which maximizes the sparsity of the squared envelope spectrum (SES) of the vibration signal. A brief introduction to the concept of sparsity and sparsity measures is laid out in the next section.
The present study focuses on the applicability of the proposed method (Peeters et al., 2020) to vibration signals measured on complex industrial rotational machinery. The real-world measurements from a complex industrial gearbox are used for investigating the developed methods. Moreover, a parameter study is performed to obtain a quasi-optimal filter parameter space within which blind filters are numerically stable and function in a time-efficient way. The theory of the blind filtering method is laid out in the second section, along with a brief derivation of the blind filter equations revealing the generalized Rayleigh quotient. In the third section, results demonstrating the parameter study and the performance of blind filtering to diagnose incipient roller bearing fault detection are shown and discussed. The last section stresses that blind filtering methods can be used as an effective tool in large industrial applications as a health monitoring tool for cases where knowledge of characteristic frequency is lacking or minimal. (Peeters et al., 2020) already stated that in order to increase the sparsity of the SES, the vibration signal is filtered to maximize the value of the sparsity measure. Three sparsity measures are chosen based on the criteria discussed by (Hurley & Rickard, 2009) and based on their mathematical convenience for the derivation of the blind filters. Three sparsity measures discussed in (Peeters et al., 2020), namely l2 l1 -norm, Hoyer index and spectral negentropy, are used in this study.

Brief Introduction to Sparsity Measures
A signal representation can be considered sparse when a limited number of samples contains the majority of the energy (Hurley & Rickard, 2009). While in this study the concept of sparsity is utilized with a particular interest of signal processing for vibration based condition monitoring of rotating machinery, it also can be applied in a variety of other domains, such as oceanic engineering (Li & Preisig, 2007), image processing (Krishnan, Tay, & Fergus, 2011) or medical imaging (Leung et al., 2008). The concept of sparsity in this study is utilized in a way to maximize the sparseness of the SES of the blind filtered signal, with the aim of detecting the presence of peaks associated with a fault in the squared envelope spectra.
Envelope analysis is one of the most widely used techniques to detect rolling element bearing faults (Abboud, Antoni, Sieg-Zieba, & Eltabach, 2017). The envelope spectrum can be obtained by estimating the spectrum of the signal envelope. It is proven that monitoring the squared envelope spectrum of a signal is a more effective way of detecting rolling element bearing faults compared to the non-squared envelope spectrum (Ho & Randall, 2000), particularly for signals whose envelope spectrum has a signal-to-noise ratio which is more than unity. The amplitude of the analytic signal results in the envelope of the signal, and the analytic signal can be simply formed by summing the signal itself and its Hilbert transform (Ho & Randall, 2000). However, in this study, to reduce the complexity of the mathematical derivations, we estimated the squared envelope spectrum by multiplying the signal spectrum with its conjugate. Accordingly, following derivations are made.
In order to increase the sparsity measures of interest of a noisy signal x, it is convolved with a filter h to estimate the ideal input s (Peeters et al., 2020): The vector and matrix quantities are represented in bold characters, in order to distinguish them from the scalar ones. In matrix form, the convolution operation can be written as: and the elements in the convolution operation correspond to: where lengths of s and h are L and N , respectively. Accordingly the squared envelope x can be defined as: with diag(s H ) being diagonal matrix of the Hermitian transpose of s. Finally, squared envelope spectrum is estimated as the Fourier transform of x : and the Fourier matrix F is: with the basis function being ω = e −2πj/L , n = 0, .. L − 1 and k = 0, .. K − 1. Accordingly, Fourier matrix F has the dimensions of (K, L) and K is the number of frequency index of the Fourier spectrum. An endeavour is made to find an optimum blind filter h, which maximizes the sparsity measures of the quantity E x .

Sparsity Measures
One of the most well-known sparsity measures is l2 l1 -norm. The expression to estimate this norm is: where L is the number of samples in the SES, i.e. signal length.
The second measure employed to quantify the sparsity is Hoyer index (HI) (Hoyer, 2004), which is essentially a normalized version of l2 l1 -norm (Hurley & Rickard, 2009), and can be expressed as: A shortcoming of the l2 l1 -norm is being unbounded, hence, it tends to result in different values for signals with different lengths but with the same sparsity. On the other hand, Hoyer index is bounded between unity and zero. For signals where the energy is accumulated into a single point, the Hoyer index results in unity, and it converges to zero when the energy content becomes more equally distributed (Hoyer, 2004).
The third sparsity measure is spectral negentropy. Based on the definition of entropy, one can define the spectral entropy of a signal as the probability distribution of its power spectrum. Accordingly, the lowest entropy arises for the cases in which the energy of the signal accumulates into a single impulse (Antoni, 2016). In other words, the lower the value of spectral entropy of a signal, the more sparse the signal is, which the makes negative of the spectral entropy a plausible candidate to measure sparsity. This definition is already proposed in literature as the spectral negentropy (Antoni, 2016) and it is expressed for the SES as:

Generalized Rayleigh Quotient Derivation
The optimal filter coefficient estimation is achieved employing generalized Rayleigh quotient iteration. A brief derivation of the blind filter equations is presented for the l2 l1 -norm. Rewriting the equation Eq. 7 in vector notation results in: Further manipulation of both numerator and denominator of Eq. 10 gives: and the details of which are presented in (Peeters et al., 2020). Plugging Eq. 5 into Eq. 11 results in: where A and B correspond to: and respectively. Equation 12 reveals the generalized Rayleigh quotient (Horn & Johnson, 1985). Thus, Eq. 12 can be simplified utilizing the definition of that as: with: and As the proposed approach in (Peeters et al., 2020) is to increase the sparsity of the squared envelope spectra of the blind filtered signal, an iterative solution is performed to solve for h to maximize l2 l1 -norm in Eq. 15. With regards to the maximization, the maximum value of the generalized Rayleigh quotient with respect to h is equivalent to its largest eigenvalue and corresponding eigenvector. Therefore, the associated generalized eigenvalue problem with the Rayleigh quotient defined in Eq. 15 can be written as: and it is iteratively solved to estimate the optimal filter coefficients which are inherently equal to eigenvector of the problem. For the detailed explanation of the iteration steps and the derivation of the two remaining sparsity measures, readers can refer to (Peeters et al., 2020).
One of the strong aspects of generalized Rayleigh quotient iteration is its drastic convergence rate (Parlett, 1974), as long as the numerical stability is satisfied. Thus, as a part of the present study, numerical stability is also investigated by performing a broad parametric study. The sensitivity of the convergence characteristics of the iteration process to filter length and its initialization is also scrutinized.

Filter Initialization
While it is mentioned that the convergence rate of Rayleigh quotient iteration is rapid, an initial guess considerably far from the solution domain of the filter may result in divergence. In this study, iterations initialized with two different approaches are also compared. The filters are initialized by either a differentiation filter or an autoregressive (AR) model. The differentiation filter is basically comprised of zeros except for the second and the fourth coefficients which are set to be 1 and −1, respectively, regardless of the filter length. The differentiation filter is further normalized prior to the iteration process. The coefficients of AR model, on the other hand, are estimated using Levinson-Durbin recursion (Franke, 1985).

Experimental Problem Definition
The vibration signals investigated in this study were sampled at 40 kHz for 2 seconds at every 10 minutes and a dataset containing 73 measurements was obtained. The average rotational speed of the high-speed shaft was 279 Hz. In general angular resampling is necessary for a proper vibration based condition monitoring of a rotating machine. The angular resampling requires the knowledge of the instantaneous speed of the shaft so that the data can be transformed to the angular domain in order to compensate for the speed variations (Peeters et al., 2019). Nevertheless, the present dataset was sampled under negligible speed variations, hence the angular resampling is not needed. Accordingly, the signal spectrum is dominated by pronounced shaft harmonics. Another potential pre-processing method to improve the effectiveness of blind filtering is the deterministic content removal. The highenergy deterministic content protrudes in the squared envelope spectrum and might mask the bearing fault signature.
Since the localized bearing faults manifest themselves as a stochastic process, the energy level of which are lower in the squared envelope spectrum compared to deterministic content originated from the other machine components (Antoni & Randall, 2003). Thus, in order to ensure that the sparsity measure of the blind filtered signal is not skewed by the highenergy harmonics, the deterministic content of the signal can be removed prior to blind filtering using several techniques discussed in literature, i.e. cepstral editing or discrete-random separation (Peeters et al., 2020). However, the signals are not pre-whitened in this study, but instead bandwidths to which blind filtering is applied are adjusted in such a way as to exclude the prominent amplitudes related to the shaft harmonics in squared envelope spectrum. This step is explained in the next section.
The last measurement in the data set corresponds the one where the machine has failed. The damaged inner ring of the rolling element bearing can be seen in Fig. 1. The machine is stressed at full capacity for the duration of the experiment which induces high loads on the bearings and significantly accelerates the bearing degradation. The evolution of the bearing fault in the zoomed squared envelope spectra is discernable in Fig. 2 with an increase in the amplitude at the ballpass frequency inner race (BPFI) order of 8.29. Approximately after the measurement 30, which is indicated with the vertical dashed line, the SES amplitude around order 8.29 becomes distinct, which can be easily monitored by tracking the BPFI order. Two basic statistical indicators estimated using time-domain waveform of the signal are demonstrated in Fig. 3. An increasing trend for both the root-mean-square and the kurtosis of the signal is observed from measurement 43 onwards, which is an indication of a potential bearing fault. While these statistical indicators can be considered 'blind', they do not include any filter optimization nor do they try to embed engineering knowledge of the vibration signal into the analysis.

RESULTS
The experimental results are presented in two sections. The first section discusses the salient points of the parameter study. This study aims to find quasi-optimal parameter ranges where the blind filtering approach can provide robust results. The second section presents the performance of the three sparsity measures.

Parametric Study
There exist three main parameter settings which may change the numerical stability or the convergence rate of the blind filtering method, which are the filter length, filter initialization, and frequency range that is to be filtered.
The decision of the length of the filter is significant as it affects both the computation time and the convergence of the iteration process. While a longer filter is capable of resolving the finer frequency content compared to a shorter one, the latter tends to improve the convergence. This is due to the fact that longer filters inherently require more coefficients to be optimized, accordingly, the global minimum of a larger domain may never be achieved. Figure 4 displays the evolution of the Hoyer indexes for different filter lengths N . The Hoyer index of the signals that are blind filtered by the filters with the lengths of 5, 10, 20, 50 and 100 are compared. Even though the evolutions of the indicator estimated for different filter lengths do not demonstrate any promising trend for the fault detection means, the curves do not exhibit any arbitrary outlier peaks or drops, which indicates that all five lengths provide a numerically stable solution. Like in any iterative solution, initial guess can play an important role for both the convergence itself and its rate. However, Fig. 5 reveals that there is no considerable distinction between the results obtained for the two different filter initialization methods. Despite there being a slight discrepancy between the two curves in Fig. 5, the general trend in both curves is nearly identical. Therefore, both initialization methods provide a stable result. Moreover, no distinction is detected for the computation time of the iteration processes initiated with these two settings. One may also find out that the line corresponding to the filter length 10 in Fig. 4 is identical to the AR line in Fig. 5, since these two curves processed with the same parameter settings. Hence, numerical stability of the Rayleigh quotient iteration is achieved for filters initialized with both the differentiation filter and the AR model. The last setting mentioned is the frequency range to which the blind filter is applied. The optimization can be narrowed down to only take into account the sparsity of a certain frequency band in the squared envelope spectrum. This is an important setting because of the fact that narrower frequency band in the SES means shorter convergence time whereas it also poses the risk of excluding the frequency content of interest which is assumed to be unknown. In order to prevent such a case, the initial attempts started by employing the full available frequency range (from zero to Nyquist). The upper limit of the frequency range gradually decreased to 10, 5, and 3 kHz and signals filtered within these frequency ranges are displayed in Fig. 6. It can be stated that frequency range has no effect on the numerical stability of the Rayleigh quotient optimization for reasonably wide frequency bands. The numerical stability is not studied for the very narrow frequency band filtering, because it offers no use for the blind approaches in which very narrow band filtering might jeopardize fault detection by excluding the faulty frequency con-tent. Nonetheless, as expected, computation time decreases as the frequency range shrinks because it contains less points to process. Having this in mind, applying blind filtering to wisely selected narrow bands appears to be an effective solution, especially for the industrial applications where execution time of the operations is important. On the other hand, except for 1 − 3000 Hz line, none of the other lines provides a promising trend in terms of fault detection in Fig. 6. The line of 1 − 3000 Hz demonstrates an increasing trend from measurement 50 onwards, albeit that increment is considerable small. Also one can realize that the curve for the full frequency range is identical to the mentioned curve in Figs. 4 and 5. Hence, results presented in Fig. 6 are obtained using the same parameter settings.
0HDVXUHPHQW 1R > @ +R\HU LQGH[ Figure 6. The evolution of Hoyer index of the SES of the signals blind filtered using different frequency bands.
As the conclusion of this parameter study, it is observed that while longer filters might result in non-convergence and require considerably long computation time, filter initialization and the frequency range settings do not impose a significant effect on the generalized Rayleigh quotient maximization process with this regard. The reason as to why longer signal may complicate the optimization process, as mentioned above, is because the increase in the length of the filter means more constants to be estimated. Therefore, even though convergence of the optimization may be satisfied, trend of the signals' sparsity measure becomes quite noisy for longer filters, which is undesirable in terms of alarming point of view.
Considering the outcome the parametric study, for the rest of the study blind filters are initialized with AR method for the filter length of 10.

Fault Detection
Now that it is proven that there exists a quasi-optimal parameter settings range for a stable numerical convergence, the rest of the endeavour is made to employ sparsity-based blind filtering approach to diagnose the fault from the experimental vibration signal. To do so, evolutions of the sparsity measures must be tracked. Referring back to the Fig. 6, none of the trends displays an increase to indicate any change in the sparsity of the squared envelope spectrum, albeit that these curves are achieved by blind filtering with the quasioptimal parameter settings. Reminding the discussion about deterministic content removal, a potential culprit for the underperformance of the blind filtering approach might be the presence of strong shaft harmonics masking the bearing fault signature in the squared envelope spectrum. As mentioned above, signal spectrum is dominated with the shaft harmonics which may skew the sparseness of the SES as a result of the blind filtering. It is known that the experimental vibration signals of interest are sampled with a negligible variation of the shaft speed. Therefore, given the average speed of the high-speed shaft, the frequency span is divided into (frequency) bands enclosed by the shaft harmonics. In order to ensure that the sparsity measure is not skewed due to the high amplitude frequency bins adjacent to the shaft harmonics, frequency bins in the range of ±3% of the shaft harmonics are also excluded. The exception is made for the 0th harmonic, as it contains no useful information for diagnostic purposes. Thus, setting the lower and the upper bounds of the frequency bands with the shaft harmonics and ±3% off-set, Table 1 is generated. In each row, the lower and the upper bounds of the frequency span are shown, along with the grey row which is where the BPFI information is embedded. In Figs. 7, 8, and 9, the evolutions of sparsity measures of the spectral negentropy, Hoyer index and the l2 l1 -norm are demonstrated, respectively. The figures include the sparsity measures estimated on the different frequency bands of the squared envelope spectrum. Hence, in each figure, curves correspond to the information contained in the frequency bins enclosed by shaft harmonics of 7 − 8, 8 − 9, 9 − 10 as well as in the frequency band of 1 − 3000 Hz, which is also demonstrated in Fig. 6, are shown. The purpose of presenting the latter is to emphasize the difference in the evolution of trends. In order to ease the representation of Figs. 7,8,and 9, lines corresponding to the sparsity measures obtained by filtering the frequency bands shown in Table 1 are coloured with black, whereas grey line represents the results for filtering the frequency band of 1 − 3000 Hz. Figure 7 displays the trend of the spectral negentropy of the squared envelope spectrum of the blind filtered signals. Even at first glance, a sudden increase in the trend after a plateau phase can be observed for the black lines. The effectiveness of the proposed approach of shaft harmonic exclusion is clearly more pronounced compared to the trend of the solid grey line which corresponds to filtered signal frequency content within 1 − 3000 Hz. As mentioned in the previous sections, the impulsive pattern of incipient rolling element bearing fault results in the increase in the amplitude in the related frequency bin, which also increases the sparsity. Given that the spectral negentropy increases almost 4 times its original value, it forms a clear indication of a potential bearing fault. On the other hand, the solid grey curve demonstrates a slight increase after the measurement 50, which may not be directly linked to a faulty component of the machine as the increase is not clear compared to the mean of the trend of the earlier measurements. This is an important observation particularly for the industrial applications where the condition monitoring is performed autonomously and the detection of a fault requires clear elevating trend exceeding a threshold. Hence, such minuscule increase in the trend may not be considered as a clear indication of a fault. Similar comments can be made for the trends of Hoyer index and l2 l1 -norm shown in Figs. 8 and 9, respectively. Considering the Hoyer index, the indicator almost doubles from the 27th measurement to the 40th one for the solid black curve.
Although the faulty signal content is embedded in the fre-quency range of 2240 − 2503 Hz, as highlighted in Table  1, trends with clear indication of the fault are present in the curves of 1961−2224 Hz and 2519−2782 Hz as well. On the other hand, blind filtering the signals for the frequency bands numbered in Table 1 from 1 to 3 does not demonstrate any clear increase in the sparsity measures. The Hoyer index evolutions of signals blind filtered for the frequency bandwidths number 2 and 3 are depicted in Fig. 10, both of which are flat, thus, no indication of a fault. Similarly, for the frequency band of number 4 in Table 1, the onset of the surge in Hoyer index is delayed relative to the that of the line corresponding to the frequency range 2240 − 2503 Hz, as shown in Fig. 10. Filtering the signal within the frequency band shown in row 6 in Table 1 also demonstrates a similar performance as the cases presented with the black lines in the figures, yet it is not shown for the sake of the clarity of the figures. Therefore, the indication of the faulty bearing is apparent in the trends of the sparsity measures estimated for several frequency bands. A final comparison is also made between the condition monitoring methods where the characteristic frequency knowledge is utilized and the blind filtering approach. Figure 11 demonstrates evolutions of the sum of amplitudes (SoA) of the squared envelope spectrum around the BPFI order 8.29 and of the Hoyer index for the frequency range number 9 in Table 1 where the faulty frequency content is contained. Albeit that it is not a clear-cut distinction, the earliest sudden increase in SoA occurs later than that in Hoyer index does so. This may imply that blind filtering approach to maximize the sparsity of the squared envelope spectrum performs as effective as classical condition monitoring tools based on tracking the characteristic frequencies, if not outperforms.
While the focus of this study is not aiming to compare the performances of the sparsity measures, for an industrial au- tonomous condition monitoring application, Hoyer index can be said to be the most useful, since it is scaled in between zero to unity. For the vibration signal dataset investigated in this study, the last measurement where the Hoyer index reached a value around 0.8 is already the one where the machine failed. Considering that, further studies may be performed to set a threshold to assess the severity of the damage and to warn the end-user of the potential degradation of a component of interest.

CONCLUSION
This paper investigated the performance of sparsity-based blind filtering approaches on experimental data. It is proven that, with a known the average or the instantaneous shaft speed, a blind filtering approach is capable of detecting the incipient rolling element bearing fault of a complex industrial rotating machine. Furthermore, results of the broad parameter study stress that filter length is a significant parameter that influences the convergence characteristic of the iterative solution that maximizes the generalized Rayleigh quotient. For the given dataset, the conclusion can be drawn that filter initialization induces no effect on either the numerical stability or the computation time of the iteration process. In summary, the blind filtering method can be used as an autonomous health surveillance tool for complex industrial applications in which exact characteristic frequencies of mechanical components are lacking.