Leakage-Safe, Reproducible Benchmarking for Vibration-Based Fault Diagnosis

Published Jul 3, 2026
Pawel Knap
Urszula Jachymczyk
Krzysztof Lalik

Abstract

Vibration-based bearing fault diagnosis is a widely studied predictive maintenance problem, but reported results are often difficult to compare. Performance depends not only on the model itself, but also on the evaluation protocol, the train–test split, and the type of domain shift considered. In particular, leakage-prone window-level splitting and loosely defined source–target settings can lead to overly optimistic conclusions that do not reflect real transfer performance across changing operating conditions, acquisition regimes, or bearing identities. To address this issue, this paper introduces a leakage-safe and reproducible benchmark for cross-domain bearing fault diagnosis on the Case Western Reserve University and Paderborn University datasets. The benchmark defines six fixed source–target scenarios, enforces recording-level train–test separation, and evaluates both machine-learning and deep-learning baselines under a common protocol. Final reporting is based on a consistent evaluation setup, with repeated-seed follow-up used where necessary to support reliable conclusions for deep-learning models. The results show that scenario difficulty is highly heterogeneous. Some transfer settings are effectively saturated, while others remain substantially more challenging. Deep-learning models often achieve stronger performance, but their conclusions can be sensitive to initialization and require repeated-seed validation. Overall, the benchmark provides a reproducible basis for scenario-level evaluation and more reliable comparison of cross-domain bearing diagnosis methods. The code for this study is publicly available at https://github.com/1Sensor/pdm-bench.
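
To make the distinction between window-level and recording-level splitting concrete, the following minimal sketch assigns whole recordings to either the training or the test set before any windows are compared, so that overlapping windows from one recording can never appear on both sides. It is an illustration under stated assumptions, not the benchmark's actual implementation: the function names `make_windows` and `recording_level_split`, the synthetic signals, and the window, stride, and test-fraction values are all hypothetical and do not come from the pdm-bench repository.

```python
import random


def make_windows(recordings, window, stride):
    """Cut each recording into fixed-length windows tagged with its recording ID.

    `recordings` maps a recording ID to a sequence of samples (hypothetical layout).
    """
    windows = []
    for rec_id, signal in recordings.items():
        for start in range(0, len(signal) - window + 1, stride):
            windows.append((rec_id, signal[start:start + window]))
    return windows


def recording_level_split(windows, test_fraction=0.25, seed=0):
    """Split windows so that every recording lands entirely in train or in test."""
    rec_ids = sorted({rec_id for rec_id, _ in windows})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(rec_ids)
    n_test = max(1, round(len(rec_ids) * test_fraction))
    test_ids = set(rec_ids[:n_test])
    train = [w for w in windows if w[0] not in test_ids]
    test = [w for w in windows if w[0] in test_ids]
    return train, test


# Synthetic stand-in data: 8 recordings of 100 samples each (illustrative only).
recordings = {f"rec_{i}": [float(j) for j in range(100)] for i in range(8)}

# stride < window means neighbouring windows overlap, which is exactly the
# situation where a random window-level split would leak shared samples.
windows = make_windows(recordings, window=20, stride=10)
train, test = recording_level_split(windows, test_fraction=0.25, seed=0)

train_ids = {rec_id for rec_id, _ in train}
test_ids = {rec_id for rec_id, _ in test}
print("shared recordings:", train_ids & test_ids)
```

Shuffling recording IDs rather than windows is the design choice that matters here: a window-level shuffle would place windows sharing half of their samples in both partitions, which is the kind of leakage the abstract warns can inflate reported accuracy.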

How to Cite

Knap, P., Jachymczyk, U., & Lalik, K. (2026). Leakage-Safe, Reproducible Benchmarking for Vibration-Based Fault Diagnosis. PHM Society European Conference, 9(1), 1–8. https://doi.org/10.36001/phme.2026.v9i1.4924

Keywords

Fault diagnosis, Leakage-safe benchmarking, Cross-domain generalization, Vibration analysis, Predictive maintenance

Section
Technical Papers