AI for Sustainable Building Operations: Data-Driven Anomaly Detection in Ventilation Systems

Published Jul 3, 2026
Anahid Wachsenegger, Adam Buruzs, Anabel Dautović, Miloš Šipetić, Laura Bernadó, Pedro Casas

Abstract

Detecting deviations in building time series data is essential for robust heating, ventilation, and air conditioning (HVAC) operation and energy-efficient facility management. In practice, however, building management system (BMS) data are often incomplete and heterogeneous, and they lack reliable fault labels.
This paper presents a benchmarking and feasibility study of data-driven anomaly detection on multivariate air-handling unit (AHU) time series data under realistic deployment constraints. We construct a unified dataset and define a domain-informed rule-based baseline that serves as an interpretable operational reference and as a source of weak labels. We further evaluate classical unsupervised methods and representation-learning approaches using Temporal Convolutional Network (TCN) and Time Series Mixer (TSMixer) autoencoders, considering both a joint multivariate representation of all selected sensors and subsystem-based representations in which sensors are grouped by AHU function. Additionally, SHapley Additive exPlanations (SHAP)-based attribution is used to improve interpretability by identifying sensor-level contributions to detected deviations.
The results show that rule-based methods capture explicitly defined conditions, while data-driven approaches identify additional statistically unusual and temporally structured deviations: representation-learning models flag 1.1–1.4% of windows in the global setting and up to 4.7% in subsystem-based analyses. High-consensus events (~0.8%) occur during temporally localized episodes with agreement across multiple models, indicating robust, structured deviations. These detections represent candidate anomalies that require further validation.
Overall, combining rule-based, classical, and representation-learning methods provides complementary insights into AHU behavior and helps screen for relevant deviations in performance and energy use.
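
The flag rates and high-consensus events reported in the abstract can be illustrated with a minimal sketch. It assumes each detector emits one boolean flag per time window; the detector names, the toy flags, and the `min_votes` threshold are hypothetical and are not taken from the paper, which does not state its consensus criterion here.

```python
from typing import Dict, List


def flag_rates(flags: Dict[str, List[bool]]) -> Dict[str, float]:
    """Fraction of windows flagged by each detector."""
    return {name: sum(f) / len(f) for name, f in flags.items()}


def consensus_flags(flags: Dict[str, List[bool]], min_votes: int) -> List[bool]:
    """Mark a window as high-consensus when at least `min_votes` detectors flag it.

    `min_votes` is an illustrative assumption, not the paper's criterion.
    """
    per_window = zip(*flags.values())  # one tuple of detector votes per window
    return [sum(votes) >= min_votes for votes in per_window]


# Toy per-window flags for three hypothetical detectors (10 windows each).
flags = {
    "rules":            [False, False, True, False, False, False, True,  False, False, False],
    "isolation_forest": [False, False, True, False, False, True,  True,  False, False, False],
    "tcn_autoencoder":  [False, False, True, False, False, False, False, False, False, True],
}

rates = flag_rates(flags)
consensus = consensus_flags(flags, min_votes=2)
consensus_rate = sum(consensus) / len(consensus)
```

In this toy setup, windows flagged by only one detector remain candidate deviations, while windows that reach the vote threshold would be prioritized for expert review.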

How to Cite

Wachsenegger, A., Buruzs, A., Dautović, A., Šipetić, M., Bernadó, L., & Casas, P. (2026). AI for Sustainable Building Operations: Data-Driven Anomaly Detection in Ventilation Systems. PHM Society European Conference, 9(1), 1–14. https://doi.org/10.36001/phme.2026.v9i1.5012

Keywords

HVAC anomaly detection, multivariate time series, unsupervised learning, representation learning, energy-efficient building operation

Section
Technical Papers