Machine Learning Based Approach for EVAP System Early Anomaly Detection Using Connected Vehicle Data
From an automobile manufacturer's perspective, reducing warranty costs lowers expenditures and therefore raises profits, so it is crucial to leverage the available methods and tools to achieve this outcome. Connected vehicle data is one critical resource that can be a game changer, reducing warranty-related costs and improving business profitability. This project uses Mode06 connected vehicle data (test results reported by on-board diagnostics) along with contextual data to detect anomalies in the EVAP and purge monitors early. Early detection allows an issue to be fixed through software (SW) and/or hardware (HW) upgrades before it turns into a failure (preventive maintenance), which in turn improves system quality. Root cause analysis, which can be built on the anomaly detection outcomes and is outside the scope of this paper, allows HW- and/or SW-related issues to be diagnosed in a timely manner, so that system failures can be prepared for ahead of time. In this paper, statistics-based early anomaly detection models, based on vehicle data and fleet data, are developed. The proposed solution is a generic tool that makes no assumptions about the data distribution and can be adapted to other systems mainly by adjusting the data cleaning process. It also incorporates system-specific definitions of abnormal behavior, which makes it more accurate than conventional anomaly detection tools, whose performance is degraded mainly by imbalanced data and by the EVAP- and purge-specific definition of an anomaly. When deployed on field data, the algorithm outperformed popular anomaly detection techniques and showed that failures can be prevented by detecting anomalies several weeks/miles before the actual failure.
Early anomaly detection, connected vehicle data, customer experience, machine learning
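To make the idea of a distribution-free, fleet-referenced check more concrete, the following is a minimal sketch only. It assumes Tukey-style interquartile-range fences as a stand-in technique; the abstract does not specify the paper's actual models, and the function names, the fence multiplier k, and the sample readings are all hypothetical.

```python
from statistics import quantiles


def iqr_fences(values, k=1.5):
    """Return (lower, upper) Tukey fences for a list of numbers.

    Assumption: IQR fences are used here only as an illustrative
    distribution-free rule; they are not the paper's stated method.
    """
    # Quartiles of the reference (fleet) sample; no distribution is assumed.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_anomalies(fleet_values, vehicle_values, k=1.5):
    """Return the vehicle readings that fall outside the fleet fences."""
    lower, upper = iqr_fences(fleet_values, k)
    return [v for v in vehicle_values if v < lower or v > upper]


# Hypothetical monitor test results: a fleet baseline and one vehicle's readings.
fleet = [10, 11, 12, 13, 14, 15, 16, 17, 18]
vehicle = [13, 15, 21, 25, 5]
print(flag_anomalies(fleet, vehicle))
```

In this sketch a reading is flagged only relative to fleet behavior. A system-specific definition of abnormal behavior, as the abstract describes, would likely replace or tighten the generic fences.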