Deep Detector Health Management under Adversarial Campaigns



Javier Echauz, Keith Kenemer, Sarfaraz Hussein, Jay Dhaliwal, Saurabh Shintre, Slawomir Grzonkowski, Andrew Gardner


Machine learning models are vulnerable to adversarial inputs that induce seemingly unjustifiable errors. As automated classifiers are increasingly used in industrial control systems and machinery, these adversarial errors could grow to be a serious problem. Despite numerous studies over the past few years, the field of adversarial ML is still considered alchemy, with no practical unbroken defenses demonstrated to date, leaving PHM practitioners with few meaningful ways of addressing the problem. We introduce turbidity detection as a practical superset of the adversarial input detection problem, coping with adversarial campaigns rather than statistically invisible one-offs. This perspective is coupled with ROC-theoretic design guidance that prescribes an inexpensive domain adaptation layer at the output of a deep learning model during an attack campaign. The result aims to approximate the Bayes optimal mitigation that ameliorates the detection model’s degraded health. A proactively reactive type of prognostics is achieved via Monte Carlo simulation of various adversarial campaign scenarios, by sampling from the model’s own turbidity distribution to quickly deploy the correct mitigation during a real-world campaign.
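
As a rough illustration of what an output-level adaptation can look like, the sketch below applies a standard prior-shift correction to a binary classifier's posterior scores and then thresholds them with the usual cost-based Bayes rule. This is not the method of the paper, whose adaptation layer is not specified in this abstract; the function names and the `train_prior` and `campaign_prior` parameters are hypothetical, and the scores are made-up values.

```python
import numpy as np

def adjust_posterior(p, train_prior, campaign_prior):
    """Re-weight P(y=1|x) from the training prior to an assumed campaign prior.

    Hypothetical helper: a textbook prior-shift correction, used here only
    as a stand-in for an inexpensive adaptation layer at the model output.
    """
    p = np.asarray(p, dtype=float)
    pos = p * campaign_prior / train_prior
    neg = (1.0 - p) * (1.0 - campaign_prior) / (1.0 - train_prior)
    return pos / (pos + neg)

def bayes_threshold(cost_fp=1.0, cost_fn=1.0):
    """Posterior threshold that minimizes expected cost for a binary decision."""
    return cost_fp / (cost_fp + cost_fn)

# Illustrative scores from a detector trained with balanced classes (assumed).
scores = np.array([0.10, 0.50, 0.90])

# Assume the positive class becomes more prevalent during a campaign.
adjusted = adjust_posterior(scores, train_prior=0.5, campaign_prior=0.8)
decisions = adjusted >= bayes_threshold()
```

The correction only rescales the existing outputs, so the underlying model needs no retraining; which prior to assume during a campaign is exactly the kind of quantity that would have to be estimated or simulated ahead of time.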




Asset health management, binary classifier, deep convolution neural network, adversarial

Technical Papers