Interactive Anomaly Identification with Erroneous Feedback

Published Mar 24, 2021
Takaaki Tagawa, Yukihiro Tadokoro, Takehisa Yairi

Abstract

The difficulties in analyzing large and extensive systems necessitate the use of efficient machine-learning tools to identify unknown system anomalies in order to avoid critical problems and ensure high reliability. Given that data logged by a system include unknown anomalies, anomaly identification models aim to simultaneously identify the time of occurrence and the features that contributed to these anomalies. To maximize accuracy, it is important to utilize the data as well as the domain knowledge of the system. However, it is difficult for a system analyst to possess not only machine-learning capabilities but also domain knowledge to incorporate into the model. In this paper, we propose a new anomaly identification framework capable of utilizing feedback based on domain knowledge without requiring any machine-learning capabilities. We also propose a novel method, the so-called rank ensemble method, to improve the accuracy of anomaly identification with erroneous feedback, that is, feedback that includes incorrect information. Our method enables erroneous information to be adaptively ignored by assuming consistency between the data and the user feedback. An intensive parameter study using benchmark datasets and a case study with real vehicle data demonstrate the applicability of our framework.

Keywords

anomaly detection, erroneous feedback

Section
Technical Papers