Identifiers
Title variants
Publication languages
Abstracts
Anomaly detection (AD) plays a crucial role in time series applications, primarily because time series data is used across many real-world scenarios. Detecting anomalies poses significant challenges since anomalies take diverse forms, making them hard to pinpoint accurately. Previous research has explored different AD models, each making specific assumptions and showing varying sensitivity toward particular anomaly types. To address this issue, we propose a novel model selection approach for unsupervised AD that combines time series forest (TSF) and reinforcement learning (RL) to dynamically choose an AD technique. Our approach allows for effective AD without explicitly depending on ground truth labels, which are often scarce and expensive to obtain. Results from the real time series dataset demonstrate that the proposed model selection approach outperforms all other AD models in terms of the F1 score metric. For the synthetic dataset, our proposed model surpasses all other AD models except for KNN, with an impressive F1 score of 0.989. The proposed model selection framework also exceeded the performance of GPT-4 when prompted to act as an anomaly detector on the synthetic dataset. Exploring different reward functions revealed that the original reward function in our proposed AD model selection approach yielded the best overall scores. We evaluated the performance of the six AD models on three additional datasets containing global, local, and clustered anomalies, respectively, showing that each AD model exhibited distinct performance depending on the type of anomalies. This emphasizes the significance of our proposed AD model selection framework, which maintains high performance across all datasets and shows superior performance across different anomaly types.
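The abstract's observation that each detector performs differently depending on the anomaly type, and that choosing a detector per dataset helps, can be illustrated with a small toy sketch. This is not the authors' TSF/RL pipeline: the two detectors, their thresholds, and both series are hypothetical, and ground truth labels are used here only to compute the F1 score for the comparison.

```python
from statistics import mean, pstdev


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def zscore_detector(series, threshold=3.0):
    """Hypothetical global detector: flag points far from the overall mean."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0] * len(series)
    return [1 if abs(x - mu) / sigma > threshold else 0 for x in series]


def local_detector(series, window=5, threshold=8.0):
    """Hypothetical local detector: flag points far from the mean of the
    preceding `window` points (absolute difference above `threshold`)."""
    flags = [0]  # the first point has no history to compare against
    for i in range(1, len(series)):
        history = series[max(0, i - window):i]
        flags.append(1 if abs(series[i] - mean(history)) > threshold else 0)
    return flags


# Toy series 1: a flat signal with one large spike (global anomaly).
global_series = [10, 11, 9, 10] * 25
global_series[40] = 60
global_labels = [1 if i == 40 else 0 for i in range(100)]

# Toy series 2: a ramp with one point that is odd only relative to its
# neighbours and stays inside the overall value range (local anomaly).
local_series = list(range(100))
local_series[50] = 65
local_labels = [1 if i == 50 else 0 for i in range(100)]

datasets = {
    "global": (global_series, global_labels),
    "local": (local_series, local_labels),
}
detectors = {"zscore": zscore_detector, "local": local_detector}

# F1 score of every detector on every dataset.
scores = {
    (d_name, m_name): f1_score(labels, detect(series))
    for d_name, (series, labels) in datasets.items()
    for m_name, detect in detectors.items()
}

# Best detector per dataset (chosen with labels, for illustration only).
best = {
    d_name: max(detectors, key=lambda m: scores[(d_name, m)])
    for d_name in datasets
}

for key, value in sorted(scores.items()):
    print(key, round(value, 3))
print(best)
```

In this toy setting neither detector wins on both series, which is the situation a model selection step is meant to handle. The approach described in the abstract makes that choice without ground truth labels; the label-based selection above only shows why the choice matters.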
Publisher
Year
Volume
Pages
5–24
Physical description
Bibliography: 68 items, tables.
Authors
author
- Department of Electrical and Computer Engineering, American University of Beirut, Beirut, Lebanon
author
- Department of Electrical and Computer Engineering, American University of Beirut, Beirut, Lebanon
Notes
Record prepared with funds from MNiSW, agreement no. POPUL/SP/0154/2024/02, under the programme "Social Responsibility of Science II", module: Popularisation of Science (2025).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-3a30e51f-07b6-490e-9c11-1bd15520945b