SXAD: Shapely eXplainable AI-Based Anomaly Detection Using Log Data

Saved in:
Detailed bibliography
Title: SXAD: Shapely eXplainable AI-Based Anomaly Detection Using Log Data
Authors: Kashif Alam, Kashif Kifayat, Gabriel Avelino Sampedro, Vincent Karovič, Tariq Naeem
Source: IEEE Access, Vol. 12, pp. 95659-95672 (2024)
Publisher information: Institute of Electrical and Electronics Engineers (IEEE), 2024.
Publication year: 2024
Subjects: machine learning, Shapley additive explanation, Hadoop distributed file system, 0202 electrical engineering, electronic engineering, information engineering, Electrical engineering. Electronics. Nuclear engineering, 02 engineering and technology, Explainable artificial intelligence, TK1-9971
Description: Artificial Intelligence (AI) has made tremendous progress in anomaly detection. However, AI models work as black boxes, making it challenging to provide reasoning behind their judgments in Log Anomaly Detection (LAD). To address this, Explainable Artificial Intelligence (XAI) improves system log analysis: it follows a white-box model for the transparency, understandability, trustworthiness, and dependability of Machine Learning (ML) and Deep Learning (DL) models. In addition, Shapley Additive Explanations (SHAP), applied to system dynamics, support informed judgments and adaptable proactive methods for optimizing system functionality and reliability. Therefore, this paper proposes the Shapely eXplainable Anomaly Detection (SXAD) framework to identify the events (features) that affect a model's interpretability, trustworthiness, and explainability. The framework utilizes the Kernel SHAP approach, which is based on the Shapley value principle, providing an innovative approach to event selection and identifying the specific events causing abnormal behavior. This study addresses LAD by transforming it from a black-box model into a white-box one, leveraging XAI to make it transparent, interpretable, explainable, and dependable. It utilizes benchmark data from the Hadoop Distributed File System (HDFS), organized using the Drain parser, and employs several ML models: Decision Tree (DT), Random Forest (RF), and Gradient Boosting (GB). These models achieve accuracy rates of 99.99%, 99.85%, and 99.99%, respectively. Our contribution is novel because no earlier work in LAD has integrated XAI-SHAP.
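The Kernel SHAP approach mentioned in the abstract rests on the Shapley value principle: each event's attribution is its average marginal contribution to the model's anomaly score over all coalitions of the remaining events, with absent events replaced by a baseline value. A minimal, self-contained sketch of exact Shapley values over HDFS-style event counts (the event IDs, weights, and linear scorer are illustrative assumptions, not taken from the paper; Kernel SHAP approximates this same quantity by weighted sampling):

```python
from itertools import combinations
from math import factorial

def shapley_values(features, model, baseline):
    """Exact Shapley values: each feature's weighted average marginal
    contribution to the model output, over all subsets of the other
    features. Features outside a coalition take their baseline value."""
    names = list(features)
    n = len(names)
    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                present = set(subset)
                # Coalition weight |S|! (n - |S| - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = {g: features[g] if g in present or g == f
                          else baseline[g] for g in names}
                without_f = {g: features[g] if g in present
                             else baseline[g] for g in names}
                total += weight * (model(with_f) - model(without_f))
        phi[f] = total
    return phi

# Hypothetical anomaly scorer over per-block event counts
# (event names and weights are made up for illustration).
def anomaly_score(x):
    return (3.0 * x["E11_block_deleted"]
            + 1.0 * x["E5_receiving_block"]
            + 0.0 * x["E22_block_served"])

observed = {"E11_block_deleted": 4, "E5_receiving_block": 2,
            "E22_block_served": 7}
baseline = {k: 0 for k in observed}

phi = shapley_values(observed, anomaly_score, baseline)
print(phi)  # E11 dominates the attribution; E22 contributes nothing
```

By SHAP's efficiency property, the attributions sum exactly to `anomaly_score(observed) - anomaly_score(baseline)`, which is what makes a per-event breakdown of an anomaly score meaningful.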
Document type: Article
ISSN: 2169-3536
DOI: 10.1109/access.2024.3425472
Access URL: https://doaj.org/article/aa543fc4b5c64677bd47ed3e3fafc5c8
Rights: CC BY-NC-ND
Accession number: edsair.doi.dedup.....ead08e9ecb51cd26d15a3fd31e5d5a60
Database: OpenAIRE