SXAD: Shapely eXplainable AI-Based Anomaly Detection Using Log Data

Saved in:
Bibliographic Details
Title: SXAD: Shapely eXplainable AI-Based Anomaly Detection Using Log Data
Authors: Kashif Alam, Kashif Kifayat, Gabriel Avelino Sampedro, Vincent Karovič, Tariq Naeem
Source: IEEE Access, Vol 12, Pp 95659-95672 (2024)
Publisher Information: Institute of Electrical and Electronics Engineers (IEEE), 2024.
Publication Year: 2024
Subject Terms: machine learning, Shapley additive explanation, Hadoop distributed file system, 0202 electrical engineering, electronic engineering, information engineering, Electrical engineering. Electronics. Nuclear engineering, 02 engineering and technology, Explainable artificial intelligence, TK1-9971
Description: Artificial Intelligence (AI) has made tremendous progress in anomaly detection. However, AI models operate as black boxes, making it difficult to explain the reasoning behind their judgments in Log Anomaly Detection (LAD). Explainable Artificial Intelligence (XAI) addresses this by improving system log analysis: it follows a white-box approach to make Machine Learning (ML) and Deep Learning (DL) models transparent, understandable, trustworthy, and dependable. In addition, Shapley Additive Explanations (SHAP), applied to system dynamics, supports informed judgments and proactive methods for optimizing system functionality and reliability. This paper therefore proposes the Shapely eXplainable Anomaly Detection (SXAD) framework to identify the events (features) that affect a model's interpretability, trustworthiness, and explainability. The framework uses the Kernel SHAP approach, based on the Shapley-value principle, providing an innovative approach to event selection and to identifying the specific events that cause abnormal behavior. This study transforms LAD from a black-box model into a white-box one, leveraging XAI to make it transparent, interpretable, explainable, and dependable. It uses benchmark data from the Hadoop Distributed File System (HDFS), parsed with the Drain parser, and employs several ML models: Decision Tree (DT), Random Forest (RF), and Gradient Boosting (GB). These models achieve accuracy rates of 99.99%, 99.85%, and 99.99%, respectively. Our contribution is novel because no earlier work on LAD has integrated XAI-SHAP.
Document Type: Article
ISSN: 2169-3536
DOI: 10.1109/access.2024.3425472
Access URL: https://doaj.org/article/aa543fc4b5c64677bd47ed3e3fafc5c8
Rights: CC BY NC ND
Accession Number: edsair.doi.dedup.....ead08e9ecb51cd26d15a3fd31e5d5a60
Database: OpenAIRE
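The Shapley-value principle that the record's Kernel SHAP approach builds on can be illustrated with a minimal, self-contained sketch. This is not the paper's implementation: the event names and the toy "anomaly score" below are hypothetical, and exact enumeration over feature coalitions is shown in place of Kernel SHAP's weighted-regression approximation, which only becomes necessary when the number of events is large.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values: each feature's weighted average marginal
    contribution of adding it to every coalition of the other features."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for k in range(n):
            for S in combinations(others, k):
                # Coalition weight |S|! (n - |S| - 1)! / n!
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[f] += w * (value(set(S) | {f}) - value(set(S)))
    return phi

def score(present):
    """Toy anomaly score over hypothetical log-event indicators
    (stand-ins for parsed HDFS event templates, not real IDs)."""
    s = 0.0
    if "E11" in present:
        s += 0.6
    if "E26" in present:
        s += 0.3
    if "E11" in present and "E26" in present:
        s += 0.1  # interaction term shared between the two events
    return s

phi = shapley_values(["E11", "E26", "E5"], score)
print(phi)  # E11 and E26 carry the score; E5 is attributed zero
```

By the efficiency property, the attributions sum to `score(all events) - score(no events)`, which is how a SHAP-style explanation decomposes an anomaly prediction across individual log events.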