Pre-LogMGAE: Identification of Log Anomalies Using a Pre-Trained Masked Graph Autoencoder
| Published in: | Proceedings - Symposium on Reliable Distributed Systems, pp. 294 - 306 |
|---|---|
| Main Authors: | |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 30.09.2024 |
| ISSN: | 2575-8462 |
| Summary: | Log-based anomaly detection in software systems is becoming increasingly crucial for monitoring network operations and ensuring system security. Deep learning-based methods are widely used for large-scale log anomaly detection due to their capacity to learn complex features. However, current research predominantly treats original logs as simple sequences, ignoring their complex structure and dynamic dependency relationships. Additionally, these methods often rely on extensive labeled data or domain-specific vectors to represent logs for model training; such labels are labor-intensive to produce manually and transfer poorly across the different domains within a system. To address these challenges, this paper proposes Pre-LogMGAE, a universal masked graph autoencoder (GAE) framework with contrastive learning for self-supervised pre-training for log anomaly detection. In contrast to graph or link reconstruction, Pre-LogMGAE focuses on node feature reconstruction, using a masking strategy to reduce the impact of excessive redundant information. Furthermore, we combine Graph Attention Networks (GAT) with the Gated Recurrent Unit (GRU) to incorporate sequence modeling, capturing both long-term and short-term dependencies in log events. We include contrastive learning objectives during fine-tuning to extract diverse features and enhance the algorithm's robustness. Through an extensive evaluation on three real-world datasets and case studies involving configuration errors, Pre-LogMGAE outperforms six baselines (PCA, IM, DeepLog, LogRobust, LogBERT, and DeepTraLog) in precision, recall, F1 score, and time efficiency, highlighting its stability and reliability in anomaly detection. The study aims to improve anomaly detection capabilities for multi-source system logs, offering innovative technical support to enhance system security and reliability. |
|---|---|
| DOI: | 10.1109/SRDS64841.2024.00036 |
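The record contains no code. Purely as an illustration of the masked node-feature reconstruction idea described in the abstract, here is a minimal NumPy sketch: the graph, the masked node indices, the zero-vector mask token, and the mean-aggregation encoder are all placeholder assumptions; the actual Pre-LogMGAE encoder uses GAT combined with a GRU.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy log-event graph: 6 nodes (log events) on a ring, feature dim 4.
X = rng.normal(size=(6, 4))
A = np.zeros((6, 6))
for i in range(6):
    A[i, (i + 1) % 6] = A[(i + 1) % 6, i] = 1.0

# 1. Mask a subset of node features. Pre-LogMGAE reconstructs node
#    features rather than edges; the masked indices and the zero-vector
#    mask token here are arbitrary choices for the sketch.
mask = np.zeros(6, dtype=bool)
mask[[1, 3, 4]] = True
X_masked = X.copy()
X_masked[mask] = 0.0

# 2. One round of mean-neighbour aggregation as a stand-in encoder
#    (the paper uses GAT + GRU; plain averaging keeps the sketch short).
A_hat = A + np.eye(6)                      # add self-loops
D_inv = 1.0 / A_hat.sum(axis=1, keepdims=True)
H = D_inv * (A_hat @ X_masked)             # encoded node representations

# 3. Reconstruction loss computed only on the masked nodes, so the
#    model is trained to recover what was hidden from it.
loss = float(np.mean((H[mask] - X[mask]) ** 2))
print(loss)
```

In a real implementation the encoder output would pass through a learned decoder before the loss, and the loss would be minimized by gradient descent; the sketch only shows where the mask enters and which nodes contribute to the reconstruction objective.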