EagerLog: Active Learning Enhanced Retrieval Augmented Generation for Log-based Anomaly Detection

Bibliographic Details
Published in: Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) pp. 1-5
Main Authors: Duan, Chiming; Jia, Tong; Yang, Yong; Liu, Guiyang; Liu, Jinbu; Zhang, Huxing; Zhou, Qi; Li, Ying; Huang, Gang
Format: Conference Proceeding
Language: English
Published: IEEE 06.04.2025
ISSN:2379-190X
Description
Summary: Logs record essential information about system operations and are a critical source for anomaly detection, a topic of growing research interest. Using large language models (LLMs) within a retrieval-augmented generation (RAG) framework is an effective approach to log-based anomaly detection because of its strong generalization and efficient few-shot performance. However, the effectiveness of this approach hinges on the quality of the knowledge source, which can be degraded by noise and by changes within the software systems. To address these problems, we propose EagerLog, a novel log-based anomaly detection method that uses active learning to choose logs for humans to label and then adds the labeled logs to the knowledge source, thereby enhancing the knowledge source and maintaining its quality. Experiments on three open datasets (BGL, Thunderbird, Zookeeper) and one industrial dataset demonstrate that EagerLog can achieve a 93.65% F1 score with approximately 10 labeled log sequences, surpassing existing methods by 15.32%.
DOI:10.1109/ICASSP49660.2025.10888663
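
The summary describes a loop in which active learning picks logs for a human to label and the labeled logs are added to the RAG knowledge source. The record does not say how EagerLog scores or selects logs, so the following is only a minimal sketch of that general loop under stated assumptions: similarity is token-level Jaccard overlap, and a log is considered worth labeling when nothing in the knowledge source resembles it (a simple novelty heuristic, not the paper's criterion). All names here (`KnowledgeBase`, `uncertainty`, `select_for_labeling`, `build_prompt`) are hypothetical.

```python
from dataclasses import dataclass, field


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two log sequences."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class KnowledgeBase:
    """Hypothetical knowledge source: (log sequence, label) pairs."""
    entries: list = field(default_factory=list)

    def retrieve(self, query: str, k: int = 3) -> list:
        """Return up to k (similarity, log, label) tuples, most similar first."""
        scored = [(jaccard(query, log), log, label) for log, label in self.entries]
        return sorted(scored, key=lambda s: s[0], reverse=True)[:k]

    def add(self, log: str, label: str) -> None:
        self.entries.append((log, label))


def uncertainty(kb: KnowledgeBase, query: str) -> float:
    """Assumed heuristic: 1 minus the similarity of the closest known log."""
    hits = kb.retrieve(query, k=1)
    if not hits:
        return 1.0  # empty knowledge source: everything is unknown
    return 1.0 - hits[0][0]


def select_for_labeling(kb: KnowledgeBase, unlabeled: list, budget: int) -> list:
    """Pick the `budget` logs the knowledge source covers worst."""
    ranked = sorted(unlabeled, key=lambda log: uncertainty(kb, log), reverse=True)
    return ranked[:budget]


def build_prompt(kb: KnowledgeBase, query: str, k: int = 3) -> str:
    """Format retrieved examples as few-shot context for an LLM (no LLM call here)."""
    lines = [f"Log: {log}\nLabel: {label}" for _, log, label in kb.retrieve(query, k)]
    lines.append(f"Log: {query}\nLabel:")
    return "\n\n".join(lines)
```

In use, the logs returned by `select_for_labeling` would be shown to a human, and each answer stored with `kb.add(log, label)` so that later calls to `build_prompt` retrieve it as context.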