An online log template extraction method based on hierarchical clustering

Bibliographic Details
Published in: EURASIP Journal on Wireless Communications and Networking, Vol. 2019, no. 1, pp. 1-12
Main Authors: Yang, Ruipeng; Qu, Dan; Qian, Yekui; Dai, Yusheng; Zhu, Shaowei
Format: Journal Article
Language: English
Published: Cham: Springer International Publishing, 28.05.2019
Springer Nature B.V.
SpringerOpen
ISSN: 1687-1499, 1687-1472
Description
Summary: Raw log messages record rich dynamic information about running systems, networks, and applications, which makes them a good data source for anomaly detection. Log template extraction is an important prerequisite for log sequence anomaly detection. Most existing log template extraction methods are offline, and the few online methods achieve insufficient F1-scores on multi-source log data. To address these shortcomings, an online log template extraction method called LogOHC is proposed. First, the raw log messages are preprocessed, and distributed word representations (word2vec) are used to vectorize the log messages online. Then, an online hierarchical clustering algorithm is applied, and finally, log templates are generated. Experimental analysis shows that LogOHC achieves a higher F1-score than existing log template extraction methods, is suitable for multi-source log data sets, and has a shorter single-step execution time, which meets the requirements of online real-time processing.
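Illustration: The pipeline outlined in the summary (preprocess, vectorize, cluster online, generate templates) can be sketched roughly as follows. This is not the LogOHC implementation; the record gives no algorithmic detail, so every name and parameter here is hypothetical. Two simplifications are assumed: a plain bag-of-words count vector stands in for word2vec, and a flat nearest-centroid assignment with a cosine threshold stands in for the online hierarchical clustering. Only messages with the same token count are merged, so that templates can be aligned position by position.

```python
import math
import re
from collections import Counter


def preprocess(message):
    """Mask obvious variable fields (hex values, numbers, dotted numbers) and tokenize."""
    message = re.sub(r"\b0x[0-9a-fA-F]+\b", "<*>", message)
    message = re.sub(r"\b\d+(?:\.\d+)*\b", "<*>", message)
    return message.split()


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class OnlineTemplateExtractor:
    """Hypothetical single-pass extractor: one cluster per template."""

    def __init__(self, threshold=0.7):  # threshold is an assumed tuning knob
        self.threshold = threshold
        self.clusters = []  # each entry: {"centroid": Counter, "template": list}

    def add(self, message):
        """Assign one message to a cluster (creating it if needed); return its id."""
        tokens = preprocess(message)
        vector = Counter(tokens)  # stand-in for a word2vec-based message vector
        best_id, best_sim = None, self.threshold
        for cid, cluster in enumerate(self.clusters):
            if len(cluster["template"]) != len(tokens):
                continue  # positional template merging needs equal lengths
            sim = cosine(vector, cluster["centroid"])
            if sim >= best_sim:
                best_id, best_sim = cid, sim
        if best_id is None:
            self.clusters.append({"centroid": vector, "template": list(tokens)})
            return len(self.clusters) - 1
        cluster = self.clusters[best_id]
        cluster["centroid"].update(vector)  # running sum of member vectors
        # Positions where the new message disagrees become wildcards.
        cluster["template"] = [
            old if old == new else "<*>"
            for old, new in zip(cluster["template"], tokens)
        ]
        return best_id

    def templates(self):
        return [" ".join(c["template"]) for c in self.clusters]


extractor = OnlineTemplateExtractor(threshold=0.7)
for line in ["User alice logged in", "User bob logged in",
             "Disk sda1 is full", "User carol logged in"]:
    extractor.add(line)
print(extractor.templates())
# ['User <*> logged in', 'Disk sda1 is full']
```

Each message is processed once on arrival and only cluster summaries are kept, which is what makes a scheme of this kind usable online; the quality of the result depends heavily on the assumed threshold and on the vectorization chosen.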
DOI: 10.1186/s13638-019-1430-4