AutoLog: A Log Sequence Synthesis Framework for Anomaly Detection

The rapid progress of modern computing systems has led to a growing interest in informative run-time logs. Various log-based anomaly detection techniques have been proposed to ensure software reliability. However, their implementation in the industry has been limited due to the lack of high-quality...

Bibliographic details
Published in: IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 497-509
Main authors: Huo, Yintong; Li, Yichen; Su, Yuxin; He, Pinjia; Xie, Zifan; Lyu, Michael R.
Format: Conference proceeding
Language: English
Published: IEEE, 11 September 2023
ISSN: 2643-1572
Online access: Full text
Abstract The rapid progress of modern computing systems has led to a growing interest in informative run-time logs. Various log-based anomaly detection techniques have been proposed to ensure software reliability. However, their implementation in the industry has been limited due to the lack of high-quality public log resources as training datasets. While some log datasets are available for anomaly detection, they suffer from limitations in (1) comprehensiveness of log events; (2) scalability over diverse systems; and (3) flexibility of log utility. To address these limitations, we propose AUTOLOG, the first automated log generation methodology for anomaly detection. AUTOLOG uses program analysis to generate runtime log sequences without actually running the system. AUTOLOG starts with probing comprehensive logging statements associated with the call graphs of an application. Then, it constructs execution graphs for each method after pruning the call graphs to find log-related execution paths in a scalable manner. Finally, AUTOLOG propagates the anomaly label to each acquired execution path based on human knowledge. It generates flexible log sequences by walking along the log execution paths with controllable parameters. Experiments on 50 popular Java projects show that AUTOLOG acquires significantly more (9x-58x) log events than existing log datasets from the same system, and generates log messages much faster (15x) with a single machine than existing passive data collection approaches. AUTOLOG also provides hyper-parameters to adjust the data size, anomaly rate, and component indicator for simulating different real-world scenarios. We further demonstrate AUTOLOG's practicality by showing that AUTOLOG enables log-based anomaly detectors to achieve better performance (1.93%) compared to existing log datasets. We hope AUTOLOG can facilitate the benchmarking and adoption of automated log analysis techniques.
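The pipeline the abstract describes (find logging statements, prune the call graph to log-related methods, enumerate log execution paths, label them, then walk them with controllable parameters) can be illustrated with a small toy model. The following is only a minimal sketch under stated assumptions, not AUTOLOG's actual implementation: the `PROGRAM` structure, the function names, and the keyword-based labeling rule (a stand-in for the human knowledge the abstract mentions) are all hypothetical, the toy call graph is assumed to be acyclic, and the component-indicator parameter is not modeled.

```python
import random

# Hypothetical toy program: method -> list of branches; each branch is an
# ordered list of ("log", template) or ("call", callee) statements.
PROGRAM = {
    "main": [[("log", "Server starting"), ("call", "connect"),
              ("call", "util"), ("log", "Server stopped")]],
    "connect": [[("log", "Connected to db")],
                [("log", "ERROR connection refused")]],
    "util": [[("call", "helper")]],  # nothing below this method ever logs
    "helper": [[]],
}

# Stand-in for human labeling knowledge (an assumption of this sketch).
ANOMALY_KEYWORDS = ("ERROR",)


def log_related(program):
    """Return methods that log directly or through a callee (fixed point)."""
    related = {m for m, branches in program.items()
               if any(kind == "log" for b in branches for kind, _ in b)}
    changed = True
    while changed:
        changed = False
        for m, branches in program.items():
            calls = {v for b in branches for kind, v in b if kind == "call"}
            if m not in related and calls & related:
                related.add(m)
                changed = True
    return related


def log_paths(program, method, related):
    """Enumerate log-event sequences for a method (acyclic graphs only)."""
    paths = []
    for branch in program[method]:
        partial = [()]
        for kind, value in branch:
            if kind == "log":
                partial = [p + (value,) for p in partial]
            elif value in related:  # calls to pruned methods are skipped
                callee_paths = log_paths(program, value, related)
                partial = [p + c for p in partial for c in callee_paths]
        paths.extend(partial)
    return paths


def is_anomalous(path):
    """Label a path as anomalous if any event contains a keyword."""
    return any(k in event for event in path for k in ANOMALY_KEYWORDS)


def generate(paths, size, anomaly_rate, seed=0):
    """Sample `size` labeled sequences with the requested anomaly rate."""
    rng = random.Random(seed)
    anomalous = [p for p in paths if is_anomalous(p)]
    normal = [p for p in paths if not is_anomalous(p)]
    dataset = []
    for _ in range(size):
        want_anomaly = rng.random() < anomaly_rate
        pool = (anomalous if want_anomaly else normal) or paths
        path = rng.choice(pool)
        dataset.append((path, is_anomalous(path)))
    return dataset
```

In this toy model, `generate(log_paths(PROGRAM, "main", log_related(PROGRAM)), size=100, anomaly_rate=0.05)` would yield 100 labeled sequences, with the data size and anomaly rate playing the role of the hyper-parameters mentioned in the abstract. No code is executed to produce the logs; sequences come purely from the statically described paths.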
Author Li, Yichen
Su, Yuxin
Huo, Yintong
Xie, Zifan
Lyu, Michael R.
He, Pinjia
Author_xml – sequence: 1
  givenname: Yintong
  surname: Huo
  fullname: Huo, Yintong
  email: ythuo@cse.cuhk.edu.hk
  organization: The Chinese University of Hong Kong,Department of Computer Science and Engineering,Hong Kong,China
– sequence: 2
  givenname: Yichen
  surname: Li
  fullname: Li, Yichen
  email: ycli21@cse.cuhk.edu.hk
  organization: The Chinese University of Hong Kong,Department of Computer Science and Engineering,Hong Kong,China
– sequence: 3
  givenname: Yuxin
  surname: Su
  fullname: Su, Yuxin
  email: suyx35@mail.sysu.edu.cn
  organization: School of Software Engineering, Sun Yat-sen University,Zhuhai,China
– sequence: 4
  givenname: Pinjia
  surname: He
  fullname: He, Pinjia
  email: hepinjia@cuhk.edu.cn
  organization: School of Data Science, The Chinese University of Hong Kong,Shenzhen (CUHK Shenzhen),China
– sequence: 5
  givenname: Zifan
  surname: Xie
  fullname: Xie, Zifan
  email: xzff@hust.edu.cn
  organization: Huazhong University of Science and Technology,Wuhan,China
– sequence: 6
  givenname: Michael R.
  surname: Lyu
  fullname: Lyu, Michael R.
  email: lyu@cse.cuhk.edu.hk
  organization: The Chinese University of Hong Kong,Department of Computer Science and Engineering,Hong Kong,China
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ASE56229.2023.00133
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9798350329964
EISSN 2643-1572
EndPage 509
ExternalDocumentID 10298530
Genre orig-research
IEDL.DBID RIE
ISICitedReferencesCount 6
IngestDate Wed Aug 27 02:32:41 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 13
ParticipantIDs ieee_primary_10298530
PublicationCentury 2000
PublicationDate 2023-Sept.-11
PublicationDateYYYYMMDD 2023-09-11
PublicationDate_xml – month: 09
  year: 2023
  text: 2023-Sept.-11
  day: 11
PublicationDecade 2020
PublicationTitle IEEE/ACM International Conference on Automated Software Engineering : [proceedings]
PublicationTitleAbbrev ASE
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0051577
ssib057256115
Score 2.3242936
SourceID ieee
SourceType Publisher
StartPage 497
SubjectTerms Industries
Java
Legged locomotion
Runtime
Scalability
Software algorithms
Training
Title AutoLog: A Log Sequence Synthesis Framework for Anomaly Detection
URI https://ieeexplore.ieee.org/document/10298530
WOSCitedRecordID wos001103357200040
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE