AutoLog: A Log Sequence Synthesis Framework for Anomaly Detection
The rapid progress of modern computing systems has led to a growing interest in informative run-time logs. Various log-based anomaly detection techniques have been proposed to ensure software reliability. However, their implementation in the industry has been limited due to the lack of high-quality...
| Published in: | IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 497-509 |
|---|---|
| Main authors: | Huo, Yintong; Li, Yichen; Su, Yuxin; He, Pinjia; Xie, Zifan; Lyu, Michael R. |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 11 September 2023 |
| Subjects: | |
| ISSN: | 2643-1572 |
| Online access: | Full text |
| Abstract | The rapid progress of modern computing systems has led to a growing interest in informative run-time logs. Various log-based anomaly detection techniques have been proposed to ensure software reliability. However, their implementation in the industry has been limited due to the lack of high-quality public log resources as training datasets. While some log datasets are available for anomaly detection, they suffer from limitations in (1) comprehensiveness of log events; (2) scalability over diverse systems; and (3) flexibility of log utility. To address these limitations, we propose AUTOLOG, the first automated log generation methodology for anomaly detection. AUTOLOG uses program analysis to generate runtime log sequences without actually running the system. AUTOLOG starts with probing comprehensive logging statements associated with the call graphs of an application. Then, it constructs execution graphs for each method after pruning the call graphs to find log-related execution paths in a scalable manner. Finally, AUTOLOG propagates the anomaly label to each acquired execution path based on human knowledge. It generates flexible log sequences by walking along the log execution paths with controllable parameters. Experiments on 50 popular Java projects show that AUTOLOG acquires significantly more (9x-58x) log events than existing log datasets from the same system, and generates log messages much faster (15x) with a single machine than existing passive data collection approaches. AUTOLOG also provides hyper-parameters to adjust the data size, anomaly rate, and component indicator for simulating different real-world scenarios. We further demonstrate AUTOLOG's practicality by showing that AUTOLOG enables log-based anomaly detectors to achieve better performance (1.93%) compared to existing log datasets. We hope AUTOLOG can facilitate the benchmarking and adoption of automated log analysis techniques. |
|---|---|
| Author | Li, Yichen Su, Yuxin Huo, Yintong Xie, Zifan Lyu, Michael R. He, Pinjia |
| Author_xml | – sequence: 1 givenname: Yintong surname: Huo fullname: Huo, Yintong email: ythuo@cse.cuhk.edu.hk organization: The Chinese University of Hong Kong,Department of Computer Science and Engineering,Hong Kong,China – sequence: 2 givenname: Yichen surname: Li fullname: Li, Yichen email: ycli21@cse.cuhk.edu.hk organization: The Chinese University of Hong Kong,Department of Computer Science and Engineering,Hong Kong,China – sequence: 3 givenname: Yuxin surname: Su fullname: Su, Yuxin email: suyx35@mail.sysu.edu.cn organization: School of Software Engineering, Sun Yat-sen University,Zhuhai,China – sequence: 4 givenname: Pinjia surname: He fullname: He, Pinjia email: hepinjia@cuhk.edu.cn organization: School of Data Science, The Chinese University of Hong Kong,Shenzhen (CUHK Shenzhen),China – sequence: 5 givenname: Zifan surname: Xie fullname: Xie, Zifan email: xzff@hust.edu.cn organization: Huazhong University of Science and Technology,Wuhan,China – sequence: 6 givenname: Michael R. surname: Lyu fullname: Lyu, Michael R. email: lyu@cse.cuhk.edu.hk organization: The Chinese University of Hong Kong,Department of Computer Science and Engineering,Hong Kong,China |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/ASE56229.2023.00133 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9798350329964 |
| EISSN | 2643-1572 |
| EndPage | 509 |
| ExternalDocumentID | 10298530 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IF 6IH 6IK 6IL 6IM 6IN 6J9 AAJGR AAWTH ABLEC ACREN ADYOE ADZIZ AFYQB ALMA_UNASSIGNED_HOLDINGS AMTXH BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI M43 OCL RIE RIL |
| ID | FETCH-LOGICAL-a284t-3fef3dfe3ef9d0fc1d982baa26e62e08ca9b8023d1aac4ad73ad5402a64ec4513 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 6 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001103357200040&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Wed Aug 27 02:32:41 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a284t-3fef3dfe3ef9d0fc1d982baa26e62e08ca9b8023d1aac4ad73ad5402a64ec4513 |
| PageCount | 13 |
| ParticipantIDs | ieee_primary_10298530 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-Sept.-11 |
| PublicationDateYYYYMMDD | 2023-09-11 |
| PublicationDate_xml | – month: 09 year: 2023 text: 2023-Sept.-11 day: 11 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE/ACM International Conference on Automated Software Engineering : [proceedings] |
| PublicationTitleAbbrev | ASE |
| PublicationYear | 2023 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0051577 ssib057256115 |
| Score | 2.3242936 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 497 |
| SubjectTerms | Industries Java Legged locomotion Runtime Scalability Software algorithms Training |
| Title | AutoLog: A Log Sequence Synthesis Framework for Anomaly Detection |
| URI | https://ieeexplore.ieee.org/document/10298530 |
| WOSCitedRecordID | wos001103357200040&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| link | http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV09T8MwELWgYmAqH0V8ywNrIHZcO2GLoBVDVVUqoG7VxT6jSpCgJkXqv8dOk8LCwGTLk3XJ6Z3t994RcqOkEQCZDBRCGAhUMoiNDAMlMgs8wkjV8rHXkRqP49ksmTRi9VoLg4g1-Qxv_bR-yzeFXvmrMpfhPHHw4k7ou0rJjVir_Xn6yoE3Y9va1-G0Uo3NEAuTu3Q6cFDPvTaFe1NT5lvl_mqoUuPJsPvPnRyQ3o8yj062mHNIdjA_It22NQNtMvWYpOmqKkbF2z1NqRvotGFM0-k6dyVfuSjpsKVlUVe30jQvPuB9TR-xqslZeY-8DAfPD09B0y0hAAcxVRBZtJGxGKFNTGg1M0nMMwAuUXIMYw1J5t3eDAPQAoyKwLhyjYMUqEWfRSekkxc5nhIaau876JLVKhRCaFAxojXKa9dEjPyM9HxI5p8bQ4x5G43zP9YvyL6PuqdZMHZJOtVyhVdkT39Vi3J5XX_GbxUinMA |
| linkProvider | IEEE |
| linkToHtml | http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV3PS8MwFA6igp7mj4m_zcFrtUmzpvVWdGNiHYNN2W28JS8y0Fa2Tth_b9K104sHTwk5hSSP7yX5vu8Rci1DLQAmoScRfE-gDL1Ih74nxcQADzCQpXzsNZW9XjQaxf1KrF5qYRCxJJ_hjeuWf_k6Vwv3VGYjnMcWXuwNfaslBPdXcq36-LSkhW_G1tmvRWopK6Mh5se3yaBtwZ47dQp3tqbMFcv9VVKlRJRO459z2SPNH20e7a9RZ59sYHZAGnVxBlrF6iFJkkWRp_nbHU2obeig4kzTwTKzSd98OqedmphFbeZKkyz_gPclfcCipGdlTfLSaQ_vu15VL8EDCzKFFxg0gTYYoIm1bxTTccQnADzEkKMfKYgnzu9NMwAlQMsAtE3YOIQClWix4IhsZnmGx4T6yjkP2nA1EoUQCmSEaLR06jURIT8hTbck48-VJca4Xo3TP8avyE53-JyO08fe0xnZdTvgSBeMnZPNYrbAC7KtvorpfHZZbuk3bkegBw |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=IEEE%2FACM+International+Conference+on+Automated+Software+Engineering+%3A+%5Bproceedings%5D&rft.atitle=AutoLog%3A+A+Log+Sequence+Synthesis+Framework+for+Anomaly+Detection&rft.au=Huo%2C+Yintong&rft.au=Li%2C+Yichen&rft.au=Su%2C+Yuxin&rft.au=He%2C+Pinjia&rft.date=2023-09-11&rft.pub=IEEE&rft.eissn=2643-1572&rft.spage=497&rft.epage=509&rft_id=info:doi/10.1109%2FASE56229.2023.00133&rft.externalDocID=10298530 |
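
The abstract describes generating labeled log sequences by walking log-related execution paths under controllable parameters such as data size and anomaly rate. The following is only a minimal sketch of that idea, not AUTOLOG's implementation: the toy graph, the event names, the rule that a path is anomalous if it contains a flagged event, and the function names `enumerate_paths`, `label_paths`, and `generate_sequences` are all assumptions made for illustration. The real framework derives its graphs from static analysis of Java projects, which this sketch does not attempt.

```python
import random

# Hypothetical execution graph (an assumption, not taken from the paper):
# each node stands for a log event, edges point to possible next log events.
EXEC_GRAPH = {
    "start": ["open_conn"],
    "open_conn": ["send_data", "conn_timeout"],
    "send_data": ["close_conn"],
    "conn_timeout": ["retry_failed"],
    "close_conn": [],
    "retry_failed": [],
}

# Assumed stand-in for human knowledge: events that mark a path as anomalous.
ANOMALOUS_EVENTS = {"conn_timeout", "retry_failed"}


def enumerate_paths(graph, node, prefix=()):
    """Yield every root-to-leaf path of an acyclic graph as a tuple of events."""
    path = prefix + (node,)
    successors = graph[node]
    if not successors:
        yield path
        return
    for nxt in successors:
        yield from enumerate_paths(graph, nxt, path)


def label_paths(paths, anomalous_events):
    """Split paths into (normal, anomalous); a path is anomalous if it
    contains at least one flagged event."""
    normal, anomalous = [], []
    for p in paths:
        (anomalous if anomalous_events.intersection(p) else normal).append(p)
    return normal, anomalous


def generate_sequences(graph, root, anomalous_events, size, anomaly_rate, seed=0):
    """Sample `size` (sequence, label) pairs; round(size * anomaly_rate)
    of them are drawn from anomalous paths (label 1), the rest from normal
    paths (label 0)."""
    rng = random.Random(seed)
    normal, anomalous = label_paths(enumerate_paths(graph, root), anomalous_events)
    n_anom = round(size * anomaly_rate)
    data = [(list(rng.choice(anomalous)), 1) for _ in range(n_anom)]
    data += [(list(rng.choice(normal)), 0) for _ in range(size - n_anom)]
    rng.shuffle(data)
    return data


dataset = generate_sequences(EXEC_GRAPH, "start", ANOMALOUS_EVENTS,
                             size=10, anomaly_rate=0.2)
```

Here `size` and `anomaly_rate` play the role of the hyper-parameters mentioned in the abstract. Changing them alters the generated dataset without re-running any system, which is the property the abstract emphasizes.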