Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques
| Published in: | 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE) pp. 1095 - 1106 |
|---|---|
| Main Authors: | Khan, Zanis Ali; Shin, Donghwan; Bianculli, Domenico; Briand, Lionel |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | ACM, 01.05.2022 |
| ISSN: | 1558-1225 |
| Abstract | Log message template identification aims to convert raw logs containing free-formed log messages into structured logs to be processed by automated log-based analysis, such as anomaly detection and model inference. While many techniques have been proposed in the literature, only two recent studies provide a comprehensive evaluation and comparison of the techniques using an established benchmark composed of real-world logs. Nevertheless, we argue that both studies have the following issues: (1) they used different accuracy metrics without comparison between them, (2) some ground-truth (oracle) templates are incorrect, and (3) the accuracy evaluation results do not provide any information regarding incorrectly identified templates. In this paper, we address the above issues by providing three guidelines for assessing the accuracy of log template identification techniques: (1) use appropriate accuracy metrics, (2) perform oracle template correction, and (3) perform analysis of incorrect templates. We then assess the application of such guidelines through a comprehensive evaluation of 14 existing template identification techniques on the established benchmark logs. Results show very different insights than existing studies and in particular a much less optimistic outlook on existing techniques. |
|---|---|
| Author | Khan, Zanis Ali; Shin, Donghwan; Briand, Lionel; Bianculli, Domenico |
| Author_xml | 1. Khan, Zanis Ali (zanis-ali.khan@uni.lu), University of Luxembourg, Luxembourg; 2. Shin, Donghwan (donghwan.shin@uni.lu), University of Luxembourg, Luxembourg; 3. Bianculli, Domenico (domenico.bianculli@uni.lu), University of Luxembourg, Luxembourg; 4. Briand, Lionel (lionel.briand@uni.lu), University of Luxembourg, Luxembourg, and University of Ottawa, Ottawa, Canada |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1145/3510003.3510101 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Xplore Open Access Journals IEEE/IET Electronic Library IEEE Proceedings Order Plans (POP) 1998-present |
| Discipline | Computer Science |
| EISBN | 9781450392211 1450392210 |
| EISSN | 1558-1225 |
| EndPage | 1106 |
| ExternalDocumentID | 9793976 |
| Genre | orig-research |
| ISICitedReferencesCount | 46 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| OpenAccessLink | https://ieeexplore.ieee.org/document/9793976 |
| PageCount | 12 |
| PublicationDateYYYYMMDD | 2022-05-01 |
| PublicationTitle | 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE) |
| PublicationTitleAbbrev | ICSE |
| PublicationYear | 2022 |
| Publisher | ACM |
| StartPage | 1095 |
| SubjectTerms | Analytical models; Anomaly detection; Benchmark testing; Guidelines; logs; Measurement; metrics; Software engineering; template identification |
| Title | Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques |
| URI | https://ieeexplore.ieee.org/document/9793976 |