Adaptive Performance Anomaly Detection for Online Service Systems via Pattern Sketching
To ensure the performance of online service systems, their status is closely monitored with various software and system metrics. Performance anomalies represent the performance degradation issues (e.g., slow response) of the service systems. When performing anomaly detection over the metrics, existi...
| Published in: | 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), pp. 61-72 |
|---|---|
| Main authors: | Chen, Zhuangbin; Liu, Jinyang; Su, Yuxin; Zhang, Hongyu; Ling, Xiao; Lyu, Michael R. |
| Format: | Conference proceeding |
| Language: | English |
| Published: | ACM, 2022-05-01 |
| Subjects: | |
| ISSN: | 1558-1225 |
| Online access: | Full text |
| Abstract | To ensure the performance of online service systems, their status is closely monitored with various software and system metrics. Performance anomalies represent the performance degradation issues (e.g., slow response) of the service systems. When performing anomaly detection over the metrics, existing methods often lack the merit of interpretability, which is vital for engineers and analysts to take remediation actions. Moreover, they are unable to effectively accommodate the ever-changing services in an online fashion. To address these limitations, in this paper, we propose ADSketch, an interpretable and adaptive performance anomaly detection approach based on pattern sketching. ADSketch achieves interpretability by identifying groups of anomalous metric patterns, which represent particular types of performance issues. The underlying issues can then be immediately recognized if similar patterns emerge again. In addition, an adaptive learning algorithm is designed to embrace unprecedented patterns induced by service updates or user behavior changes. The proposed approach is evaluated with public data as well as industrial data collected from a representative online service system in Huawei Cloud. The experimental results show that ADSketch outperforms state-of-the-art approaches by a significant margin, and demonstrate the effectiveness of the online algorithm in new pattern discovery. Furthermore, our approach has been successfully deployed in industrial practice. |
|---|---|
| Author | Liu, Jinyang; Su, Yuxin; Ling, Xiao; Chen, Zhuangbin; Lyu, Michael R.; Zhang, Hongyu |
| Author_xml | 1. Chen, Zhuangbin (The Chinese University of Hong Kong, Hong Kong, China); 2. Liu, Jinyang (The Chinese University of Hong Kong, Hong Kong, China); 3. Su, Yuxin (School of Software Engineering, Sun Yat-sen University, Zhuhai, China; email: suyx35@mail.sysu.edu.cn); 4. Zhang, Hongyu (The University of Newcastle, NSW, Australia); 5. Ling, Xiao (Yongqiang Yang Huawei Cloud BU, Beijing, China); 6. Lyu, Michael R. (The Chinese University of Hong Kong, Hong Kong, China) |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1145/3510003.3510085 |
| Discipline | Computer Science |
| EISBN | 9781450392211 1450392210 |
| EISSN | 1558-1225 |
| EndPage | 72 |
| ExternalDocumentID | 9794065 |
| Genre | orig-research |
| GrantInformation_xml | Funder: Australian Research Council (ARC); grant IDs: DP200102940, DP220103044; funder ID: 10.13039/501100000923 |
| ISICitedReferencesCount | 31 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 12 |
| PublicationCentury | 2000 |
| PublicationDate | 2022-May |
| PublicationDateYYYYMMDD | 2022-05-01 |
| PublicationDate_xml | – month: 05 year: 2022 text: 2022-May |
| PublicationDecade | 2020 |
| PublicationTitle | 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE) |
| PublicationTitleAbbrev | ICSE |
| PublicationYear | 2022 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 61 |
| SubjectTerms | Adaptation models; Adaptive learning; Cloud computing; Measurement; online learning; performance anomaly detection; Production; Software; Software algorithms; Time series analysis |
| Title | Adaptive Performance Anomaly Detection for Online Service Systems via Pattern Sketching |
| URI | https://ieeexplore.ieee.org/document/9794065 |