Perfce: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis

Bibliographic details
Title: Perfce: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis
Authors: Ji, Zhenlan; Ma, Pingchuan; Wang, Shuai
Publisher information: Institute of Electrical and Electronics Engineers Inc.
Publication year: 2023
Collection: The Hong Kong University of Science and Technology: HKUST Institutional Repository
Subjects: Causality Analysis, Chaos Engineering, Performance Debugging
Description: Debugging performance anomalies in databases is challenging. Causal inference techniques enable qualitative and quantitative root cause analysis of performance downgrades. Nevertheless, causality analysis is challenging in practice, particularly due to limited observability. Recently, chaos engineering (CE) has been applied to test complex software systems. CE frameworks mutate chaos variables to inject catastrophic events (e.g., network slowdowns) to stress-test these software systems. The systems under chaos stress are then tested (e.g., via differential testing) to check whether they retain normal functionality, such as returning correct SQL query outputs even under stress. To date, CE has mainly been employed to aid software testing. This paper identifies a novel usage of CE: diagnosing performance anomalies in databases. Our framework, Perfce, has two phases: offline and online. The offline phase learns statistical models of a database using both passive observations and proactive chaos experiments. The online phase diagnoses the root cause of performance anomalies on the fly, from both qualitative and quantitative aspects. In evaluation, Perfce outperformed previous works on synthetic datasets and was highly accurate and moderately expensive when analyzing real-world (distributed) databases such as MySQL and TiDB.
Document type: conference object
Language: English
Relation: https://doi.org/10.1109/ASE56229.2023.00106; http://gateway.isiknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=LinksAMR&SrcApp=PARTNER_APP&DestLinkType=FullRecord&DestApp=WOS&KeyUT=001103357200116
DOI: 10.1109/ASE56229.2023.00106
Availability: http://repository.hkust.edu.hk/ir/Record/1783.1-135693
https://doi.org/10.1109/ASE56229.2023.00106
http://lbdiscover.ust.hk/uresolver?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rfr_id=info:sid/HKUST:SPI&rft.genre=article&rft.issn=&rft.volume=&rft.issue=&rft.date=2023&rft.spage=1454&rft.aulast=Ji&rft.aufirst=Zhenlan&rft.atitle=Perfce%3A+Performance+Debugging+on+Databases+with+Chaos+Engineering-Enhanced+Causality+Analysis&rft.title=Proceedings+-+2023+38th+IEEE%2FACM+International+Conference+on+Automated+Software+Engineering,+ASE+2023
http://gateway.isiknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=LinksAMR&SrcApp=PARTNER_APP&DestLinkType=FullRecord&DestApp=WOS&KeyUT=001103357200116
http://www.scopus.com/record/display.url?eid=2-s2.0-85179001772&origin=inward
Accession number: edsbas.B3283DF0
Database: BASE