Perfce: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis

Bibliographic Details
Published in: IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 1454-1466
Main Authors: Ji, Zhenlan; Ma, Pingchuan; Wang, Shuai
Format: Conference Proceeding
Language: English
Published: IEEE, 2023-09-11
ISSN: 2643-1572
Abstract: Debugging performance anomalies in databases is challenging. Causal inference techniques enable qualitative and quantitative root cause analysis of performance degradations. Nevertheless, causality analysis is difficult in practice, particularly because of limited observability. Recently, chaos engineering (CE) has been applied to test complex software systems. CE frameworks mutate chaos variables to inject catastrophic events (e.g., network slowdowns) that stress-test these systems. The systems under chaos stress are then tested (e.g., via differential testing) to check whether they retain normal functionality, such as returning correct SQL query outputs. To date, CE has mainly been employed to aid software testing. This paper identifies a novel use of CE: diagnosing performance anomalies in databases. Our framework, Perfce, has two phases: offline and online. The offline phase learns statistical models of a database from both passive observations and proactive chaos experiments. The online phase diagnoses the root cause of performance anomalies on the fly, both qualitatively and quantitatively. In evaluation, Perfce outperformed previous works on synthetic datasets and is highly accurate and moderately expensive when analyzing real-world (distributed) databases like MySQL and TiDB.
Author details:
Ji, Zhenlan (zjiae@cse.ust.hk), The Hong Kong University of Science and Technology
Ma, Pingchuan (pmaab@cse.ust.hk), The Hong Kong University of Science and Technology
Wang, Shuai (shuaiw@cse.ust.hk), The Hong Kong University of Science and Technology
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ASE56229.2023.00106
Discipline Computer Science
EISBN 9798350329964
EISSN 2643-1572
EndPage 1466
ExternalDocumentID 10298374
Genre orig-research
ISICitedReferencesCount 4
IsPeerReviewed false
IsScholarly true
Language English
PageCount 13
PublicationCentury 2000
PublicationDate 2023-Sept.-11
PublicationDateYYYYMMDD 2023-09-11
PublicationDecade 2020
PublicationTitle IEEE/ACM International Conference on Automated Software Engineering : [proceedings]
PublicationTitleAbbrev ASE
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 1454
SubjectTerms Causality Analysis
Chaos
Chaos Engineering
Debugging
Distributed databases
Observability
Performance Debugging
Root cause analysis
Software systems
Software testing
Title Perfce: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis
URI https://ieeexplore.ieee.org/document/10298374