Gamma: Revisiting Template-Based Automated Program Repair Via Mask Prediction

Bibliographic details
Published in: IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 535-547
Main authors: Zhang, Quanjun; Fang, Chunrong; Zhang, Tongke; Yu, Bowen; Sun, Weisong; Chen, Zhenyu
Format: Conference paper
Language: English
Published: IEEE, 11 September 2023
ISSN: 2643-1572
Abstract: Automated program repair (APR) aims to fix software bugs without manual debugging effort and plays a crucial role in software development and maintenance. Template-based APR has been widely investigated and has shown promising results. However, it is challenging for template-based APR to select the appropriate donor code, which is an important repair ingredient for generating candidate patches. Inappropriate donor code may cause plausible but incorrect patch generation even with correct fix patterns, limiting the repair performance. In this paper, we revisit template-based APR and propose Gamma, which directly leverages large pre-trained language models for donor code generation. Our main insight is that instead of retrieving donor code from the local buggy file, we can directly predict the correct code tokens from the context code snippets and repair patterns through a cloze task. Specifically, (1) Gamma revises a variety of fix templates from state-of-the-art template-based APR techniques (i.e., TBar) and transforms them into mask patterns. (2) Gamma adopts a pre-trained language model to predict the correct code for the masked code as a fill-in-the-blank task. Although our idea is general and can be built on various existing pre-trained language models, we have implemented Gamma as a practical APR tool based on the recent UniXcoder model. The experimental results demonstrate that Gamma correctly repairs 82 bugs on Defects4J-v1.2, a 20.59% (14 bugs) and 26.15% (17 bugs) improvement over the previous state-of-the-art template-based approach TBar and the learning-based approach Recoder, respectively. Furthermore, Gamma repairs 45 bugs and 22 bugs from the additional Defects4J-v2.0 and QuixBugs, indicating the generalizability of Gamma in addressing the dataset overfitting issue. We also show that adopting other pre-trained language models can provide substantial advancement, e.g., CodeBERT-based and ChatGPT-based Gamma fix 80 and 67 bugs on Defects4J-v1.2, respectively, indicating the scalability of Gamma. Overall, our study highlights the promising future of adopting pre-trained models to generate correct patches on top of fix patterns in practice.
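The two steps named in the abstract (turning a fix template into a mask pattern, then filling the mask as a cloze task) can be illustrated with a small sketch. This is not the authors' implementation: the function names `condition_mask_patterns`, `fill_masks`, and `toy_predict`, the three condition templates, and the `<mask>` token spelling are all assumptions chosen for illustration, and the stub predictor stands in for a real pre-trained model such as UniXcoder.

```python
import re
from typing import Callable, List

MASK = "<mask>"  # assumed mask token; the real token depends on the model


def condition_mask_patterns(buggy_line: str) -> List[str]:
    """Turn an `if (...)` line into mask patterns.

    The three patterns are hypothetical stand-ins for condition-mutating
    fix templates; they are not copied from TBar or Gamma.
    """
    match = re.match(r"^(\s*if\s*\()(.*)(\)\s*\{?\s*)$", buggy_line)
    if match is None:
        return []  # this sketch only handles single-line if statements
    head, cond, tail = match.groups()
    return [
        f"{head}{MASK}{tail}",            # replace the whole condition
        f"{head}{cond} && {MASK}{tail}",  # narrow the condition
        f"{head}{cond} || {MASK}{tail}",  # widen the condition
    ]


def fill_masks(patterns: List[str],
               predict: Callable[[str], List[str]]) -> List[str]:
    """Build candidate patches by filling each mask with predicted code."""
    candidates = []
    for pattern in patterns:
        for donor in predict(pattern):
            candidates.append(pattern.replace(MASK, donor, 1))
    return candidates


def toy_predict(pattern: str) -> List[str]:
    """Stub predictor (assumption): a real system would query a language
    model with the masked line plus its surrounding context."""
    return ["index >= 0"]


if __name__ == "__main__":
    patterns = condition_mask_patterns("    if (index < size) {")
    for candidate in fill_masks(patterns, toy_predict):
        print(candidate)
```

Each candidate produced this way would still have to be compiled and run against the project's test suite, since a filled-in mask can yield a plausible but incorrect patch.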
Author_xml – sequence: 1
  givenname: Quanjun
  surname: Zhang
  fullname: Zhang, Quanjun
  email: quanjun.zhang@smail.nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,China
– sequence: 2
  givenname: Chunrong
  surname: Fang
  fullname: Fang, Chunrong
  email: fangchunrong@nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,China
– sequence: 3
  givenname: Tongke
  surname: Zhang
  fullname: Zhang, Tongke
  email: 201250032@smail.nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,China
– sequence: 4
  givenname: Bowen
  surname: Yu
  fullname: Yu, Bowen
  email: 201250070@smail.nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,China
– sequence: 5
  givenname: Weisong
  surname: Sun
  fullname: Sun, Weisong
  email: weisongsun@smail.nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,China
– sequence: 6
  givenname: Zhenyu
  surname: Chen
  fullname: Chen, Zhenyu
  email: zychen@nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,China
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ASE56229.2023.00063
Discipline Computer Science
EISBN 9798350329964
EISSN 2643-1572
EndPage 547
ExternalDocumentID 10298335
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61932012,62141215
  funderid: 10.13039/501100001809
ISICitedReferencesCount 31
IsPeerReviewed false
IsScholarly true
Language English
PageCount 13
PublicationCentury 2000
PublicationDate 2023-Sept.-11
PublicationDateYYYYMMDD 2023-09-11
PublicationDate_xml – month: 09
  year: 2023
  text: 2023-Sept.-11
  day: 11
PublicationDecade 2020
PublicationTitle IEEE/ACM International Conference on Automated Software Engineering : [proceedings]
PublicationTitleAbbrev ASE
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 535
SubjectTerms Automated Program Repair
Codes
Computer bugs
Fix Pattern
LLM4SE
Maintenance engineering
Manuals
Predictive models
Pretrained Model
Scalability
Transforms
Title Gamma: Revisiting Template-Based Automated Program Repair Via Mask Prediction
URI https://ieeexplore.ieee.org/document/10298335