Gamma: Revisiting Template-Based Automated Program Repair Via Mask Prediction
| Published in: | IEEE/ACM International Conference on Automated Software Engineering : [proceedings] pp. 535-547 |
|---|---|
| Main Authors: | Zhang, Quanjun; Fang, Chunrong; Zhang, Tongke; Yu, Bowen; Sun, Weisong; Chen, Zhenyu |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 11.09.2023 |
| Subjects: | Automated Program Repair; Codes; Computer bugs; Fix Pattern; LLM4SE; Maintenance engineering; Manuals; Predictive models; Pretrained Model; Scalability; Transforms |
| ISSN: | 2643-1572 |
| Abstract | Automated program repair (APR) aims to fix software bugs without manual debugging effort and plays a crucial role in software development and maintenance. Template-based APR has been widely investigated and has shown promising results. However, it is challenging for template-based APR to select the appropriate donor code, which is an important repair ingredient for generating candidate patches. Inappropriate donor code may lead to plausible but incorrect patches even when the fix pattern is correct, limiting repair performance. In this paper, we revisit template-based APR and propose Gamma, which directly leverages large pre-trained language models for donor code generation. Our main insight is that instead of retrieving donor code from the local buggy file, we can directly predict the correct code tokens from the context code snippets and repair patterns through a cloze task. Specifically, (1) Gamma revises a variety of fix templates from state-of-the-art template-based APR techniques (i.e., TBar) and transforms them into mask patterns. (2) Gamma adopts a pre-trained language model to predict the correct code for the masked code as a fill-in-the-blank task. Although our idea is general and can be built on various existing pre-trained language models, we have implemented Gamma as a practical APR tool based on the recent UniXcoder model. The experimental results demonstrate that Gamma correctly repairs 82 bugs on Defects4J-v1.2, an improvement of 20.59% (14 bugs) over the previous state-of-the-art template-based approach TBar and of 26.15% (17 bugs) over the learning-based approach Recoder. Furthermore, Gamma repairs 45 bugs and 22 bugs from the additional Defects4J-v2.0 and QuixBugs, respectively, indicating the generalizability of Gamma in addressing the dataset overfitting issue. We also show that adopting other pre-trained language models can provide substantial advancement, e.g., CodeBERT-based and ChatGPT-based Gamma are able to fix 80 and 67 bugs on Defects4J-v1.2, respectively, indicating the scalability of Gamma. Overall, our study highlights the promising future of adopting pre-trained models to generate correct patches on top of fix patterns in practice. |
|---|---|
| Author | Zhang, Tongke; Sun, Weisong; Yu, Bowen; Zhang, Quanjun; Chen, Zhenyu; Fang, Chunrong |
| Author_xml | – sequence: 1 givenname: Quanjun surname: Zhang fullname: Zhang, Quanjun email: quanjun.zhang@smail.nju.edu.cn organization: Nanjing University,State Key Laboratory for Novel Software Technology,China – sequence: 2 givenname: Chunrong surname: Fang fullname: Fang, Chunrong email: fangchunrong@nju.edu.cn organization: Nanjing University,State Key Laboratory for Novel Software Technology,China – sequence: 3 givenname: Tongke surname: Zhang fullname: Zhang, Tongke email: 201250032@smail.nju.edu.cn organization: Nanjing University,State Key Laboratory for Novel Software Technology,China – sequence: 4 givenname: Bowen surname: Yu fullname: Yu, Bowen email: 201250070@smail.nju.edu.cn organization: Nanjing University,State Key Laboratory for Novel Software Technology,China – sequence: 5 givenname: Weisong surname: Sun fullname: Sun, Weisong email: weisongsun@smail.nju.edu.cn organization: Nanjing University,State Key Laboratory for Novel Software Technology,China – sequence: 6 givenname: Zhenyu surname: Chen fullname: Chen, Zhenyu email: zychen@nju.edu.cn organization: Nanjing University,State Key Laboratory for Novel Software Technology,China |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/ASE56229.2023.00063 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE/IET Electronic Library (IEL) (UW System Shared) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9798350329964 |
| EISSN | 2643-1572 |
| EndPage | 547 |
| ExternalDocumentID | 10298335 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Natural Science Foundation of China grantid: 61932012,62141215 funderid: 10.13039/501100001809 |
| ISICitedReferencesCount | 31 |
| IngestDate | Wed Aug 27 02:32:41 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| PageCount | 13 |
| ParticipantIDs | ieee_primary_10298335 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-Sept.-11 |
| PublicationDateYYYYMMDD | 2023-09-11 |
| PublicationDate_xml | – month: 09 year: 2023 text: 2023-Sept.-11 day: 11 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE/ACM International Conference on Automated Software Engineering : [proceedings] |
| PublicationTitleAbbrev | ASE |
| PublicationYear | 2023 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 535 |
| SubjectTerms | Automated Program Repair; Codes; Computer bugs; Fix Pattern; LLM4SE; Maintenance engineering; Manuals; Predictive models; Pretrained Model; Scalability; Transforms |
| Title | Gamma: Revisiting Template-Based Automated Program Repair Via Mask Prediction |
| URI | https://ieeexplore.ieee.org/document/10298335 |
| hasFullText | 1 |
| inHoldings | 1 |
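
The abstract describes two steps: turning a fix template into a mask pattern, then letting a pre-trained model fill the mask as a cloze task. The following is a minimal sketch of that idea, not Gamma's actual implementation. The `<mask>` token, the function names `mask_condition` and `generate_patches`, the single "replace the if-condition" template, and the hard-coded `toy_fill_mask` predictor are all assumptions made for illustration; a real setup would plug in a model such as UniXcoder or CodeBERT in place of the toy predictor.

```python
import re
from typing import Callable, List

MASK = "<mask>"  # assumed mask token; the real token depends on the model used


def mask_condition(buggy_line: str) -> str:
    """Hypothetical mask pattern: replace the condition of an if-statement
    with a single mask, keeping the surrounding syntax intact."""
    match = re.match(r"^(\s*if\s*\()(.*)(\)\s*\{?\s*)$", buggy_line)
    if match is None:
        raise ValueError("template does not apply to this line")
    prefix, _old_condition, suffix = match.groups()
    return f"{prefix}{MASK}{suffix}"


def generate_patches(masked_line: str,
                     fill_mask: Callable[[str], List[str]]) -> List[str]:
    """Query a pluggable predictor for mask fillers and build one
    candidate patch per filler, preserving the predictor's ranking."""
    return [masked_line.replace(MASK, filler, 1)
            for filler in fill_mask(masked_line)]


def toy_fill_mask(masked_line: str) -> List[str]:
    """Stand-in for a pre-trained language model; returns fixed candidates."""
    return ["index >= 0 && index < size", "index < size"]


buggy = "    if (index <= size) {"
masked = mask_condition(buggy)
patches = generate_patches(masked, toy_fill_mask)

print(masked)
for patch in patches:
    print(patch)
```

Keeping the predictor behind a plain callable mirrors the abstract's point that the approach can sit on top of different pre-trained models. In a typical generate-and-validate repair loop, each candidate would then presumably be compiled and run against the project's test suite before being reported as a plausible patch.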