CodeTranFix: A Neural Machine Translation Approach for Context-Aware Java Program Repair with CodeBERT

Bibliographic Details
Title: CodeTranFix: A Neural Machine Translation Approach for Context-Aware Java Program Repair with CodeBERT
Authors: Yiwei Lu, Shuxia Ye, Liang Qi
Source: Applied Sciences, Vol 15, Iss 7, p 3632 (2025)
Publisher information: MDPI AG, 2025.
Publication year: 2025
Keywords: automated program repair (APR), neural machine translation (NMT), context-aware patch generation
Subject classifications: Technology; Engineering (General). Civil engineering (General) (TA1-2040); Physics (QC1-999); Chemistry (QD1-999); Biology (General) (QH301-705.5)
Description: Automated program repair (APR) plays a vital role in enhancing software quality and reducing developer maintenance effort. Neural Machine Translation (NMT)-based methods demonstrate notable potential by learning translation patterns from bug-fix code pairs. However, traditional approaches are constrained by limited model capacity and training-data scale, leading to performance bottlenecks when generalizing to unseen defect patterns. In this paper, we propose CodeTransFix, a novel APR approach that combines NMT methods with large language models of code (LLMCs) such as CodeBERT. CodeTransFix learns contextual embeddings of bug-related code through CodeBERT and integrates these representations as supplementary inputs to the Transformer model, enabling context-aware patch generation. Repair performance is evaluated on the widely used Defects4J v1.2 benchmark. Our experimental results showed that CodeTransFix achieved a 54.1% performance improvement over the best NMT-based baseline model and a 23.3% improvement over the best bug-fixing LLMC. In addition, CodeTransFix outperformed existing APR methods in the Defects4J v2.0 generalization test.
Publication type: Article
Language: English
ISSN: 2076-3417
DOI: 10.3390/app15073632
Access URL: https://doaj.org/article/16b1494334b748a0b2e0208d5ddee76e
Rights: CC BY
Document code: edsair.doi.dedup.....9e6d4e61775576d1f5407589d84dfd40
Database: OpenAIRE
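The abstract describes feeding CodeBERT contextual embeddings of bug-related code into a Transformer as supplementary input for patch generation. As a rough illustration only (this record does not specify the paper's actual fusion mechanism), the sketch below assumes a simple scheme: each buggy-line token embedding is concatenated with a mean-pooled context vector. The function names and toy dimensions are invented for illustration, not taken from the paper.

```python
# Hypothetical sketch of "supplementary context input": append a pooled
# context embedding to every buggy-line token embedding before the
# Transformer consumes the sequence. Plain Python lists stand in for
# real embedding tensors.

def mean_pool(vectors):
    """Average a list of equal-length embedding vectors into one vector."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

def fuse_embeddings(token_embs, context_embs):
    """Concatenate a mean-pooled context vector onto each token embedding,
    yielding a context-aware input sequence for the repair model."""
    ctx = mean_pool(context_embs)
    return [tok + ctx for tok in token_embs]

# Toy 2-dimensional vectors standing in for CodeBERT outputs.
buggy_tokens = [[0.1, 0.2], [0.3, 0.4]]   # embeddings of the buggy line
context = [[1.0, 0.0], [0.0, 1.0]]        # embeddings of surrounding code
fused = fuse_embeddings(buggy_tokens, context)
# Each fused vector is token_dim + context_dim wide.
```

In the actual pipeline, the context vectors would be CodeBERT hidden states (768-dimensional for the base model) rather than these toy lists, and the fused sequence would feed the NMT Transformer's encoder.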