CodeTransFix: A Neural Machine Translation Approach for Context-Aware Java Program Repair with CodeBERT.

Bibliographic Details
Title: CodeTransFix: A Neural Machine Translation Approach for Context-Aware Java Program Repair with CodeBERT.
Authors: Lu, Yiwei, Ye, Shuxia, Qi, Liang
Source: Applied Sciences (2076-3417); Apr 2025, Vol. 15, Issue 7, p. 3632, 14 p.
Subject Terms: LANGUAGE models, COMPUTER software quality control, CONTEXTUAL learning, BOTTLENECKS (Manufacturing), GENERALIZATION
Abstract: Automated program repair (APR) plays a vital role in enhancing software quality and reducing developer maintenance effort. Neural Machine Translation (NMT)-based methods show notable potential by learning translation patterns from bug-fix code pairs. However, traditional approaches are constrained by limited model capacity and training data scale, leading to performance bottlenecks when generalizing to unseen defect patterns. In this paper, we propose CodeTransFix, a novel APR approach that synergistically combines NMT methods with large language models of code (LLMCs) such as CodeBERT. CodeTransFix learns contextual embeddings of bug-related code through CodeBERT and integrates these representations as supplementary inputs to the Transformer model, enabling context-aware patch generation. Repair performance is evaluated on the widely used Defects4J v1.2 benchmark. Our experimental results show that CodeTransFix achieved a 54.1% performance improvement over the best NMT-based baseline model and a 23.3% improvement over the best bug-fixing LLMCs. In addition, CodeTransFix outperformed existing APR methods in the Defects4J v2.0 generalization test. [ABSTRACT FROM AUTHOR]
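The abstract describes feeding CodeBERT contextual embeddings into a Transformer as supplementary inputs alongside the usual token embeddings. A minimal NumPy sketch of one plausible fusion step is shown below; the dimensions, the linear projection, and the additive-fusion choice are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

# Hypothetical dimensions (assumptions, not from the paper):
# CodeBERT hidden states are 768-dimensional; the downstream
# Transformer here is assumed to use a 256-dimensional model.
VOCAB, D_CTX, D_MODEL = 1000, 768, 256

rng = np.random.default_rng(0)
tok_emb = rng.normal(size=(VOCAB, D_MODEL))  # learned token embedding table
W_ctx = rng.normal(size=(D_CTX, D_MODEL))    # projects CodeBERT vectors

def fuse_inputs(token_ids, codebert_states):
    """Combine token embeddings with per-token CodeBERT hidden states.

    token_ids:       (seq_len,) int ids of the buggy code tokens
    codebert_states: (seq_len, D_CTX) contextual embeddings from CodeBERT
    returns:         (seq_len, D_MODEL) fused encoder input
    """
    base = tok_emb[token_ids]       # (seq_len, D_MODEL) token embeddings
    ctx = codebert_states @ W_ctx   # project 768 -> D_MODEL
    return base + ctx               # additive fusion of the two signals

ids = np.array([5, 42, 7])
states = rng.normal(size=(3, D_CTX))
fused = fuse_inputs(ids, states)
print(fused.shape)  # (3, 256)
```

The fused sequence would then be consumed by a standard Transformer encoder-decoder to generate the candidate patch; the paper itself should be consulted for the actual integration mechanism.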
Copyright of Applied Sciences (2076-3417) is the property of MDPI and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
ISSN: 2076-3417
DOI: 10.3390/app15073632