DALO-APR: LLM-based automatic program repair with data augmentation and loss function optimization

Bibliographic Details
Title: DALO-APR: LLM-based automatic program repair with data augmentation and loss function optimization
Authors: Wang, Shaosheng; Lu, Lu (lul@scut.edu.cn); Qiu, Shaojian; Tian, Qingyan; Lin, Haishan
Source: Journal of Supercomputing. Apr 2025, Vol. 81, Issue 5, pp. 1-30 (30 pages).
Abstract: Automatic program repair (APR) has made significant strides with the advent of large language models (LLMs) such as T5 and CodeT5. However, LLM-based APR models may rely on repetitive repair patterns due to limited training data diversity, resulting in suboptimal performance. Additionally, common loss functions, such as cross-entropy, may not fully prioritize repair locations or optimize the model’s output probability distribution to favor more accurate repair candidates. To address these challenges, this paper proposes a method for LLM-Based APR with Data Augmentation and Loss Function Optimization (DALO-APR). The data augmentation strategy expands the variety of repair patterns by randomly deleting, inserting, swapping tokens, and injecting errors. The optimized loss function helps the model rank more accurate repair candidates higher. Experimental results on Java, JavaScript, Python, and C datasets demonstrate that DALO-APR improves both error localization and bug fixing. Compared to baseline models, DALO-APR shows improvements across multiple metrics, especially with a 105.65% increase in 100% accuracy. [ABSTRACT FROM AUTHOR]
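Illustration: the abstract names four augmentation operations (random token deletion, insertion, swapping, and error injection) without giving details. The sketch below shows one plausible token-level reading of them and is not the authors' implementation. The function names `augment_tokens` and `inject_error`, the probability parameters, and the `OPERATOR_MUTATIONS` table are all hypothetical, chosen only for illustration.

```python
import random

# Hypothetical operator flips used to simulate an injected bug.
# The paper's actual error-injection rules are not given in the abstract.
OPERATOR_MUTATIONS = {
    "==": "!=", "!=": "==",
    "<": "<=", "<=": "<",
    ">": ">=", ">=": ">",
    "+": "-", "-": "+",
}


def augment_tokens(tokens, rng, p_delete=0.1, p_insert=0.1, p_swap=0.1, vocab=None):
    """Return a perturbed copy of `tokens` (random delete, insert, adjacent swap)."""
    # Assumption: inserted tokens are drawn from `vocab`, or from the input itself.
    vocab = vocab or list(tokens)
    out = []
    for tok in tokens:
        if rng.random() < p_delete:
            continue  # drop this token
        out.append(tok)
        if rng.random() < p_insert:
            out.append(rng.choice(vocab))  # insert a random token after it
    for i in range(len(out) - 1):
        if rng.random() < p_swap:
            out[i], out[i + 1] = out[i + 1], out[i]  # swap neighbours
    return out


def inject_error(tokens, rng):
    """Flip one randomly chosen operator token, if any is present."""
    candidates = [i for i, tok in enumerate(tokens) if tok in OPERATOR_MUTATIONS]
    out = list(tokens)
    if candidates:
        i = rng.choice(candidates)
        out[i] = OPERATOR_MUTATIONS[out[i]]
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    code = ["if", "(", "a", "==", "b", ")", "return", "a", "+", "b", ";"]
    print(augment_tokens(code, rng))
    print(inject_error(code, rng))
```

In a setup like this, the perturbed sequence would serve as the buggy input and the original sequence as the repair target, so each clean snippet yields several training pairs with differing edit patterns.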
Database: Academic Search Index