Enhancing Code Language Models for Program Repair by Curricular Fine-tuning Framework
| Published in: | Proceedings - Conference on Software Maintenance (1987) pp. 136 - 146 |
|---|---|
| Main authors: | , , , |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 01.10.2023 |
| ISSN: | 2576-3148 |
| Summary: | Automated program repair (APR) is a key technique for enhancing software maintenance productivity by fixing buggy code automatically. Recently, large code language models (CLMs) have exhibited impressive capabilities in code generation. However, for complex programming tasks, especially program repair, the success rate of CLMs is still low. One reason is that CLMs are typically developed for general purposes, and their potential for APR applications has yet to be fully explored. In this paper, we propose APRFiT, a general curricular fine-tuning framework that improves the success rate of CLMs for APR. First, APRFiT automatically generates syntactically diverse but semantically equivalent bug-fixing programs via code augmentation operators to enrich the diversity of the bug-fixing dataset. Second, APRFiT designs a curriculum learning-based mechanism that helps CLMs develop a deep understanding of program semantics from these augmented bug-fixing code variants and improves the effectiveness of fine-tuning for APR tasks. We implement APRFiT on different CLMs and evaluate them on the Bugs2Fix small and medium datasets. Extensive experiments demonstrate that existing CLMs implemented with APRFiT substantially outperform the original models and generate 2.5 to 14.5 percent more correct patches than the baselines, both effectively and efficiently. |
| DOI: | 10.1109/ICSME58846.2023.00024 |
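
The two ideas named in the summary, semantics-preserving augmentation and an easy-to-hard training order, can be illustrated with a minimal Python sketch. This is not APRFiT's implementation: the record does not describe the actual operators or curriculum, so `BugFixPair`, `rename_identifier`, `curriculum_stages`, and the difficulty proxy (number of operators applied) are all hypothetical names and assumptions chosen for illustration. The regex-based renaming is deliberately naive and would also rewrite matches inside string literals or comments.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class BugFixPair:
    """One training example: a buggy program and its fixed version."""
    buggy: str
    fixed: str
    # Assumed difficulty proxy: how many augmentation operators were applied.
    difficulty: int = 0


def rename_identifier(pair: BugFixPair, old: str, new: str) -> BugFixPair:
    """Hypothetical operator: rename one identifier in both programs.

    The same whole-word substitution is applied to the buggy and the fixed
    code, so the variant differs syntactically but encodes the same fix.
    """
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    return BugFixPair(
        buggy=pattern.sub(new, pair.buggy),
        fixed=pattern.sub(new, pair.fixed),
        difficulty=pair.difficulty + 1,
    )


def curriculum_stages(pairs, num_stages):
    """Sort pairs from easy to hard and split them into cumulative stages.

    Stage k contains the easiest k/num_stages share of the data, so later
    stages revisit easy examples while adding harder ones.
    """
    ordered = sorted(pairs, key=lambda p: p.difficulty)
    stages = []
    for k in range(1, num_stages + 1):
        cutoff = len(ordered) * k // num_stages
        stages.append(ordered[:cutoff])
    return stages


original = BugFixPair(
    buggy="int sum(int a, int b) { return a - b; }",
    fixed="int sum(int a, int b) { return a + b; }",
)
variant = rename_identifier(original, "a", "x")
variant2 = rename_identifier(variant, "b", "y")

# A fine-tuning loop would iterate over the stages in order.
stages = curriculum_stages([variant2, original, variant], num_stages=3)
```

Cumulative stages are only one possible schedule; a real curriculum could just as well use disjoint buckets or a learned difficulty score.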