Enhancing Code Language Models for Program Repair by Curricular Fine-tuning Framework

Bibliographic details
Published in:	Proceedings - Conference on Software Maintenance (1987), pp. 136-146
Main authors:	Hao, Sichong; Shi, Xianjun; Liu, Hongwei; Shu, Yanjun
Format:	Conference paper
Language:	English
Published:	IEEE 01.10.2023
Subjects:	Codes; Computer architecture; Computer bugs; Curriculum Learning; Large Language Models of Code; Productivity; Program Repair; Semantics; Software maintenance; Training
ISSN:	2576-3148

Description
Summary:	Automated program repair (APR) is a key technique for enhancing software maintenance productivity by fixing buggy code automatically. Recently, large code language models (CLMs) have exhibited impressive capabilities in code generation. However, for complex programming tasks, especially program repair, the success rate of CLMs is still low. One of the reasons is that CLMs are typically developed for general purposes and their potential for APR applications has yet to be fully explored. In this paper, we propose APRFiT, a general curricular fine-tuning framework that improves the success rate of CLMs for APR. Firstly, APRFiT generates syntactically diverse but semantically equivalent bug-fixing programs via code augmentation operators to automatically enrich the diversity of the bug-fixing dataset. Secondly, APRFiT designs a curriculum learning-based mechanism to help CLMs develop a deep understanding of program semantics from these augmented bug-fixing code variants and improve the effectiveness of fine-tuning for APR tasks. We implement APRFiT on different CLMs and evaluate them on the Bugs2Fix small and medium datasets. The extensive experiments demonstrate that existing CLMs implemented with APRFiT substantially outperform the original models and generate 2.5 to 14.5 percent more correct patches than the baselines, both effectively and efficiently.
DOI:	10.1109/ICSME58846.2023.00024
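
Illustrative sketch (not part of the catalog record or the paper): the summary names two steps, semantics-preserving code augmentation and a curriculum over the augmented bug-fix pairs, without giving details. The Python below is a minimal sketch of that general idea under loudly stated assumptions. The record does not say which augmentation operators APRFiT uses or how it measures difficulty, so whole-word identifier renaming and a token-count difficulty proxy are stand-ins chosen here, and every name (rename_identifier, augment_pair, token_count, curriculum_stages) is hypothetical.

```python
import math
import re
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (buggy code, fixed code)


def rename_identifier(code: str, old: str, new: str) -> str:
    """Rename whole-word occurrences of `old` to `new`.

    ASSUMPTION: renaming stands in for an augmentation operator. It only
    preserves semantics if `new` is not already used and `old` does not
    appear inside string literals or comments.
    """
    return re.sub(rf"\b{re.escape(old)}\b", new, code)


def augment_pair(pair: Pair, old: str, new: str) -> Pair:
    """Apply the same renaming to both sides so the fix still lines up."""
    buggy, fixed = pair
    return rename_identifier(buggy, old, new), rename_identifier(fixed, old, new)


def token_count(pair: Pair) -> int:
    """ASSUMPTION: whitespace-token count as a crude difficulty proxy."""
    return len(pair[0].split()) + len(pair[1].split())


def curriculum_stages(
    pairs: List[Pair],
    n_stages: int,
    difficulty: Callable[[Pair], int] = token_count,
) -> List[List[Pair]]:
    """Order pairs easy-to-hard and return cumulative training stages.

    Stage k contains the easiest k/n_stages fraction of the data, so each
    later stage adds harder examples on top of the earlier ones.
    """
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    ordered = sorted(pairs, key=difficulty)
    stages = []
    for k in range(1, n_stages + 1):
        cutoff = math.ceil(len(ordered) * k / n_stages)
        stages.append(ordered[:cutoff])
    return stages
```

In such a setup, a fine-tuning loop would train on each stage in turn, with the augmented variants from augment_pair added to the pool before staging; the training loop itself is omitted here because the record gives no details about it.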