DiffCoder: Enhancing Large Language Model on API Invocation via Analogical Code Exercises



Detailed Bibliography
Published in: Proceedings of the ACM on Software Engineering, Volume 1, Issue FSE, pp. 406-426
Main Authors: Zan, Daoguang; Yu, Ailun; Shen, Bo; Chen, Bei; Li, Wei; Gong, Yongshun; Chen, Xiaolin; Yao, Yafen; Luo, Weihua; Guan, Bei; Liu, Yan; Wang, Yongji; Wang, Qianxiang; Cui, Lizhen
Format: Journal Article
Language: English
Published: New York, NY, USA: ACM, 12 July 2024
ISSN: 2994-970X
Description
Summary: The task of code generation aims to generate code solutions based on given programming problems. Recently, code large language models (code LLMs) have shed new light on this task, owing to their formidable code generation capabilities. While these models are powerful, they seldom focus on further improving the accuracy of library-oriented API invocation. Nonetheless, programmers frequently invoke APIs in routine coding tasks. In this paper, we aim to enhance the proficiency of existing code LLMs regarding API invocation by mimicking analogical learning, a critical strategy by which humans learn through the differences among multiple instances. Motivated by this, we propose a simple yet effective approach, namely DiffCoder, which excels in API invocation by effectively training on the differences (diffs) between analogical code exercises. To assess the API invocation capabilities of code LLMs, we conduct experiments on seven existing benchmarks that focus on mono-library API invocation. Additionally, we construct a new benchmark, namely PanNumEval, to evaluate the performance of multi-library API invocation. Extensive experiments on eight benchmarks demonstrate the impressive performance of DiffCoder. Furthermore, we develop a VSCode plugin for DiffCoder, and the results from twelve invited participants further verify the practicality of DiffCoder.
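The core idea the abstract describes, training on the differences between analogical code exercises, can be illustrated with a minimal sketch. The two "exercises" below and the use of Python's standard `difflib` are purely hypothetical; the paper's actual data pipeline and training objective are not described in this record.

```python
import difflib

# Two hypothetical analogical exercises: the same task solved with
# different API invocations. (Illustrative only; not the paper's data.)
exercise_a = """\
import pandas as pd
df = pd.read_csv("data.csv")
mean_age = df["age"].mean()
"""

exercise_b = """\
import pandas as pd
df = pd.read_csv("data.csv")
mean_age = df["age"].sum() / len(df)
"""

# A unified diff isolates exactly where the two API invocations differ,
# the kind of contrastive signal a diff-based training set could emphasize.
diff = list(difflib.unified_diff(
    exercise_a.splitlines(), exercise_b.splitlines(),
    fromfile="exercise_a", tofile="exercise_b", lineterm="",
))
print("\n".join(diff))
```

Running the sketch prints only the header lines plus the one removed and one added line, showing that the difference between the analogical pair reduces to a single API-invocation change.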
DOI: 10.1145/3643745