DiffCoder: Enhancing Large Language Model on API Invocation via Analogical Code Exercises

Bibliographic Details
Published in:	Proceedings of the ACM on software engineering Vol. 1; no. FSE; pp. 406 - 426
Main Authors:	Zan, Daoguang; Yu, Ailun; Shen, Bo; Chen, Bei; Li, Wei; Gong, Yongshun; Chen, Xiaolin; Yao, Yafen; Luo, Weihua; Guan, Bei; Liu, Yan; Wang, Yongji; Wang, Qianxiang; Cui, Lizhen
Format:	Journal Article
Language:	English
Published:	New York, NY, USA: ACM, 12.07.2024
Subjects:	Artificial intelligence; Automatic programming; Computing methodologies; Natural language generation; Software and its engineering; Software libraries and repositories; Instruction Tuning; Large Language Model; Code Library; Code Generation
ISSN:	2994-970X

Description
Summary:	The task of code generation aims to generate code solutions based on given programming problems. Recently, code large language models (code LLMs) have shed new light on this task, owing to their formidable code generation capabilities. While these models are powerful, they seldom focus on further improving the accuracy of library-oriented API invocation. Nonetheless, programmers frequently invoke APIs in routine coding tasks. In this paper, we aim to enhance the proficiency of existing code LLMs regarding API invocation by mimicking analogical learning, which is a critical learning strategy for humans to learn through differences among multiple instances. Motivated by this, we propose a simple yet effective approach, namely DiffCoder, which excels in API invocation by effectively training on the differences (diffs) between analogical code exercises. To assess the API invocation capabilities of code LLMs, we conduct experiments on seven existing benchmarks that focus on mono-library API invocation. Additionally, we construct a new benchmark, namely PanNumEval, to evaluate the performance of multi-library API invocation. Extensive experiments on eight benchmarks demonstrate the impressive performance of DiffCoder. Furthermore, we develop a VSCode plugin for DiffCoder, and the results from twelve invited participants further verify the practicality of DiffCoder.
DOI:	10.1145/3643745
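
Illustration:	The summary describes training on the differences (diffs) between analogical code exercises but gives no concrete instance. The sketch below shows one plausible reading of such a diff, using Python's standard difflib module. The two exercises, the function names exercise_diff and changed_lines, and the choice of a line-level diff are all assumptions made for illustration; they are not taken from the paper and do not represent DiffCoder's actual data format or training pipeline.

```python
import difflib

# Two hypothetical "analogical" exercises: the same task shape, differing
# only in which library API is invoked. Invented for illustration only.
EXERCISE_A = """import numpy as np

def column_stat(values):
    arr = np.array(values)
    return arr.mean(axis=0)
"""

EXERCISE_B = """import numpy as np

def column_stat(values):
    arr = np.array(values)
    return arr.max(axis=0)
"""


def exercise_diff(a: str, b: str) -> str:
    """Return a unified diff between two exercise solutions as one string."""
    lines = difflib.unified_diff(
        a.splitlines(),
        b.splitlines(),
        fromfile="exercise_a.py",
        tofile="exercise_b.py",
        lineterm="",
    )
    return "\n".join(lines)


def changed_lines(a: str, b: str) -> tuple[list[str], list[str]]:
    """Return (removed, added) source lines, i.e. where the exercises differ."""
    removed, added = [], []
    for line in difflib.ndiff(a.splitlines(), b.splitlines()):
        if line.startswith("- "):
            removed.append(line[2:])
        elif line.startswith("+ "):
            added.append(line[2:])
        # Lines starting with "  " (unchanged) or "? " (hints) are ignored.
    return removed, added


if __name__ == "__main__":
    print(exercise_diff(EXERCISE_A, EXERCISE_B))
    print(changed_lines(EXERCISE_A, EXERCISE_B))
```

In this toy pair the only changed line is the API call itself, which is the kind of contrast between similar instances that the summary attributes to analogical learning.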