Bibliographic Details
| Title: |
Wenwang: Toward Effectively Generating Code Beyond Standalone Functions via Generative Pre-trained Models. |
| Authors: |
Yu, Hao; Shen, Bo; Zhang, Jiaxin; Lin, Shaoxin; Li, Lin; Liang, Guangtai; Li, Ying; Wang, Qianxiang; Xie, Tao |
| Source: |
ACM Transactions on Software Engineering & Methodology; Sep 2025, Vol. 34 Issue 7, p1-27, 27p |
| Subject Terms: |
CODE generators, PROBABILISTIC generative models, MATHEMATICAL optimization |
| Abstract: |
Code generation models based on the pre-training and fine-tuning paradigm have been increasingly attempted by both academia and industry, resulting in well-known industrial models such as Codex, CodeGen, and PanGu-Coder. After being pre-trained on a large-scale corpus of code, a model is further fine-tuned with datasets specifically for the target downstream task, e.g., generating code from a natural language description. The target code being generated can be classified into two types: a standalone function, i.e., a function that invokes or accesses only built-in functions and standard libraries, and a non-standalone function, i.e., a function that invokes or accesses user-defined functions or third-party libraries. To effectively generate code, especially non-standalone functions (largely ignored by existing work), we present in this article Wenwang, an approach to improving the capability of a pre-trained model to generate code beyond standalone functions. Wenwang consists of two components: a fine-tuning dataset named WenwangData and a fine-tuned model named WenwangCoder. Compared with existing fine-tuning datasets, WenwangData additionally covers non-standalone functions. Besides the docstring and code snippet for a function, WenwangData also includes its contextual information collected via program analysis. Based on PanGu-Coder, we produce WenwangCoder by fine-tuning PanGu-Coder on WenwangData with our context-aware fine-tuning technique so that the contextual information can be fully leveraged during code generation. On CoderEval and HumanEval, WenwangCoder outperforms three state-of-the-art models with similar parameter sizes (at the scale of around 300M), namely CodeGen, PanGu-Coder, and PanGu-FT. Although WenwangCoder does not outperform ChatGPT on HumanEval, WenwangCoder, with a smaller parameter size, achieves results similar to ChatGPT's on CoderEval. Our experimental results also shed light on a number of promising optimization directions based on existing pre-trained models. [ABSTRACT FROM AUTHOR] |
| Database: |
Complementary Index |