Fixing Large Language Models' Specification Misunderstanding for Better Code Generation


Published in: Proceedings / International Conference on Software Engineering, pp. 1514–1526
Main authors: Tian, Zhao; Chen, Junjie; Zhang, Xiangyu
Format: Conference paper
Language: English
Publisher: IEEE, 26 April 2025
ISSN: 1558-1225
Abstract Code generation aims to automatically produce source code conforming to a given programming specification, and it has received extensive attention, especially with the development of large language models (LLMs). Due to the inherent difficulty of code generation, the code generated by LLMs may not be aligned with the specification. Although thought-eliciting prompting techniques have been proposed to enhance the code generation performance of LLMs, producing a correct understanding of complicated programming problems remains challenging, resulting in unsatisfactory performance. Also, some feedback-based prompting techniques have been proposed to fix incorrect code using error messages produced by test execution. However, when the generated code deviates significantly from the ground truth, they encounter difficulties in improving performance based on such coarse-grained information. In this work, we propose a novel prompting technique, called μFiX, to improve the code generation performance of LLMs by devising both sophisticated thought-eliciting prompting and feedback-based prompting, and by making the first exploration of their synergy. In the thought-eliciting prompting phase, μFiX first exploits test case analysis to obtain a specification understanding and enables a self-improvement process to identify and refine misunderstandings. In the feedback-based prompting phase, μFiX further fixes the specification understanding in the direction that reduces the gap between the provided understanding (from the first phase) and the actual understanding implicitly utilized by LLMs for code generation. By improving the understanding with μFiX, the code generation performance of LLMs can be largely improved. Our evaluation on two advanced LLMs (ChatGPT and DeepSeek-Coder) with six widely-used benchmarks, comparing against 15 baselines, demonstrates the effectiveness of μFiX.
For example, μFiX outperforms the most effective baseline with an average improvement of 35.62% in terms of Pass@1 across all subjects.
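The abstract describes a two-phase workflow: a thought-eliciting phase that builds and self-refines a specification understanding from test case analysis, and a feedback-based phase that, when tests fail, shrinks the gap between the provided understanding and the one the model implicitly used. A minimal sketch of that control flow, assuming a generic `llm(prompt)` callable and a toy `run_tests` harness — the prompt strings, helper names, and loop bounds are all illustrative assumptions, not the authors' implementation:

```python
def run_tests(code: str, tests) -> bool:
    """Toy harness: exec candidate code defining f(x), check input/output pairs."""
    env = {}
    try:
        exec(code, env)
        return all(env["f"](x) == want for x, want in tests)
    except Exception:
        return False

def fix_loop(spec: str, tests, llm, max_refine: int = 2, max_feedback: int = 2) -> str:
    # Phase 1 (thought-eliciting): derive a specification understanding via
    # test case analysis, then self-check it and refine any misunderstanding.
    understanding = llm(f"Explain the spec, using the test cases:\n{spec}\n{tests}")
    for _ in range(max_refine):
        issue = llm(f"Spot a misunderstanding in:\n{understanding}")
        if issue == "OK":  # self-check found nothing to fix
            break
        understanding = llm(f"Refine the understanding to fix:\n{issue}")

    # Phase 2 (feedback-based): generate code; on test failure, recover the
    # understanding the code actually implements and reconcile it with the
    # provided one, instead of reacting only to coarse error messages.
    code = llm(f"Write code from this understanding:\n{understanding}")
    for _ in range(max_feedback):
        if run_tests(code, tests):
            return code
        implied = llm(f"Describe the spec this code actually implements:\n{code}")
        understanding = llm(
            f"Reconcile provided vs implied understanding:\n{understanding}\n{implied}"
        )
        code = llm(f"Write code from this understanding:\n{understanding}")
    return code
```

With a stubbed `llm` that first emits a wrong candidate and then a corrected one, the phase-2 loop regenerates code until the toy tests pass or the round budget is exhausted.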
Authors:
1. Tian, Zhao (tianzhao@tju.edu.cn), College of Intelligence and Computing, Tianjin University, China
2. Chen, Junjie (junjiechen@tju.edu.cn), College of Intelligence and Computing, Tianjin University, China
3. Zhang, Xiangyu (xyzhang@cs.purdue.edu), Department of Computer Science, Purdue University, USA
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ICSE55347.2025.00108
Discipline Computer Science
EISBN 9798331505691
EISSN 1558-1225
EndPage 1526
ExternalDocumentID 11029745
Genre orig-research
GrantInformation National Key Research and Development Program of China (grant 2024YFB4506300, funder 10.13039/501100012166)
National Natural Science Foundation of China (grants 62322208 and 12411530122, funder 10.13039/501100001809)
IsPeerReviewed false
IsScholarly true
Language English
PageCount 13
PublicationDate 2025-April-26
PublicationTitle Proceedings / International Conference on Software Engineering
PublicationTitleAbbrev ICSE
PublicationYear 2025
Publisher IEEE
StartPage 1514
SubjectTerms Accuracy
Benchmark testing
Chatbots
Code Generation
Codes
Large language models
Programming
Prompt Engineering
Software engineering
Source coding
Title Fixing Large Language Models' Specification Misunderstanding for Better Code Generation
URI https://ieeexplore.ieee.org/document/11029745