Fixing Large Language Models' Specification Misunderstanding for Better Code Generation


Published in: Proceedings / International Conference on Software Engineering, pp. 1514–1526
Main authors: Tian, Zhao; Chen, Junjie; Zhang, Xiangyu
Format: Conference paper
Language: English
Publisher: IEEE, 26 April 2025
ISSN: 1558-1225
Abstract Code generation aims to automatically produce source code conforming to a given programming specification, and it has received extensive attention, especially with the development of large language models (LLMs). Due to the inherent difficulty of code generation, the code generated by LLMs may not be aligned with the specification. Although thought-eliciting prompting techniques have been proposed to enhance the code generation performance of LLMs, producing a correct understanding of complicated programming problems remains challenging, resulting in unsatisfactory performance. Also, some feedback-based prompting techniques have been proposed to fix incorrect code using error messages produced by test execution. However, when the generated code deviates significantly from the ground truth, they encounter difficulties in improving performance based on such coarse-grained information. In this work, we propose a novel prompting technique, called μFiX, to improve the code generation performance of LLMs by devising both sophisticated thought-eliciting prompting and feedback-based prompting, and by making the first exploration of their synergy. In the thought-eliciting prompting phase, μFiX first exploits test case analysis to obtain a specification understanding and enables a self-improvement process to identify and refine misunderstandings. In the feedback-based prompting phase, μFiX further fixes the specification understanding in the direction that reduces the gap between the provided understanding (from the first phase) and the actual understanding implicitly utilized by LLMs for code generation. By improving the understanding with μFiX, the code generation performance of LLMs can be largely improved. Our evaluation on two advanced LLMs (ChatGPT and DeepSeek-Coder) with six widely-used benchmarks, comparing against 15 baselines, demonstrates the effectiveness of μFiX.
For example, μFiX outperforms the most effective baseline with an average improvement of 35.62% in terms of Pass@1 across all subjects.
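The abstract describes a two-phase workflow: a thought-eliciting phase that builds and self-refines a specification understanding from test case analysis, and a feedback-based phase that, when tests fail, shrinks the gap between the provided understanding and the one the model implicitly used. A minimal sketch of that control flow, assuming a generic `llm(prompt)` callable and a toy `run_tests` harness — the prompt strings, helper names, and loop bounds are all illustrative assumptions, not the authors' implementation:

```python
def run_tests(code: str, tests) -> bool:
    """Toy harness: exec candidate code defining f(x), check input/output pairs."""
    env = {}
    try:
        exec(code, env)
        return all(env["f"](x) == want for x, want in tests)
    except Exception:
        return False

def fix_loop(spec: str, tests, llm, max_refine: int = 2, max_feedback: int = 2) -> str:
    # Phase 1 (thought-eliciting): derive a specification understanding via
    # test case analysis, then self-check it and refine any misunderstanding.
    understanding = llm(f"Explain the spec, using the test cases:\n{spec}\n{tests}")
    for _ in range(max_refine):
        issue = llm(f"Spot a misunderstanding in:\n{understanding}")
        if issue == "OK":  # self-check found nothing to fix
            break
        understanding = llm(f"Refine the understanding to fix:\n{issue}")

    # Phase 2 (feedback-based): generate code; on test failure, recover the
    # understanding the code actually implements and reconcile it with the
    # provided one, instead of reacting only to coarse error messages.
    code = llm(f"Write code from this understanding:\n{understanding}")
    for _ in range(max_feedback):
        if run_tests(code, tests):
            return code
        implied = llm(f"Describe the spec this code actually implements:\n{code}")
        understanding = llm(
            f"Reconcile provided vs implied understanding:\n{understanding}\n{implied}"
        )
        code = llm(f"Write code from this understanding:\n{understanding}")
    return code
```

With a stubbed `llm` that first emits a wrong candidate and then a corrected one, the phase-2 loop regenerates code until the toy tests pass or the round budget is exhausted.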
Authors:
1. Tian, Zhao (tianzhao@tju.edu.cn), College of Intelligence and Computing, Tianjin University, China
2. Chen, Junjie (junjiechen@tju.edu.cn), College of Intelligence and Computing, Tianjin University, China
3. Zhang, Xiangyu (xyzhang@cs.purdue.edu), Department of Computer Science, Purdue University, USA
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ICSE55347.2025.00108
Discipline Computer Science
EISBN 9798331505691
EISSN 1558-1225
EndPage 1526
ExternalDocumentID 11029745
Genre orig-research
GrantInformation National Key Research and Development Program of China (grant 2024YFB4506300, funder 10.13039/501100012166)
National Natural Science Foundation of China (grants 62322208 and 12411530122, funder 10.13039/501100001809)
IsPeerReviewed false
IsScholarly true
Language English
PageCount 13
PublicationDate 2025-April-26
PublicationTitle Proceedings / International Conference on Software Engineering
PublicationTitleAbbrev ICSE
PublicationYear 2025
Publisher IEEE
StartPage 1514
SubjectTerms Accuracy
Benchmark testing
Chatbots
Code Generation
Codes
Large language models
Programming
Prompt Engineering
Software engineering
Source coding
Title Fixing Large Language Models' Specification Misunderstanding for Better Code Generation
URI https://ieeexplore.ieee.org/document/11029745