Model Editing for LLMs4Code: How Far are we?
| Published in: | Proceedings / International Conference on Software Engineering, pp. 937 - 949 |
|---|---|
| Main authors: | Li, Xiaopeng; Wang, Shangwen; Li, Shasha; Ma, Jun; Yu, Jie; Liu, Xiaodong; Wang, Jing; Ji, Bin; Zhang, Weimin |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 26.04.2025 |
| Subjects: | Adaptation models; Benchmark testing; Code Generation; Code Summarization; Codes; Encoding; Large language models; LLMs4Code; Maintenance engineering; Model Editing; Semantics; Software engineering; Systematics; Training |
| ISSN: | 1558-1225 |
| Online access: | Get full text |
| Abstract | Large Language Models for Code (LLMs4Code) have been found to exhibit outstanding performance in the software engineering domain, especially the remarkable performance in coding tasks. However, even the most advanced LLMs4Code can inevitably contain incorrect or outdated code knowledge. Due to the high cost of training LLMs4Code, it is impractical to re-train the models for fixing these problematic code knowledge. Model editing is a new technical field for effectively and efficiently correcting erroneous knowledge in LLMs, where various model editing techniques and benchmarks have been proposed recently. Despite that, a comprehensive study that thoroughly compares and analyzes the performance of the state-of-the-art model editing techniques for adapting the knowledge within LLMs4Code across various code-related tasks is notably absent. To bridge this gap, we perform the first systematic study on applying state-of-the-art model editing approaches to repair the inaccuracy of LLMs4Code. To that end, we introduce a benchmark named CLMEEval, which consists of two datasets, i.e., CoNaLa-Edit (CNLE) with 21K+ code generation samples and CodeSearchNet-Edit (CSNE) with 16K+ code summarization samples. With the help of CLMEEval, we evaluate six advanced model editing techniques on three LLMs4Code: CodeLlama (7B), CodeQwen1.5 (7B), and Stable-Code (3B). Our findings include that the external memorization-based GRACE approach achieves the best knowledge editing effectiveness and specificity (the editing does not influence untargeted knowledge), while generalization (whether the editing can generalize to other semantically-identical inputs) is a universal challenge for existing techniques. Furthermore, building on in-depth case analysis, we introduce an enhanced version of GRACE called A-GRACE, which incorporates contrastive learning to better capture the semantics of the inputs. Results demonstrate that A-GRACE notably enhances generalization while maintaining similar levels of effectiveness and specificity compared to the vanilla GRACE. |
|---|---|
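The abstract describes GRACE as an external memorization-based editor (edits live in a memory outside the model weights, which is why untargeted knowledge stays intact) and A-GRACE as adding contrastive learning so that semantically identical rephrasings of an edited prompt retrieve the same edit. The paper's own code is not part of this record; the block below is only a minimal, generic PyTorch sketch of those two ingredients, and every name in it (CodebookAdapter, info_nce, radius) is an illustrative assumption rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F


class CodebookAdapter(torch.nn.Module):
    """Illustrative GRACE-style external memory: (key, value) pairs stored
    outside the model weights; a key is the hidden state of an edited prompt,
    the value is the replacement hidden state learned for that edit."""

    def __init__(self, radius: float = 1.0):
        super().__init__()
        self.radius = radius   # deferral radius around each stored key
        self.keys, self.values = [], []

    def add_edit(self, key: torch.Tensor, value: torch.Tensor) -> None:
        # Register one edit without touching any model parameters.
        self.keys.append(key.detach())
        self.values.append(value.detach())

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, dim). If an input lands inside the radius of a
        # stored key, return that edit's value; otherwise pass the hidden
        # state through unchanged, which is what preserves specificity.
        if not self.keys:
            return hidden
        keys = torch.stack(self.keys)                  # (n_edits, dim)
        dists = torch.cdist(hidden, keys)              # (batch, n_edits)
        nearest = dists.argmin(dim=-1)
        out = hidden.clone()
        for i, j in enumerate(nearest.tolist()):
            if dists[i, j] <= self.radius:
                out[i] = self.values[j]
        return out


def info_nce(anchor: torch.Tensor, positive: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    """In-batch InfoNCE loss: row i of `anchor` (an edited prompt) and row i
    of `positive` (a paraphrase of it) form a positive pair; other rows act
    as negatives. Training an input encoder this way pulls paraphrases toward
    the same codebook key, which is the generalization-oriented idea the
    abstract attributes to A-GRACE."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature       # (batch, batch)
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)
```

Under this reading, the deferral radius trades specificity against generalization, and the contrastively trained encoder widens the set of paraphrases that fall inside the radius without disturbing unrelated inputs; that matches the paper's reported finding that vanilla GRACE is strong on effectiveness and specificity but weak on generalization.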
| Authors (with affiliations) | Xiaopeng Li (xiaopengli@nudt.edu.cn), Shangwen Wang (wangshangwen13@nudt.edu.cn), Shasha Li (shashali@nudt.edu.cn), Jun Ma (majun@nudt.edu.cn), Jie Yu (yj@nudt.edu.cn), Xiaodong Liu (liuxiaodong@nudt.edu.cn), Jing Wang (wangjing@nudt.edu.cn), Bin Ji (jibin@nudt.edu.cn), Weimin Zhang (wmzhang104@139.com); all affiliated with the College of Computer Science, National University of Defense Technology, Changsha, China |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ICSE55347.2025.00049 |
| Discipline | Computer Science |
| EISBN | 9798331505691 |
| EISSN | 1558-1225 |
| EndPage | 949 |
| Genre | orig-research |
| Funding | Hunan Provincial Natural Science Foundation Projects (grants 2022JJ30668, 2022JJ30046) |
| PageCount | 13 |
| PublicationDate | 2025-April-26 |
| PublicationTitleAbbrev | ICSE |
| StartPage | 937 |
| URI | https://ieeexplore.ieee.org/document/11029902 |