Generating Variable Explanations via Zero-shot Prompt Learning
As basic elements of a program, variables convey essential information that is critical for program comprehension and maintenance. However, understanding the meaning of variables in a program is not always easy for developers, since poor-quality variable names are prevalent and such names are less...
| Published in: | IEEE/ACM International Conference on Automated Software Engineering : [proceedings] pp. 748-760 |
|---|---|
| Main authors: | Wang, Chong; Lou, Yiling; Liu, Junwei; Peng, Xin |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 11 September 2023 |
| Subject: | |
| ISSN: | 2643-1572 |
| Online access: | Get full text |
| Abstract | As basic elements of a program, variables convey essential information that is critical for program comprehension and maintenance. However, understanding the meaning of variables in a program is not always easy for developers, since poor-quality variable names are prevalent and such names are less informative for program comprehension. Therefore, in this paper, we aim to generate concise natural language explanations for variables to facilitate program comprehension. In particular, variable explanation generation poses two challenges: the lack of training data and the association with complex code contexts around the variable. To address these issues, we propose a novel approach, ZeroVar, which leverages code pre-trained models and zero-shot prompt learning to generate explanations for a variable based on its code context. ZeroVar contains two stages: (i) a pre-training stage that continually pre-trains a base model (i.e., CodeT5) to recover the randomly masked parameter descriptions in method docstrings; and (ii) a zero-shot prompt learning stage that leverages the pre-trained model to generate explanations for a given variable via a prompt constructed from the variable and its enclosing method context. We then extensively evaluate the quality and usefulness of the variable explanations generated by ZeroVar. We construct an evaluation dataset of 773 variables and their reference explanations. Our results show that ZeroVar can generate higher-quality explanations than baselines, not only on automated metrics such as BLEU and ROUGE, but also on human metrics such as correctness, completeness, and conciseness. Moreover, we further assess the usefulness of ZeroVar-generated explanations on two downstream tasks related to variable naming quality, i.e., abbreviation expansion and spelling correction. For abbreviation expansion, the generated variable explanations can help improve the present rate (+13.1%), precision (+3.6%), and recall (+10.0%) of the state-of-the-art abbreviation expansion approach. For spelling correction, by using the generated explanations we can achieve higher hit@1 (+162.9%) and hit@3 (+49.6%) than the recent variable representation learning approach. |
|---|---|
| Author | Lou, Yiling; Liu, Junwei; Peng, Xin; Wang, Chong |
| Author_xml | 1. Wang, Chong (wangchong20@fudan.edu.cn), School of Computer Science, Fudan University, Shanghai Key Laboratory of Data Science, China; 2. Lou, Yiling (yilinglou@fudan.edu.cn), School of Computer Science, Fudan University, Shanghai Key Laboratory of Data Science, China; 3. Liu, Junwei (22210240218@m.fudan.edu.cn), School of Computer Science, Fudan University, Shanghai Key Laboratory of Data Science, China; 4. Peng, Xin (pengxin@fudan.edu.cn), School of Computer Science, Fudan University, Shanghai Key Laboratory of Data Science, China |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/ASE56229.2023.00130 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9798350329964 |
| EISSN | 2643-1572 |
| EndPage | 760 |
| ExternalDocumentID | 10298410 |
| Genre | orig-research |
| GrantInformation_xml | Funder: National Natural Science Foundation of China; grant ID: 61972098; funder ID: 10.13039/501100001809 |
| ISICitedReferencesCount | 6 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-Sept.-11 |
| PublicationDateYYYYMMDD | 2023-09-11 |
| PublicationDate_xml | – month: 09 year: 2023 text: 2023-Sept.-11 day: 11 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE/ACM International Conference on Automated Software Engineering : [proceedings] |
| PublicationTitleAbbrev | ASE |
| PublicationYear | 2023 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 748 |
| SubjectTerms | code pretrained models; Codes; Maintenance engineering; Measurement; naming quality; Natural languages; prompt learning; Representation learning; Task analysis; Training data; variable explanation |
| Title | Generating Variable Explanations via Zero-shot Prompt Learning |
| URI | https://ieeexplore.ieee.org/document/10298410 |
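
The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Javadoc-style `@param` layout, the `<extra_id_0>` sentinel token, and the helper names `mask_param_description` and `build_prompt` are all assumptions made for illustration. The first helper builds a pre-training sample by masking one randomly chosen parameter description in a method docstring; the second builds a zero-shot prompt that presents an arbitrary variable in the same masked form, so a model trained on the first task could fill in an explanation.

```python
import random
import re

# T5-style sentinel token. Assumed for illustration; the record does not
# state which mask token ZeroVar uses.
MASK = "<extra_id_0>"

# Matches one Javadoc-style "@param name description" line (assumed format).
PARAM_LINE = re.compile(
    r"^([ \t]*\*?[ \t]*@param[ \t]+(\w+)[ \t]+)(.+)$", re.MULTILINE
)


def mask_param_description(docstring, rng):
    """Build one pre-training sample by masking a random parameter description.

    Returns (masked_docstring, parameter_name, target_description),
    or None when the docstring documents no parameters.
    """
    matches = list(PARAM_LINE.finditer(docstring))
    if not matches:
        return None
    chosen = rng.choice(matches)
    start, end = chosen.span(3)  # span of the description text only
    masked = docstring[:start] + MASK + docstring[end:]
    return masked, chosen.group(2), chosen.group(3)


def build_prompt(variable, method_source):
    """Build a zero-shot prompt that mirrors the pre-training input.

    The variable is presented as if it were a documented parameter whose
    description is masked, followed by the method that contains it.
    """
    pseudo_doc = f"/**\n * @param {variable} {MASK}\n */"
    return f"{pseudo_doc}\n{method_source}"


# Hypothetical example inputs.
doc = (
    "/**\n"
    " * Copies bytes into a buffer.\n"
    " * @param buf the destination buffer\n"
    " * @param len number of bytes to copy\n"
    " */"
)
sample = mask_param_description(doc, random.Random(0))
prompt = build_prompt(
    "cnt", "int read(byte[] buf) { int cnt = in.read(buf); return cnt; }"
)
```

In this sketch the masked docstring and its target description would form one training pair, and the prompt would be passed to the continually pre-trained model at inference time; the model call itself is omitted because the record gives no details about it.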