CodeGen4Libs: A Two-Stage Approach for Library-Oriented Code Generation

Automated code generation has been extensively studied in recent literature. In this work, we first survey 66 participants to motivate a more pragmatic code generation scenario, i.e., library-oriented code generation, where the generated code should implement the functionality of the natural language...

Bibliographic Details
Published in: IEEE/ACM International Conference on Automated Software Engineering : [proceedings] pp. 434 - 445
Main Authors: Liu, Mingwei, Yang, Tianyong, Lou, Yiling, Du, Xueying, Wang, Ying, Peng, Xin
Format: Conference Proceeding
Language: English
Published: IEEE 11.09.2023
ISSN: 2643-1572
Abstract Automated code generation has been extensively studied in recent literature. In this work, we first survey 66 participants to motivate a more pragmatic code generation scenario, i.e., library-oriented code generation, where the generated code should implement the functionality of the natural language query with the given library. We then revisit existing learning-based code generation techniques and find they have limited effectiveness in such a library-oriented code generation scenario. To address this limitation, we propose a novel library-oriented code generation technique, CodeGen4Libs, which incorporates two stages: import generation and code generation. The import generation stage generates import statements for the natural language query with the given third-party libraries, while the code generation stage generates concrete code based on the generated imports and the query. To evaluate the effectiveness of our approach, we conduct extensive experiments on a dataset of 403,780 data items. Our results demonstrate that CodeGen4Libs outperforms baseline models in both import generation and code generation stages, achieving improvements of up to 97.4% on EM (Exact Match), 54.5% on BLEU, and 53.5% on Hit@All. Overall, our proposed CodeGen4Libs approach shows promising results in generating high-quality code with specific third-party libraries, which can improve the efficiency and effectiveness of software development.
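The two-stage flow described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function names, the callable signatures, and the stub stages below are all hypothetical, and the whitespace-insensitive Exact Match check is an assumed simplification of the EM metric rather than the paper's exact definition.

```python
from typing import Callable, List

# Hypothetical stage signatures (not taken from the paper):
#   stage 1 maps (query, libraries) -> list of import statements
#   stage 2 maps (query, imports)   -> code body
ImportGenerator = Callable[[str, List[str]], List[str]]
CodeGenerator = Callable[[str, List[str]], str]


def two_stage_generate(
    query: str,
    libraries: List[str],
    gen_imports: ImportGenerator,
    gen_code: CodeGenerator,
) -> str:
    """Run import generation first, then condition code generation on its output."""
    imports = gen_imports(query, libraries)  # stage 1: import generation
    body = gen_code(query, imports)          # stage 2: code generation
    return "\n".join(imports + ["", body])


def exact_match(prediction: str, reference: str) -> bool:
    """Assumed EM variant: compare token sequences after whitespace splitting."""
    return prediction.split() == reference.split()


# Stub stages standing in for trained models (illustration only).
def stub_imports(query: str, libraries: List[str]) -> List[str]:
    return [f"import {lib}.*;" for lib in libraries]


def stub_code(query: str, imports: List[str]) -> str:
    return f"// TODO: {query} (using {len(imports)} import(s))"


if __name__ == "__main__":
    print(two_stage_generate("read a file into a string",
                             ["org.apache.commons.io"],
                             stub_imports, stub_code))
```

In a real system each stub would be replaced by a learned model; the point of the sketch is only that the second stage receives the first stage's imports as extra context alongside the query.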
Author Du, Xueying
Wang, Ying
Yang, Tianyong
Lou, Yiling
Liu, Mingwei
Peng, Xin
Author_xml – sequence: 1
  givenname: Mingwei
  surname: Liu
  fullname: Liu, Mingwei
  email: liumingwei@fudan.edu.cn
  organization: Fudan University,Shanghai,China
– sequence: 2
  givenname: Tianyong
  surname: Yang
  fullname: Yang, Tianyong
  email: 21212010044@m.fudan.edu.cn
  organization: Fudan University,Shanghai,China
– sequence: 3
  givenname: Yiling
  surname: Lou
  fullname: Lou, Yiling
  email: yilinglou@fudan.edu.cn
  organization: Fudan University,Shanghai,China
– sequence: 4
  givenname: Xueying
  surname: Du
  fullname: Du, Xueying
  email: 21210240012@m.fudan.edu.cn
  organization: Fudan University,Shanghai,China
– sequence: 5
  givenname: Ying
  surname: Wang
  fullname: Wang, Ying
  email: 22210240051@m.fudan.edu.cn
  organization: Fudan University,Shanghai,China
– sequence: 6
  givenname: Xin
  surname: Peng
  fullname: Peng, Xin
  email: pengxin@fudan.edu.cn
  organization: Fudan University,Shanghai,China
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ASE56229.2023.00159
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Xplore
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library (IEL) (UW System Shared)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9798350329964
EISSN 2643-1572
EndPage 445
ExternalDocumentID 10298327
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61972098
  funderid: 10.13039/501100001809
IEDL.DBID RIE
ISICitedReferencesCount 18
IngestDate Wed Aug 27 02:32:28 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 12
ParticipantIDs ieee_primary_10298327
PublicationCentury 2000
PublicationDate 2023-Sept.-11
PublicationDateYYYYMMDD 2023-09-11
PublicationDate_xml – month: 09
  year: 2023
  text: 2023-Sept.-11
  day: 11
PublicationDecade 2020
PublicationTitle IEEE/ACM International Conference on Automated Software Engineering : [proceedings]
PublicationTitleAbbrev ASE
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 434
SubjectTerms Code Generation
Codes
Computer languages
Language Model
Libraries
Natural languages
Software
Surveys
Task analysis
Third-party Library
Title CodeGen4Libs: A Two-Stage Approach for Library-Oriented Code Generation
URI https://ieeexplore.ieee.org/document/10298327