A syntax-guided multi-task learning approach for Turducken-style code generation

Published in: Empirical Software Engineering: An International Journal, Vol. 28, Issue 6, Article 141
Main authors: Yang, Guang; Zhou, Yu; Chen, Xiang; Zhang, Xiangyu; Xu, Yiran; Han, Tingting; Chen, Taolue
Format: Journal Article
Language: English
Published: New York: Springer US, 01.11.2023; Springer Nature B.V.
ISSN: 1382-3256, 1573-7616
Online access: Full text
Abstract Due to the development of pre-trained language models, automated code generation techniques have shown great promise in recent years. However, the generated code does not always adhere to the syntactic constraints of the target language, especially in the case of Turducken-style code, where declarative code snippets are embedded within imperative programs. In this study, we summarize three significant challenges with regard to syntactic constraints: (1) the efficient representation of syntactic constraints, (2) the effective integration of syntactic information, and (3) a scalable syntax-first decoding algorithm. To address these challenges, we propose TurduckenGen, a syntax-guided multi-task learning approach. Specifically, we first explicitly append type information to the code tokens to capture the representation of syntactic constraints. Then we formalize code generation with syntactic constraint representation as an auxiliary task, enabling the model to learn the syntactic constraints of the code. Finally, the syntactically correct code is accurately selected from multiple candidates with the help of compiler feedback. Extensive experiments and comprehensive analysis against six state-of-the-art baselines on two Turducken-style code datasets demonstrate the effectiveness and general applicability of our approach. We also conducted a human study and found that the code generated by our approach surpasses the baselines in terms of code readability and semantic similarity.
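The compiler-feedback step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the candidate strings are hypothetical beam-search outputs, and Python's standard-library `ast.parse` stands in for the target-language compiler that provides the feedback.

```python
import ast


def select_syntactically_valid(candidates):
    """Return the first candidate that parses successfully, mimicking
    the compiler-feedback filtering described in the abstract.
    Falls back to the top-ranked candidate if none parses."""
    for code in candidates:
        try:
            ast.parse(code)  # stand-in for target-language compiler feedback
            return code
        except SyntaxError:
            continue
    return candidates[0]


# Hypothetical model outputs, ordered best-first by model score:
candidates = [
    "for i in range(10) print(i)",        # missing colon: rejected
    "for i in range(10):\n    print(i)",  # parses: selected
]
print(select_syntactically_valid(candidates))
```

The design point this illustrates is that candidate selection is decoupled from decoding: the model ranks candidates, and the compiler acts only as a post-hoc syntactic filter.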
ArticleNumber 141
Author Yang, Guang
Zhou, Yu
Han, Tingting
Chen, Taolue
Xu, Yiran
Zhang, Xiangyu
Chen, Xiang
Author_xml – sequence: 1
  givenname: Guang
  surname: Yang
  fullname: Yang, Guang
  organization: The College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
– sequence: 2
  givenname: Yu
  orcidid: 0000-0002-3723-7584
  surname: Zhou
  fullname: Zhou, Yu
  email: zhouyu@nuaa.edu.cn
  organization: The College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
– sequence: 3
  givenname: Xiang
  surname: Chen
  fullname: Chen, Xiang
  organization: The School of Information Science and Technology, Nantong University
– sequence: 4
  givenname: Xiangyu
  surname: Zhang
  fullname: Zhang, Xiangyu
  organization: The College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
– sequence: 5
  givenname: Yiran
  surname: Xu
  fullname: Xu, Yiran
  organization: The College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
– sequence: 6
  givenname: Tingting
  surname: Han
  fullname: Han, Tingting
  organization: Department of Computer Science, Birkbeck, University of London
– sequence: 7
  givenname: Taolue
  surname: Chen
  fullname: Chen, Taolue
  organization: Department of Computer Science, Birkbeck, University of London
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
DOI 10.1007/s10664-023-10372-1
DatabaseName CrossRef
Computer and Information Systems Abstracts
Technology Research Database
ProQuest SciTech Collection
ProQuest Technology Collection
Materials Science & Engineering Collection
ProQuest Central UK/Ireland
Advanced Technologies & Computer Science Collection
ProQuest Central
Technology collection
ProQuest One
ProQuest Central
SciTech Premium Collection
ProQuest Computer Science Collection
ProQuest Engineering Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Engineering Database
Advanced Technologies & Aerospace Database
ProQuest Advanced Technologies & Aerospace Collection
ProQuest Central Premium
ProQuest One Academic (New)
ProQuest One Academic Middle East (New)
ProQuest One Academic Eastern Edition (DO NOT USE)
ProQuest One Applied & Life Sciences
ProQuest One Academic (retired)
ProQuest One Academic UKI Edition
ProQuest Central China
Engineering collection
DELNET Engineering & Technology Collection
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1573-7616
ExternalDocumentID 10_1007_s10664_023_10372_1
GrantInformation_xml – fundername: Collaborative Innovation Center of Novel Software Technology and Industrialization, and the Open Project of Key Laboratory of Safety-Critical Software for Nanjing University of Aeronautics and Astronautics, Ministry of Industry and Information Technology
  grantid: No. NJ2020022
– fundername: State Key Laboratory of Novel Software Technology
  grantid: KFKT2022A03; No. 62272397
  funderid: http://dx.doi.org/10.13039/501100011246
– fundername: Postgraduate Research & Practice Innovation Program of Jiangsu Province
– fundername: Natural Science Foundation of Jiangsu Province
  grantid: No. BK20201292
  funderid: http://dx.doi.org/10.13039/501100004608
– fundername: National Natural Science Foundation of China
  grantid: No. 61972197
  funderid: http://dx.doi.org/10.13039/501100001809
ISICitedReferencesCount 4
ISSN 1382-3256
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords Turducken-style code
CodeT5
Multi-task learning
Abstract syntax tree
Syntactically-constrained code generation
Language English
ORCID 0000-0002-3723-7584
PublicationCentury 2000
PublicationDate 2023-11-01
PublicationDecade 2020
PublicationPlace New York
PublicationSubtitle An International Journal
PublicationTitle Empirical software engineering : an international journal
PublicationTitleAbbrev Empir Software Eng
PublicationYear 2023
Publisher Springer US
Springer Nature B.V
– reference: Gu Y, Han X, Liu Z, Huang M (2022) Ppt: Pre-trained prompt tuning for few-shot learning. In: Proceedings of the 60th annual meeting of the association for computational linguistics (Volume 1: Long Papers). pp 8410–8423
– reference: XuFFVasilescuBNeubigGIn-ide code generation from natural language: Promise and challengesACM Trans Softw Eng Methodol (TOSEM)202231214710.1145/3487569
– reference: Klein G, Kim Y, Deng Y, Senellart J, Rush AM (2017) Opennmt: Open-source toolkit for neural machine translation. In: Proceedings of ACL 2017, System Demonstrations. pp 67–72
– reference: Husain H, Wu HH, Gazit T, Allamanis M, Brockschmidt M (2019) Codesearchnet challenge: Evaluating the state of semantic code search. arXiv:1909.09436
– reference: Iyer S, Konstas I, Cheung A, Krishnamurthy J, Zettlemoyer L (2017) Learning a neural semantic parser from user feedback. In: Proceedings of the 55th annual meeting of the association for computational linguistics (Volume 1: Long Papers). pp 963–973
– reference: Liu F, Li G, Zhao Y, Jin Z (2020a) Multi-task learning based pre-trained language model for code completion. In: Proceedings of the 35th IEEE/ACM international conference on automated software engineering. pp 473–485
– reference: Sánchez-Cartagena VM, Esplà-Gomis M, Pérez-Ortiz JA, Sánchez-Martínez F (2021) Rethinking data augmentation for low-resource neural machine translation: A multi-task learning approach. In: Proceedings of the 2021 conference on empirical methods in natural language processing. pp 8502–8516
– reference: WangXWangYWanYMiFLiYZhouPLiuJWuHJiangXLiuQCompilable neural code generation with compiler feedbackFindings of the Association for Computational Linguistics: ACL20222022919
– reference: Yin P, Neubig G (2018) Tranx: A transition-based neural abstract syntax parser for semantic parsing and code generation. In: Proceedings of the 2018 conference on empirical methods in natural language processing: system demonstrations. pp 7–12
– reference: Hu X, Gao Z, Xia X, Lo D, Yang X (2021) Automating user notice generation for smart contract functions. In: 2021 36th IEEE/ACM international conference on automated software engineering (ASE). pp 5–17. https://doi.org/10.1109/ASE51524.2021.9678552
– reference: Liu Q, Chen Y, Chen B, Lou JG, Chen Z, Zhou B, Zhang D (2020b) You impress me: Dialogue generation via mutual persona perception. In: Proceedings of the 58th annual meeting of the association for computational linguistics. pp 1417–1427
– reference: Fernandes S, Bernardino J (2015) What is bigquery? In: Proceedings of the 19th International Database Engineering & Applications Symposium. pp 202–203
– reference: Longpre S, Hou L, Vu T, Webson A, Chung HW, Tay Y, Zhou D, Le QV, Zoph B, Wei J, et al (2023) The flan collection: Designing data and methods for effective instruction tuning. arXiv:2301.13688
– reference: YangGZhouYChenXZhangXHanTChenTExploitgen: Template-augmented exploit code generation based on codebertJ Syst Softw202319710.1016/j.jss.2022.111577
– reference: RaffelCShazeerNRobertsALeeKNarangSMatenaMZhouYLiWLiuPJExploring the limits of transfer learning with a unified text-to-text transformerJ Mach Learn Res2020211548555514138124
– reference: Allamanis M, Sutton C (2013) Why, when, and what: analyzing stack overflow questions by topic, type, and code. In: 2013 10th Working conference on mining software repositories (MSR). IEEE, pp 53–56
– reference: Zelle JM, Mooney RJ (1996) Learning to parse database queries using inductive logic programming. In: Proceedings of the national conference on artificial intelligence. pp 1050–1055
– reference: Liu Y, Tantithamthavorn C, Liu Y, Li L (2023c) On the reliability and explainability of automated code generation approaches. arXiv:2302.09587
– reference: Yang G, Chen X, Zhou Y, Yu C (2022a) Dualsc: Automatic generation and summarization of shellcode via transformer and dual learning. arXiv:2202.09785
– reference: Mahmud T, Hasan KA, Ahmed M, Chak THC (2015) A rule based approach for nlp based query processing. In: 2015 2nd International conference on electrical information and communication technologies (EICT). IEEE, pp 78–82
– reference: Guo D, Ren S, Lu S, Feng Z, Tang D, Liu S, Zhou L, Duan N, Svyatkovskiy A, Fu S et al (2021) Graphcodebert: Pre-training code representations with data flow. In: ICLR
– reference: HussainYHuangZZhouYWangSDeep transfer learning for source code modelingInt J Softw Eng Knowl Eng2020300564966810.1142/S0218194020500230
– reference: Lu S, Guo D, Ren S, Huang J, Svyatkovskiy A, Blanco A, Clement C, Drain D, Jiang D, Tang D et al. (2021a) Codexglue: A machine learning benchmark dataset for code understanding and generation. arXiv:2102.04664
– reference: Yang G, Chen X, Cao J, Xu S, Cui Z, Yu C, Liu K (2021a) Comformer: Code comment generation via transformer and fusion method-based hybrid code representation. In: 2021 8th International conference on dependable systems and their applications (DSA). IEEE, pp 30–41
– reference: Mou L, Men R, Li G, Zhang L, Jin Z (2015) On end-to-end program generation from user intention by deep neural networks. arXiv:1510.07211
– reference: Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30
– reference: Huang J, Wang Y, Wang Y, Dong Y, Xiao Y (2021) Relation aware semi-autoregressive semantic parsing for nl2sql. arXiv:2108.00804
– reference: Bogin B, Berant J, Gardner M (2019) Representing schema structure with graph neural networks for text-to-sql parsing. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. pp 4560–4565
– reference: Wilcoxon F (1992) Individual comparisons by ranking methods. In: Breakthroughs in statistics. Springer, pp 196–202
– reference: Hayati SA, Olivier R, Avvaru P, Yin P, Tomasic A, Neubig G (2018) Retrieval-based neural code generation. In: Proceedings of the 2018 conference on empirical methods in natural language processing. pp 925–930
– reference: Le H, Wang Y, Gotmare AD, Savarese S, Hoi SC (2022) Coderl: Mastering code generation through pretrained models and deep reinforcement learning. arXiv:2207.01780
– reference: Yang G, Zhou Y, Chen X, Yu C (2021b) Fine-grained pseudo-code generation method via code feature extraction and transformer. In: 2021 28th Asia-pacific software engineering conference (APSEC). IEEE, pp 213–222
– reference: Lloyd JW (1994) Practical advtanages of declarative programming. In: GULP-PRODE (1). pp 18–30
– reference: Wang C, Yang Y, Gao C, Peng Y, Zhang H, Lyu MR (2022a) No more fine-tuning? an experimental evaluation of prompt tuning in code intelligence. In: Proceedings of the 30th ACM joint European software engineering conference and symposium on the foundations of software engineering. pp 382–394
– reference: Xie R, Ye W, Sun J, Zhang S (2021) Exploiting method names to improve code summarization: A deliberation multi-task learning approach. In: 2021 IEEE/ACM 29th international conference on program comprehension (ICPC). IEEE, pp 138–148
– reference: Dahl DA, Bates M, Brown MK, Fisher WM, Hunicke-Smith K, Pallett DS, Pao C, Rudnicky A, Shriberg E (1994) Expanding the scope of the atis task: The atis-3 corpus. In: Human language technology: proceedings of a workshop held at Plainsboro, New Jersey, March 8-11, 1994
– reference: SunZZhuQXiongYSunYMouLZhangLTreegen: A tree-based transformer architecture for code generationProc AAAI Conf Art Intell20203489848991
– reference: Ahmad W, Chakraborty S, Ray B, Chang KW (2021) Unified pre-training for program understanding and generation. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. pp 2655–2668
– reference: Yu T, Zhang R, Yang K, Yasunaga M, Wang D, Li Z, Ma J, Li I, Yao Q, Roman S et al. (2018b) Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task. In: Proceedings of the 2018 conference on empirical methods in natural language processing. pp 3911–3921
– reference: Popescu AM, Etzioni O, Kautz H (2003) Towards a theory of natural language interfaces to databases. In: Proceedings of the 8th international conference on intelligent user interfaces. pp 149–157
– reference: Yang G, Zhou Y, Chen X, Zhang X, Han T, Chen T (2022c) Exploitgen: Template-augmented exploit code generation based on codebert. J Syst Softw 111577
– reference: Yu T, Zhang R, Er H, Li S, Xue E, Pang B, Lin XV, Tan YC, Shi T, Li Z et al. (2019a) Cosql: A conversational text-to-sql challenge towards cross-domain natural language interfaces to databases. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP). pp 1962–1979
– reference: Liu F, Li J, Zhang L (2023a) Syntax and domain aware model for unsupervised program translation. arXiv:2302.03908
– reference: LinXVSocherRXiongCBridging textual and tabular data for cross-domain text-to-sql semantic parsingFindings of the Association for Computational Linguistics: EMNLP2020202048704888
– reference: RadfordAWuJChildRLuanDAmodeiDSutskeverILanguage models are unsupervised multitask learnersOpenAI blog2019189
– reference: Yu T, Zhang R, Yasunaga M, Tan YC, Lin XV, Li S, Er H, Li I, Pang B, Chen T et al (2019b) Sparc: Cross-domain semantic parsing in context. In: Proceedings of the 57th annual meeting of the association for computational linguistics. pp 4511–4523
– reference: Eghbali A, Pradel M (2022) Crystalbleu: precisely and efficiently measuring the similarity of code. In: Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings. pp 341–342
– reference: HussainYHuangZZhouYImproving source code suggestion with code embedding and enhanced convolutional long short-term memoryIET Softw202115319921310.1049/sfw2.12017
– reference: Gifford DK, Lucassen JM (1986) Integrating functional and imperative programming. In: Proceedings of the 1986 ACM conference on LISP and functional programming. pp 28–38
– reference: Wang B, Shin R, Liu X, Polozov O, Richardson M (2020) Rat-sql: Relation-aware schema encoding and linking for text-to-sql parsers. In: Proceedings of the 58th annual meeting of the association for computational linguistics. pp 7567–7578
– reference: Yang G, Chen X, Zhou Y, Yu C (2022b) Dualsc: Automatic generation and summarization of shellcode via transformer and dual learning. In: IEEE international conference on software analysis, evolution and reengineering, SANER 2022, Honolulu, HI, USA, March 15-18, 2022. IEEE, pp 361–372. https://doi.org/10.1109/SANER53432.2022.00052
SubjectTerms Algorithms
Compilers
Computer Science
Constraint modelling
Decoding
Interpreters
Programming Languages
Representations
Software Engineering/Programming and Operating Systems
Syntax
Title A syntax-guided multi-task learning approach for Turducken-style code generation
URI https://link.springer.com/article/10.1007/s10664-023-10372-1
https://www.proquest.com/docview/2877034240
Volume 28