Learning Deep Semantics for Test Completion

Writing tests is a time-consuming yet essential task during software development. We propose to leverage recent advances in deep learning for text and code generation to assist developers in writing tests. We formalize the novel task of test completion to automatically complete the next statement in...

Bibliographic details
Published in: Proceedings / International Conference on Software Engineering, pp. 2111-2123
Main authors: Nie, Pengyu; Banerjee, Rahul; Li, Junyi Jessy; Mooney, Raymond J.; Gligoric, Milos
Format: Conference paper
Language: English
Published: IEEE, 1 May 2023
ISSN:1558-1225
Abstract Writing tests is a time-consuming yet essential task during software development. We propose to leverage recent advances in deep learning for text and code generation to assist developers in writing tests. We formalize the novel task of test completion to automatically complete the next statement in a test method based on the context of prior statements and the code under test. We develop TeCo, a deep learning model using code semantics for test completion. The key insight underlying TeCo is that predicting the next statement in a test method requires reasoning about code execution, which is hard to do with only the syntax-level data that existing code completion models use. TeCo extracts and uses six kinds of code semantics data, including the execution result of prior statements and the execution context of the test method. To provide a testbed for this new task, as well as to evaluate TeCo, we collect a corpus of 130,934 test methods from 1,270 open-source Java projects. Our results show that TeCo achieves an exact-match accuracy of 18, which is 29% higher than the best baseline using syntax-level data only. When measuring functional correctness of the generated next statement, TeCo can generate runnable code in 29% of the cases compared to 18% obtained by the best baseline. Moreover, TeCo is significantly better than prior work on test oracle generation.
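The exact-match accuracy reported in the abstract can be pictured with a minimal sketch: the share of predicted next statements that are identical to the statement the developer actually wrote. The function names, the whitespace normalization, and the sample Java statements below are assumptions made for illustration only; the record does not describe how the paper tokenizes or compares statements.

```python
def normalize(statement: str) -> str:
    """Collapse runs of whitespace (an assumed normalization, not the paper's)."""
    return " ".join(statement.split())


def exact_match_accuracy(predictions, references):
    """Return the percentage of predictions identical to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Hypothetical next statements of Java test methods, invented for this sketch.
references = [
    "assertEquals(3, stack.size());",
    "assertTrue(list.isEmpty());",
]
predictions = [
    "assertEquals(3,  stack.size());",  # same statement apart from spacing
    "assertFalse(list.isEmpty());",     # different assertion, counts as a miss
]
score = exact_match_accuracy(predictions, references)
```

Exact match is a strict measure: a prediction that differs textually but behaves the same is still counted as a miss, which may be why the abstract also reports how often the generated statement is runnable.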
Author Gligoric, Milos
Li, Junyi Jessy
Banerjee, Rahul
Nie, Pengyu
Mooney, Raymond J.
Author_xml – sequence: 1
  givenname: Pengyu
  surname: Nie
  fullname: Nie, Pengyu
  email: pynie@utexas.edu
  organization: UT,Austin,USA
– sequence: 2
  givenname: Rahul
  surname: Banerjee
  fullname: Banerjee, Rahul
  email: rahulb517@utexas.edu
  organization: UT,Austin,USA
– sequence: 3
  givenname: Junyi Jessy
  surname: Li
  fullname: Li, Junyi Jessy
  email: jessy@utexas.edu
  organization: UT,Austin,USA
– sequence: 4
  givenname: Raymond J.
  surname: Mooney
  fullname: Mooney, Raymond J.
  email: mooney@utexas.edu
  organization: UT,Austin,USA
– sequence: 5
  givenname: Milos
  surname: Gligoric
  fullname: Gligoric, Milos
  email: gligoric@utexas.edu
  organization: UT,Austin,USA
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/ICSE48619.2023.00178
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library (IEL) (UW System Shared)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9781665457019
1665457015
EISSN 1558-1225
EndPage 2123
ExternalDocumentID 10172620
Genre orig-research
ID FETCH-LOGICAL-a286t-a64b6aa666bce20ce453c1be69f53c1c63b86d1351c24a5b653f45f521904c143
IEDL.DBID RIE
ISICitedReferencesCount 26
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001032629800169&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:09:24 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a286t-a64b6aa666bce20ce453c1be69f53c1c63b86d1351c24a5b653f45f521904c143
PageCount 13
ParticipantIDs ieee_primary_10172620
PublicationCentury 2000
PublicationDate 2023-May
PublicationDateYYYYMMDD 2023-05-01
PublicationDate_xml – month: 05
  year: 2023
  text: 2023-May
PublicationDecade 2020
PublicationTitle Proceedings / International Conference on Software Engineering
PublicationTitleAbbrev ICSE
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssib051921306
ssj0006499
Score 2.487058
SourceID ieee
SourceType Publisher
StartPage 2111
SubjectTerms Codes
Deep learning
deep neural networks
Java
Measurement
Predictive models
programming language semantics
Semantics
test completion
Writing
Title Learning Deep Semantics for Test Completion
URI https://ieeexplore.ieee.org/document/10172620
WOSCitedRecordID wos001032629800169&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE