Learning Deep Semantics for Test Completion
| Published in: | Proceedings / International Conference on Software Engineering, pp. 2111-2123 |
|---|---|
| Main authors: | Nie, Pengyu; Banerjee, Rahul; Li, Junyi Jessy; Mooney, Raymond J.; Gligoric, Milos |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 1 May 2023 |
| Subjects: | |
| ISSN: | 1558-1225 |
| Online access: | Get full text |
| Abstract | Writing tests is a time-consuming yet essential task during software development. We propose to leverage recent advances in deep learning for text and code generation to assist developers in writing tests. We formalize the novel task of test completion: automatically completing the next statement in a test method based on the context of prior statements and the code under test. We develop TeCo, a deep learning model that uses code semantics for test completion. The key insight underlying TeCo is that predicting the next statement in a test method requires reasoning about code execution, which is hard to do with only the syntax-level data that existing code completion models use. TeCo extracts and uses six kinds of code semantics data, including the execution result of prior statements and the execution context of the test method. To provide a testbed for this new task, as well as to evaluate TeCo, we collect a corpus of 130,934 test methods from 1,270 open-source Java projects. Our results show that TeCo achieves an exact-match accuracy of 18%, which is 29% higher than the best baseline using syntax-level data only. When measuring the functional correctness of the generated next statement, TeCo generates runnable code in 29% of the cases, compared to 18% for the best baseline. Moreover, TeCo is significantly better than prior work on test oracle generation. |
|---|---|
| Author | Gligoric, Milos; Li, Junyi Jessy; Banerjee, Rahul; Nie, Pengyu; Mooney, Raymond J. |
| Author_xml | 1. Nie, Pengyu (pynie@utexas.edu), UT, Austin, USA; 2. Banerjee, Rahul (rahulb517@utexas.edu), UT, Austin, USA; 3. Li, Junyi Jessy (jessy@utexas.edu), UT, Austin, USA; 4. Mooney, Raymond J. (mooney@utexas.edu), UT, Austin, USA; 5. Gligoric, Milos (gligoric@utexas.edu), UT, Austin, USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ICSE48619.2023.00178 |
| Discipline | Computer Science |
| EISBN | 9781665457019 1665457015 |
| EISSN | 1558-1225 |
| EndPage | 2123 |
| ExternalDocumentID | 10172620 |
| Genre | orig-research |
| ISICitedReferencesCount | 26 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-May |
| PublicationDateYYYYMMDD | 2023-05-01 |
| PublicationDate_xml | – month: 05 year: 2023 text: 2023-May |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings / International Conference on Software Engineering |
| PublicationTitleAbbrev | ICSE |
| PublicationYear | 2023 |
| Publisher | IEEE |
| StartPage | 2111 |
| SubjectTerms | Codes; Deep learning; deep neural networks; Java; Measurement; Predictive models; programming language semantics; Semantics; test completion; Writing |
| Title | Learning Deep Semantics for Test Completion |
| URI | https://ieeexplore.ieee.org/document/10172620 |