Code to Comment "Translation": Data, Metrics, Baselining & Evaluation
| Published in: | 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 746-757 |
|---|---|
| Main authors: | David Gros, Hariharan Sezhiyan, Prem Devanbu, Zhou Yu |
| Format: | Conference paper |
| Language: | English |
| Publisher details: | ACM, 1 September 2020 |
| ISSN: | 2643-1572 |
| Abstract | The relationship of comments to code, and in particular the task of generating useful comments given the code, has long been of interest. The earliest approaches were based on strong syntactic theories of comment structures and relied on textual templates. More recently, researchers have applied deep-learning methods to this task: specifically, trainable generative translation models, which are known to work very well for natural language translation (e.g., from German to English). We carefully examine the underlying assumption here: that the task of generating comments sufficiently resembles the task of translating between natural languages, and so similar models and evaluation metrics could be used. We analyze several recent code-comment datasets for this task: CODENN, DEEPCOM, FUNCOM, and Docstring. We compare them with WMT19, a standard dataset frequently used to train state-of-the-art natural language translators. We found some interesting differences between the code-comment data and the WMT19 natural language data. Next, we describe and conduct some studies to calibrate BLEU (which is commonly used as a measure of comment quality) using "affinity pairs" of methods: from different projects, in the same project, in the same class, etc. Our study suggests that the current performance on some datasets might need to be improved substantially. We also argue that fairly naive information retrieval (IR) methods do well enough at this task to be considered a reasonable baseline. Finally, we make some suggestions on how our findings might be used in future research in this area. |
|---|---|
| Author | Gros, David; Sezhiyan, Hariharan; Devanbu, Prem; Yu, Zhou |
| Author_xml | 1. David Gros (dgros@ucdavis.edu), University of California, Davis; 2. Hariharan Sezhiyan (hsezhiyan@ucdavis.edu), University of California, Davis; 3. Prem Devanbu (devanbu@ucdavis.edu), University of California, Davis; 4. Zhou Yu (joyu@ucdavis.edu), University of California, Davis |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1145/3324884.3416546 |
| Discipline | Computer Science |
| EISBN | 9781450367684 1450367682 |
| EISSN | 2643-1572 |
| EndPage | 757 |
| ExternalDocumentID | 9286030 |
| Genre | orig-research |
| ISICitedReferencesCount | 54 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| OpenAccessLink | https://ieeexplore.ieee.org/document/9286030 |
| PageCount | 12 |
| PublicationCentury | 2000 |
| PublicationDate | 2020-Sept. |
| PublicationDateYYYYMMDD | 2020-09-01 |
| PublicationDate_xml | – month: 09 year: 2020 text: 2020-Sept. |
| PublicationDecade | 2020 |
| PublicationTitle | 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE) |
| PublicationTitleAbbrev | ASE |
| PublicationYear | 2020 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| StartPage | 746 |
| SubjectTerms | Calibration; Current measurement; Data models; Information retrieval; Natural languages; Task analysis; US Government |
| Title | Code to Comment "Translation": Data, Metrics, Baselining & Evaluation |
| URI | https://ieeexplore.ieee.org/document/9286030 |
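
The abstract's two technical claims, that BLEU is the usual measure of comment quality and that a fairly naive information retrieval method is a reasonable baseline, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the whitespace tokenizer, the Jaccard similarity used for retrieval, the simplified add-one-smoothed BLEU (up to bigrams), and the toy corpus are all hypothetical stand-ins.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def ngrams(tokens, n):
    # Multiset of all n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    # Simplified sentence-level BLEU: geometric mean of clipped n-gram
    # precisions (add-one smoothed) times a brevity penalty.
    cand, ref = tokenize(candidate), tokenize(reference)
    if not cand:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        log_precision += math.log((overlap + 1) / (total + 1))
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_precision / max_n)


def jaccard(a, b):
    # Token-set overlap between two token lists.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def retrieve_comment(query_code, corpus):
    # Naive IR baseline: return the comment attached to the most similar
    # training method, with no generation step at all.
    query = tokenize(query_code)
    best = max(corpus, key=lambda pair: jaccard(query, tokenize(pair[0])))
    return best[1]


# Toy (code, comment) pairs, invented for this example.
corpus = [
    ("def add ( a , b ) : return a + b",
     "returns the sum of two numbers"),
    ("def read_file ( path ) : return open ( path ) . read ( )",
     "reads a file and returns its contents"),
]
query = "def add_three ( a , b , c ) : return a + b + c"
predicted = retrieve_comment(query, corpus)
score = simple_bleu(predicted, "returns the sum of three numbers")
```

Scoring the retrieved comment against a reference in this way gives a floor that a learned model would be expected to beat. In the same spirit, scoring the comments of two methods drawn from the same class or project against each other gives a rough sense of what BLEU value an "affinity pair" achieves without any model.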