From Commit Message Generation to History-Aware Commit Message Completion
| Published in: | IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 723-735 |
|---|---|
| Main authors: | Eliseeva, Aleksandra; Sokolov, Yaroslav; Bogomolov, Egor; Golubev, Yaroslav; Dig, Danny; Bryksin, Timofey |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 11 September 2023 |
| ISSN: | 2643-1572 |
| Abstract | Commit messages are crucial to software development, allowing developers to track changes and collaborate effectively. Despite their utility, most commit messages lack important information since writing high-quality commit messages is tedious and time-consuming. The active research on commit message generation (CMG) has not yet led to wide adoption in practice. We argue that if we could shift the focus from commit message generation to commit message completion and use previous commit history as additional context, we could significantly improve the quality and the personal nature of the resulting commit messages. In this paper, we propose and evaluate both of these novel ideas. Since the existing datasets lack historical data, we collect and share a novel dataset called CommitChronicle, containing 10.7M commits across 20 programming languages. We use this dataset to evaluate the completion setting and the usefulness of the historical context for state-of-the-art CMG models and GPT-3.5-turbo. Our results show that in some contexts, commit message completion shows better results than generation, and that while in general GPT-3.5-turbo performs worse, it shows potential for long and detailed messages. As for the history, the results show that historical information improves the performance of CMG models in the generation task, and the performance of GPT-3.5-turbo in both generation and completion. |
|---|---|
| Author | Eliseeva, Aleksandra; Dig, Danny; Sokolov, Yaroslav; Golubev, Yaroslav; Bogomolov, Egor; Bryksin, Timofey |
| Author_xml | 1. Eliseeva, Aleksandra (alexandra.eliseeva@jetbrains.com), JetBrains Research, Republic of Serbia; 2. Sokolov, Yaroslav (yaroslav.sokolov@jetbrains.com), JetBrains, Germany; 3. Bogomolov, Egor (egor.bogomolov@jetbrains.com), JetBrains Research, Republic of Cyprus; 4. Golubev, Yaroslav (yaroslav.golubev@jetbrains.com), JetBrains Research, Republic of Serbia; 5. Dig, Danny (danny.dig@jetbrains.com), JetBrains Research, University of Colorado Boulder, United States; 6. Bryksin, Timofey (timofey.bryksin@jetbrains.com), JetBrains Research, Republic of Cyprus |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ASE56229.2023.00078 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE/IET Electronic Library; IEEE Proceedings Order Plans (POP All) 1998-Present |
| Discipline | Computer Science |
| EISBN | 9798350329964 |
| EISSN | 2643-1572 |
| EndPage | 735 |
| ExternalDocumentID | 10298339 |
| Genre | orig-research |
| ISICitedReferencesCount | 15 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | 11 September 2023 |
| PublicationDateYYYYMMDD | 2023-09-11 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE/ACM International Conference on Automated Software Engineering : [proceedings] |
| PublicationTitleAbbrev | ASE |
| PublicationYear | 2023 |
| Publisher | IEEE |
| StartPage | 723 |
| SubjectTerms | Computer languages; Filtering; Focusing; History; Reliability; Software; Writing |
| Title | From Commit Message Generation to History-Aware Commit Message Completion |
| URI | https://ieeexplore.ieee.org/document/10298339 |