From Commit Message Generation to History-Aware Commit Message Completion


Bibliographic details
Published in: IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 723-735
Main authors: Eliseeva, Aleksandra; Sokolov, Yaroslav; Bogomolov, Egor; Golubev, Yaroslav; Dig, Danny; Bryksin, Timofey
Format: Conference paper
Language: English
Publication details: IEEE, 11 September 2023
ISSN: 2643-1572
Abstract Commit messages are crucial to software development, allowing developers to track changes and collaborate effectively. Despite their utility, most commit messages lack important information since writing high-quality commit messages is tedious and time-consuming. The active research on commit message generation (CMG) has not yet led to wide adoption in practice. We argue that if we could shift the focus from commit message generation to commit message completion and use previous commit history as additional context, we could significantly improve the quality and the personal nature of the resulting commit messages. In this paper, we propose and evaluate both of these novel ideas. Since the existing datasets lack historical data, we collect and share a novel dataset called CommitChronicle, containing 10.7M commits across 20 programming languages. We use this dataset to evaluate the completion setting and the usefulness of the historical context for state-of-the-art CMG models and GPT-3.5-turbo. Our results show that in some contexts, commit message completion shows better results than generation, and that while in general GPT-3.5-turbo performs worse, it shows potential for long and detailed messages. As for the history, the results show that historical information improves the performance of CMG models in the generation task, and the performance of GPT-3.5-turbo in both generation and completion.
Author Eliseeva, Aleksandra
Dig, Danny
Sokolov, Yaroslav
Golubev, Yaroslav
Bogomolov, Egor
Bryksin, Timofey
Author_xml – sequence: 1
  givenname: Aleksandra
  surname: Eliseeva
  fullname: Eliseeva, Aleksandra
  email: alexandra.eliseeva@jetbrains.com
  organization: JetBrains Research, Republic of Serbia
– sequence: 2
  givenname: Yaroslav
  surname: Sokolov
  fullname: Sokolov, Yaroslav
  email: yaroslav.sokolov@jetbrains.com
  organization: JetBrains, Germany
– sequence: 3
  givenname: Egor
  surname: Bogomolov
  fullname: Bogomolov, Egor
  email: egor.bogomolov@jetbrains.com
  organization: JetBrains Research, Republic of Cyprus
– sequence: 4
  givenname: Yaroslav
  surname: Golubev
  fullname: Golubev, Yaroslav
  email: yaroslav.golubev@jetbrains.com
  organization: JetBrains Research, Republic of Serbia
– sequence: 5
  givenname: Danny
  surname: Dig
  fullname: Dig, Danny
  email: danny.dig@jetbrains.com
  organization: JetBrains Research, University of Colorado Boulder, United States
– sequence: 6
  givenname: Timofey
  surname: Bryksin
  fullname: Bryksin, Timofey
  email: timofey.bryksin@jetbrains.com
  organization: JetBrains Research, Republic of Cyprus
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ASE56229.2023.00078
Discipline Computer Science
EISBN 9798350329964
EISSN 2643-1572
EndPage 735
ExternalDocumentID 10298339
Genre orig-research
ISICitedReferencesCount 15
IsPeerReviewed false
IsScholarly true
Language English
PageCount 13
PublicationCentury 2000
PublicationDate 2023-Sept.-11
PublicationDateYYYYMMDD 2023-09-11
PublicationDate_xml – month: 09
  year: 2023
  text: 2023-Sept.-11
  day: 11
PublicationDecade 2020
PublicationTitle IEEE/ACM International Conference on Automated Software Engineering : [proceedings]
PublicationTitleAbbrev ASE
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 723
SubjectTerms Computer languages
Filtering
Focusing
History
Reliability
Software
Writing
Title From Commit Message Generation to History-Aware Commit Message Completion
URI https://ieeexplore.ieee.org/document/10298339