FIRA: Fine-Grained Graph-Based Code Change Representation for Automated Commit Message Generation

Commit messages summarize code changes of each commit in nat-ural language, which help developers understand code changes without digging into detailed implementations and play an essen-tial role in comprehending software evolution. To alleviate human efforts in writing commit messages, researchers...

Full description

Saved in:
Bibliographic Details
Published in:2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE) pp. 970 - 981
Main Authors: Dong, Jinhao, Lou, Yiling, Zhu, Qihao, Sun, Zeyu, Li, Zhilin, Zhang, Wenjie, Hao, Dan
Format: Conference Proceeding
Language:English
Published: ACM 01.05.2022
Subjects:
ISSN:1558-1225
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Commit messages summarize code changes of each commit in nat-ural language, which help developers understand code changes without digging into detailed implementations and play an essen-tial role in comprehending software evolution. To alleviate human efforts in writing commit messages, researchers have proposed var-ious automated techniques to generate commit messages, including template-based, information retrieval-based, and learning-based techniques. Although promising, previous techniques have limited effectiveness due to their coarse-grained code change representations. This work proposes a novel commit message generation technique, FIRA, which first represents code changes via fine-grained graphs and then learns to generate commit messages automati-cally. Different from previous techniques, FIRA represents the code changes with fine-grained graphs, which explicitly describe the code edit operations between the old version and the new version, and code tokens at different granularities (i.e., sub-tokens and integral tokens). Based on the graph-based representation, FIRA generates commit messages by a generation model, which includes a graph-neural-network-based encoder and a transformer-based decoder. To make both sub-tokens and integral tokens as available ingredients for commit message generation, the decoder is further incorporated with a novel dual copy mechanism. We further per-form an extensive study to evaluate the effectiveness of FIRA. Our quantitative results show that FIRA outperforms state-of-the-art techniques in terms of BLEU, ROUGE-L, and METEOR; and our ablation analysis further shows that major components in our technique both positively contribute to the effectiveness of FIRA. In addition, we further perform a human study to evaluate the quality of generated commit messages from the perspective of developers, and the results consistently show the effectiveness of FIRA over the compared techniques.
ISSN:1558-1225
DOI:10.1145/3510003.3510069