iASTMapper: An Iterative Similarity-Based Abstract Syntax Tree Mapping Algorithm

Abstract syntax tree (AST) mapping algorithms are widely used to locate the code changes in a file revision by mapping the AST nodes of the source code before and after the code changes. A recent differential testing of three state-of-the-art AST mapping algorithms, i.e., GumTree, MTDiff, and IJM,...

Bibliographic details
Published in: IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 863-874
Main authors: Zhang, Neng; Chen, Qinde; Zheng, Zibin; Zou, Ying
Format: Conference paper
Language: English
Published: IEEE, 11 September 2023
ISSN: 2643-1572
Abstract Abstract syntax tree (AST) mapping algorithms are widely used to locate the code changes in a file revision by mapping the AST nodes of the source code before and after the code changes. A recent differential testing study of three state-of-the-art AST mapping algorithms, i.e., GumTree, MTDiff, and IJM, reveals that the algorithms generate inaccurate mappings for a considerable number of file revisions. We find that the inaccurate mappings can be caused by mutual influence: the mappings of lower-level AST nodes (e.g., tokens) affect the mappings of higher-level AST nodes (e.g., statements) and vice versa. This mutual influence is rarely considered by existing algorithms. In this paper, we propose an algorithm, called iASTMapper, that iteratively maps two ASTs based on the similarities between AST nodes. Given a file revision, we extract three types of AST nodes at different levels of program structure (i.e., tokens, statements, and inner-statements) from the ASTs of the two source code files. We first build mappings of the unchanged statements and inner-statements. Then, we use an iterative method to map the remaining unmapped nodes. For each of the three types of nodes, we iteratively map the nodes based on their similarities, measured using heuristic rules. We further use an iterative mechanism to connect the three iterative mapping processes by considering the mutual influence between the mappings of different types of nodes. Finally, a series of code edit actions is generated from the node mappings to help users understand and locate the code changes during revisions. We conduct experiments to compare iASTMapper with three baselines, i.e., GumTree, MTDiff, and IJM, by automatically evaluating 210,997 file revisions from ten Java projects. Furthermore, we manually evaluate the correctness of the code edit actions generated for 200 file revisions with 12 evaluators.
The results demonstrate that iASTMapper outperforms the baselines. iASTMapper generates code edit actions that are at least 1.29% shorter than those of the baselines, with a high accuracy of 96.23%.
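The mutual influence described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the Dice score, the whitespace tokenizer, the 0.7 threshold, and the names `iterative_map`, `map_statements`, and `map_tokens` are all assumptions made for illustration, and inner-statements are left out. Statements are mapped by token similarity, token-level renames are then read off the mapped statements, and those renames feed back into the next round of statement mapping until nothing new is found.

```python
def dice(a, b):
    """Dice coefficient over the distinct tokens of two statements."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def map_statements(old, new, stmt_map, renames, threshold):
    """Greedily map unmapped statements, best-scoring pair first."""
    used = set(stmt_map.values())
    candidates = []
    for i, s in enumerate(old):
        if i in stmt_map:
            continue
        # Token-level knowledge (renames) influences statement similarity.
        renamed = [renames.get(tok, tok) for tok in s]
        for j, t in enumerate(new):
            if j not in used:
                score = dice(renamed, t)
                if score >= threshold:
                    candidates.append((score, i, j))
    added = False
    for _, i, j in sorted(candidates, key=lambda c: (-c[0], c[1], c[2])):
        if i not in stmt_map and j not in used:
            stmt_map[i] = j
            used.add(j)
            added = True
    return added


def map_tokens(old, new, stmt_map, renames):
    """Statement-level mappings influence tokens: align tokens by position
    inside mapped statements of equal length and record renames."""
    added = False
    for i, j in stmt_map.items():
        if len(old[i]) != len(new[j]):
            continue
        for a, b in zip(old[i], new[j]):
            if a != b and a not in renames:
                renames[a] = b
                added = True
    return added


def iterative_map(old_src, new_src, threshold=0.7):
    old = [line.split() for line in old_src]
    new = [line.split() for line in new_src]
    stmt_map, renames, used = {}, {}, set()
    # Step 1: map unchanged (identical) statements first.
    for i, s in enumerate(old):
        for j, t in enumerate(new):
            if j not in used and s == t:
                stmt_map[i] = j
                used.add(j)
                break
    # Step 2: alternate the two levels until neither adds a mapping.
    while True:
        s_added = map_statements(old, new, stmt_map, renames, threshold)
        t_added = map_tokens(old, new, stmt_map, renames)
        if not (s_added or t_added):
            return stmt_map, renames


old_src = ["int total = 0", "total = total + x", "return total"]
new_src = ["int sum = 0", "sum = sum + x", "return sum"]
stmt_map, renames = iterative_map(old_src, new_src)
print(stmt_map, renames)
```

In this example the first round maps only the first two statements, because `return total` and `return sum` score 0.5. The rename `total` to `sum` found at the token level raises that score to 1.0 in the second round, so the third statement is mapped as well.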
Author_xml – sequence: 1
  givenname: Neng
  surname: Zhang
  fullname: Zhang, Neng
  email: zhangn279@mail.sysu.edu.cn
  organization: Sun Yat-sen University, China
– sequence: 2
  givenname: Qinde
  surname: Chen
  fullname: Chen, Qinde
  email: chenqd6@mail2.sysu.edu.cn
  organization: Sun Yat-sen University, China
– sequence: 3
  givenname: Zibin
  surname: Zheng
  fullname: Zheng, Zibin
  email: zhzibin@mail.sysu.edu.cn
  organization: Sun Yat-sen University, China
– sequence: 4
  givenname: Ying
  surname: Zou
  fullname: Zou, Ying
  email: ying.zou@queensu.ca
  organization: Queen's University, Canada
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ASE56229.2023.00178
Discipline Computer Science
EISBN 9798350329964
EISSN 2643-1572
EndPage 874
ExternalDocumentID 10298413
Genre orig-research
ISICitedReferencesCount 3
IsPeerReviewed false
IsScholarly true
Language English
PageCount 12
PublicationDate 2023-Sept.-11
PublicationDateYYYYMMDD 2023-09-11
PublicationTitle IEEE/ACM International Conference on Automated Software Engineering : [proceedings]
PublicationTitleAbbrev ASE
PublicationYear 2023
Publisher IEEE
StartPage 863
SubjectTerms AST mapping
Code change analysis
Codes
Iterative algorithms
Java
Software algorithms
Source coding
Syntactics
Testing
Title iASTMapper: An Iterative Similarity-Based Abstract Syntax Tree Mapping Algorithm
URI https://ieeexplore.ieee.org/document/10298413