iASTMapper: An Iterative Similarity-Based Abstract Syntax Tree Mapping Algorithm
Abstract syntax tree (AST) mapping algorithms are widely used to locate the code changes in a file revision by mapping the AST nodes of the source code before and after the code changes. A recent differential testing of three state-of-the-art AST mapping algorithms, i.e., GumTree, MTDiff, and IJM,...
| Published in: | IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 863-874 |
|---|---|
| Main authors: | Zhang, Neng; Chen, Qinde; Zheng, Zibin; Zou, Ying |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 2023-09-11 |
| ISSN: | 2643-1572 |
| Abstract | Abstract syntax tree (AST) mapping algorithms are widely used to locate the code changes in a file revision by mapping the AST nodes of the source code before and after the code changes. A recent differential testing of three state-of-the-art AST mapping algorithms, i.e., GumTree, MTDiff, and IJM, reveals that the algorithms generate inaccurate mappings for a considerable number of file revisions. We find that the inaccurate mappings could be caused by mutual influence: the mappings of lower-level AST nodes (e.g., tokens) affect the mappings of higher-level AST nodes (e.g., statements) and vice versa. This mutual influence issue is rarely considered by existing algorithms. In this paper, we propose an algorithm, called iASTMapper, that iteratively maps two ASTs based on the similarities between AST nodes. Given a file revision, we extract three types of AST nodes at different levels of program structure (i.e., tokens, statements, and inner-statements) from the ASTs of the two source code files. We first build mappings of the unchanged statements and inner-statements. Then, we use an iterative method to map the remaining unmapped nodes. For each of the three types of nodes, we iteratively map the nodes based on their similarities, measured using heuristic rules. We further use an iterative mechanism to connect the three iterative mapping processes by considering the mutual influence between the mappings of different types of nodes. Finally, a series of code edit actions is generated from the node mappings to help users understand and locate the code changes during revisions. We conduct experiments to compare iASTMapper with three baselines, i.e., GumTree, MTDiff, and IJM, by automatically evaluating 210,997 file revisions from ten Java projects. Furthermore, we manually evaluate the correctness of the code edit actions generated for 200 file revisions with 12 evaluators. The results demonstrate that iASTMapper outperforms the baselines: it generates code edit actions that are at least 1.29% shorter than those of the baselines, with a high accuracy of 96.23%. |
|---|---|
| Author | Chen, Qinde Zou, Ying Zhang, Neng Zheng, Zibin |
| Author_xml | – sequence: 1 givenname: Neng surname: Zhang fullname: Zhang, Neng email: zhangn279@mail.sysu.edu.cn organization: Sun Yat-sen University,China – sequence: 2 givenname: Qinde surname: Chen fullname: Chen, Qinde email: chenqd6@mail2.sysu.edu.cn organization: Sun Yat-sen University,China – sequence: 3 givenname: Zibin surname: Zheng fullname: Zheng, Zibin email: zhzibin@mail.sysu.edu.cn organization: Sun Yat-sen University,China – sequence: 4 givenname: Ying surname: Zou fullname: Zou, Ying email: ying.zou@queensu.ca organization: Queen's University,Canada |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/ASE56229.2023.00178 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9798350329964 |
| EISSN | 2643-1572 |
| EndPage | 874 |
| ExternalDocumentID | 10298413 |
| Genre | orig-research |
| ISICitedReferencesCount | 3 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 12 |
| ParticipantIDs | ieee_primary_10298413 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-Sept.-11 |
| PublicationDateYYYYMMDD | 2023-09-11 |
| PublicationDate_xml | – month: 09 year: 2023 text: 2023-Sept.-11 day: 11 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE/ACM International Conference on Automated Software Engineering : [proceedings] |
| PublicationTitleAbbrev | ASE |
| PublicationYear | 2023 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 863 |
| SubjectTerms | AST mapping; Code change analysis; Codes; Iterative algorithms; Java; Software algorithms; Source coding; Syntactics; Testing |
| Title | iASTMapper: An Iterative Similarity-Based Abstract Syntax Tree Mapping Algorithm |
| URI | https://ieeexplore.ieee.org/document/10298413 |
| hasFullText | 1 |
| inHoldings | 1 |
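
The first two steps described in the abstract (map unchanged statements, then iteratively map the remaining nodes by similarity) can be illustrated with a toy sketch. This is not the iASTMapper implementation: it works on plain statement strings rather than AST nodes, uses `difflib.SequenceMatcher` as a stand-in for the paper's heuristic similarity rules, and leaves out tokens, inner-statements, and the mutual-influence loop. The function names and the `threshold` parameter are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Textual similarity in [0, 1]; an assumed stand-in for heuristic rules."""
    return SequenceMatcher(None, a, b).ratio()


def map_statements(old: list[str], new: list[str],
                   threshold: float = 0.6) -> dict[int, int]:
    """Return a mapping from indices in `old` to indices in `new`."""
    mapping: dict[int, int] = {}
    used_new: set[int] = set()

    # Step 1: map unchanged statements (identical text, first free occurrence).
    for i, stmt in enumerate(old):
        for j, cand in enumerate(new):
            if j not in used_new and stmt == cand:
                mapping[i] = j
                used_new.add(j)
                break

    # Step 2: repeatedly map the most similar unmapped pair
    # until no remaining pair reaches the threshold.
    while True:
        best = None
        for i, stmt in enumerate(old):
            if i in mapping:
                continue
            for j, cand in enumerate(new):
                if j in used_new:
                    continue
                score = similarity(stmt, cand)
                if score >= threshold and (best is None or score > best[0]):
                    best = (score, i, j)
        if best is None:
            return mapping
        _, i, j = best
        mapping[i] = j
        used_new.add(j)


old_file = ["int a = 1;", "foo(a);", "return a;"]
new_file = ["int a = 2;", "return a;", "bar();"]
print(map_statements(old_file, new_file))
```

In this example, `return a;` is mapped in the first step as unchanged, and `int a = 1;` is then paired with `int a = 2;` by similarity. The unmapped leftovers (`foo(a);` in the old file, `bar();` in the new one) would correspond to a delete and an insert edit action.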