Interpretable Code Summarization
Code summarization is a process of creating a readable natural language from programming source codes. Code summarization has become a popular research topic for software maintenance, code generation, and code recovery. Existing code summarization methods follow the encoding/decoding approach and us...
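The full abstract in this record outlines a pipeline that builds an abstract syntax tree (AST) from source tokens and then ranks candidate parse trees with a PageRank-style algorithm. The sketch below is only an illustration of those two ideas, not the authors' implementation: it uses Python's standard `ast` module for the tree, and the function names (`ast_node_types`, `pagerank`), the damping factor, and the toy link graph between candidate trees are all assumptions made for this example.

```python
import ast


def ast_node_types(source: str) -> list[str]:
    """Parse `source` and return the AST node type names in walk order."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def pagerank(links: dict[str, list[str]],
             damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Rank nodes of a small directed graph by power iteration.

    `links` maps each node to the nodes it points to. Every target is
    assumed to also appear as a key (a hypothetical input convention).
    """
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in links.items():
            # A node without outgoing links spreads its rank evenly.
            receivers = targets if targets else nodes
            share = damping * rank[node] / len(receivers)
            for target in receivers:
                new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    print(ast_node_types("def add(a, b):\n    return a + b"))
    # Hypothetical similarity links between three candidate parse trees.
    candidates = {"t1": ["t2"], "t2": ["t1"], "t3": ["t1"]}
    scores = pagerank(candidates)
    print(max(scores, key=scores.get))
```

In a setting like the one the abstract describes, the graph nodes would be candidate parse trees and the highest-ranked tree would be the one used to generate the comment; how the links between trees are defined is not stated in this record.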
| Published in: | IEEE Transactions on Reliability, Vol. 74, No. 1, pp. 2280-2289 |
|---|---|
| Main authors: | Kamal, Md Sarwar; Nimmy, Sonia Farhana; Dey, Nilanjan |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 1 March 2025 |
| Subjects: | |
| ISSN: | 0018-9529, 1558-1721 |
| Online access: | Get full text |
| Abstract | Code summarization is a process of creating a readable natural language from programming source codes. Code summarization has become a popular research topic for software maintenance, code generation, and code recovery. Existing code summarization methods follow the encoding/decoding approach and use various machine learning techniques to generate natural language from source codes. Although most of these methods are state of the art, it is difficult to understand the complex encoding and decoding process to map the tokens with natural language words. Therefore, these coding and decoding approaches are treated as opaque models (black box). This research proposes explainable AI methods that overcome the black box features for the token mapping in code summarization process. Here, we created an abstract syntax tree (AST) from the tokens of the source code. We then embedded the AST into natural language words using a bilingual statistical probability approach to generate possible statistical parse trees. We applied a page rank algorithm among the parse trees to rank the trees. From the best-ranked tree, we generate the comment for the corresponding code snippet. To explain our code generation method, we used Takagi-Sugeno fuzzy approach, layerwise relevance propagation and a hidden Markov model. These approaches make our method trustworthy and understandable to humans to understand the process of source code token mapping with natural language words. |
|---|---|
| Author | Kamal, Md Sarwar; Dey, Nilanjan; Nimmy, Sonia Farhana |
| Author_xml | 1. Kamal, Md Sarwar (ORCID 0000-0002-1945-821X; mdsarwar.kamal@uts.edu.au), School of Computer Science, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, Australia; 2. Nimmy, Sonia Farhana (ORCID 0000-0003-1788-1329; s.nimmy@adfa.edu.au), Faculty of Economics and Business, University of New South Wales, Sydney, NSW, Australia; 3. Dey, Nilanjan (ORCID 0000-0001-8437-498X; nilanjan.dey@tint.edu.in), Department of Computer Science and Engineering, Techno International New Town, Chakpachuria, West Bengal, India |
| CODEN | IERQAD |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025 |
| DOI | 10.1109/TR.2024.3392876 |
| Discipline | Engineering |
| EISSN | 1558-1721 |
| EndPage | 2289 |
| Genre | orig-research |
| ISSN | 0018-9529 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0001-8437-498X 0000-0003-1788-1329 0000-0002-1945-821X |
| PageCount | 10 |
| PublicationDate | 2025-03-01 |
| PublicationPlace | New York |
| PublicationTitle | IEEE transactions on reliability |
| PublicationTitleAbbrev | TR |
| PublicationYear | 2025 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 2280 |
| SubjectTerms | Abstract syntax tree (AST) Algorithms Black boxes code summarization Codes Decoding Encoding-Decoding explainable a Explainable artificial intelligence hidden Markov model (HMM) Hidden Markov models Language LRP Machine learning Mapping Markov chains Natural language Natural languages page rank Recurrent neural networks Source code Source coding Statistical analysis Syntactics Takagi–Sugeno (T–S) fuzzy Words (language) |
| Title | Interpretable Code Summarization |
| URI | https://ieeexplore.ieee.org/document/10530504 https://www.proquest.com/docview/3174162346 |
| Volume | 74 |