Interpretable Code Summarization

Bibliographic details
Published in: IEEE Transactions on Reliability, Vol. 74, No. 1, pp. 2280-2289
Main authors: Kamal, Md Sarwar; Nimmy, Sonia Farhana; Dey, Nilanjan
Format: Journal Article
Language: English
Published: New York, IEEE, 1 March 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0018-9529, 1558-1721
Abstract: Code summarization is the process of generating readable natural-language descriptions from source code. It has become a popular research topic for software maintenance, code generation, and code recovery. Existing code summarization methods follow an encoding/decoding approach and use various machine learning techniques to generate natural language from source code. Although most of these methods are state of the art, the complex encoding and decoding process that maps tokens to natural-language words is difficult to understand, so these approaches are treated as opaque (black-box) models. This research proposes explainable AI methods that overcome the black-box nature of token mapping in the code summarization process. We created an abstract syntax tree (AST) from the tokens of the source code. We then embedded the AST into natural-language words using a bilingual statistical probability approach to generate candidate statistical parse trees. We applied a PageRank algorithm to rank the parse trees and, from the best-ranked tree, generated the comment for the corresponding code snippet. To explain our generation method, we used a Takagi-Sugeno fuzzy approach, layer-wise relevance propagation (LRP), and a hidden Markov model (HMM). These approaches make our method trustworthy and make the mapping of source code tokens to natural-language words understandable to humans.
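Illustrative sketch (not part of the record and not the authors' implementation): the abstract describes building an AST from source code and ranking candidate parse trees with PageRank. The Python example below shows one way those two steps might look in miniature. Everything specific in it is an assumption made for illustration: Python's built-in `ast` module stands in for the paper's AST construction, candidate summaries are plain word lists rather than statistical parse trees, the edge weights come from a hypothetical `jaccard` word-overlap similarity, and `rank_candidates` is an invented helper name.

```python
import ast


def ast_node_types(source: str) -> list[str]:
    """Parse Python source and return the type name of every AST node."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def jaccard(a: list[str], b: list[str]) -> float:
    """Word-set overlap; a stand-in similarity, NOT the paper's measure."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def pagerank(weights: list[list[float]], damping: float = 0.85,
             max_iter: int = 200, tol: float = 1e-10) -> list[float]:
    """Power-iteration PageRank; weights[i][j] is the weight of edge i -> j."""
    n = len(weights)
    rank = [1.0 / n] * n
    out_sums = [sum(row) for row in weights]
    for _ in range(max_iter):
        new_rank = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_sums[i] == 0.0:
                # Dangling node: spread its rank uniformly over all nodes.
                for j in range(n):
                    new_rank[j] += damping * rank[i] / n
            else:
                for j in range(n):
                    new_rank[j] += damping * rank[i] * weights[i][j] / out_sums[i]
        delta = sum(abs(x - y) for x, y in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


def rank_candidates(candidates: list[list[str]]) -> list[float]:
    """Score candidates by PageRank over their pairwise similarity graph."""
    n = len(candidates)
    weights = [[0.0 if i == j else jaccard(candidates[i], candidates[j])
                for j in range(n)] for i in range(n)]
    return pagerank(weights)


if __name__ == "__main__":
    print(ast_node_types("def add(a, b):\n    return a + b\n"))
    candidates = [
        ["add", "two", "numbers"],
        ["return", "sum", "of", "two", "numbers"],
        ["add", "numbers"],
        ["open", "file"],
    ]
    scores = rank_candidates(candidates)
    best = max(range(len(scores)), key=scores.__getitem__)
    print(" ".join(candidates[best]))
```

In this toy graph the candidate that shares the most words with the others collects the highest score, and the unrelated candidate the lowest. The explanation components named in the abstract (Takagi-Sugeno fuzzy inference, LRP, HMM) are not sketched here.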
Author details:
Kamal, Md Sarwar (ORCID: 0000-0002-1945-821X; email: mdsarwar.kamal@uts.edu.au), School of Computer Science, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, Australia
Nimmy, Sonia Farhana (ORCID: 0000-0003-1788-1329; email: s.nimmy@adfa.edu.au), Faculty of Economics and Business, University of New South Wales, Sydney, NSW, Australia
Dey, Nilanjan (ORCID: 0000-0001-8437-498X; email: nilanjan.dey@tint.edu.in), Department of Computer Science and Engineering, Techno International New Town, Chakpachuria, West Bengal, India
Copyright: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
DOI: 10.1109/TR.2024.3392876
Subject terms: Abstract syntax tree (AST); Algorithms; Black boxes; code summarization; Codes; Decoding; Encoding-Decoding; explainable a; Explainable artificial intelligence; hidden Markov model (HMM); Hidden Markov models; Language; LRP; Machine learning; Mapping; Markov chains; Natural language; Natural languages; page rank; Recurrent neural networks; Source code; Source coding; Statistical analysis; Syntactics; Takagi–Sugeno (T–S) fuzzy; Words (language)
URI https://ieeexplore.ieee.org/document/10530504
https://www.proquest.com/docview/3174162346