A text classification approach to API type resolution for incomplete code snippets


Bibliographic details
Published in: Science of Computer Programming, vol. 227, article 102941
Main authors: Velázquez-Rodríguez, Camilo; Di Nucci, Dario; De Roover, Coen
Format: Journal Article
Language: English
Published: Elsevier B.V., 1 April 2023
ISSN: 0167-6423, 1872-7964
Online access: Full text
Abstract The Stack Overflow Q&A platform boasts an active community of users who often include code snippets in their questions and answers. Several development tools rely on these code snippets as a source of information. Although code snippets are intended as examples for humans, they may not form compilation units. For instance, snippets illustrating how to use an API might lack the import statements for the corresponding API types. Thus, it becomes essential to determine the fully-qualified name of API types in incomplete snippets. We present RESICO, a machine learning-based text classification approach to resolving the simple name of API types to their fully-qualified names. RESICO is trained on a corpus of Java programs for which a compiler can determine the fully-qualified names. For four machine learning classifiers, we evaluate the type resolution accuracy of the resulting models on the original and an extended version of datasets of snippets previously used to evaluate the current state-of-the-art approach based on information retrieval. Results show that our approach outperforms the state-of-the-art one, although the training phase is slightly slower. We observe that most of the incorrect type resolutions are not due to ambiguities among the simple names for API types but due to similarities among the contexts in which these types are used, representing a future research challenge.
• Stack Overflow code snippets might lack information about referenced API types.
• RESICO is an approach to resolve simple API names to their fully qualified versions.
• RESICO encodes API references and their contexts to train a classification algorithm.
• Our approach outperforms the state-of-the-art COSTER in an in-depth evaluation.
• Mispredictions by the models are mainly due to similar contexts around API usages.
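The idea summarised in the abstract, treating type resolution as text classification over an API reference and its surrounding context, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the four classifiers or the encoding RESICO uses, so the multinomial Naive Bayes model, the `featurize` helper, the `NaiveBayesResolver` class, and the training rows below are all hypothetical stand-ins chosen so that the example runs with the Python standard library alone.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training rows: (simple name, context tokens, fully-qualified name).
# In the setting the abstract describes, such labels would come from compilable
# Java programs; these rows are invented purely for illustration.
TRAINING = [
    ("List", ["add", "size", "ArrayList", "iterator"], "java.util.List"),
    ("List", ["get", "ArrayList", "isEmpty"], "java.util.List"),
    ("List", ["setVisible", "Frame", "addItem", "select"], "java.awt.List"),
    ("List", ["Frame", "getSelectedItem", "addItem"], "java.awt.List"),
    ("Date", ["getTime", "SimpleDateFormat", "Calendar"], "java.util.Date"),
    ("Date", ["PreparedStatement", "setDate", "valueOf"], "java.sql.Date"),
]


def featurize(simple_name, context):
    """Encode an API reference as a bag of tokens: its simple name plus its context."""
    return ["name=" + simple_name] + ["ctx=" + tok for tok in context]


class NaiveBayesResolver:
    """Multinomial Naive Bayes with add-one smoothing (an assumed stand-in classifier)."""

    def fit(self, rows):
        self.class_counts = Counter()              # training rows per fully-qualified name
        self.token_counts = defaultdict(Counter)   # token frequencies per fully-qualified name
        self.vocab = set()
        for simple_name, context, fqn in rows:
            self.class_counts[fqn] += 1
            for tok in featurize(simple_name, context):
                self.token_counts[fqn][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, simple_name, context):
        total = sum(self.class_counts.values())
        best_fqn, best_score = None, -math.inf
        for fqn, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of every feature token.
            score = math.log(count / total)
            denom = sum(self.token_counts[fqn].values()) + len(self.vocab)
            for tok in featurize(simple_name, context):
                score += math.log((self.token_counts[fqn][tok] + 1) / denom)
            if score > best_score:
                best_fqn, best_score = fqn, score
        return best_fqn


resolver = NaiveBayesResolver().fit(TRAINING)
print(resolver.predict("List", ["ArrayList", "add"]))
print(resolver.predict("Date", ["setDate", "PreparedStatement"]))
```

In this toy setup the simple name alone cannot separate `java.util.List` from `java.awt.List`; only the context tokens can. That also hints at the failure mode the abstract reports: two types used in very similar contexts would receive nearly identical scores.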
ArticleNumber 102941
Author Di Nucci, Dario
Velázquez-Rodríguez, Camilo
De Roover, Coen
Author_xml – sequence: 1
  givenname: Camilo
  orcidid: 0000-0002-8360-1519
  surname: Velázquez-Rodríguez
  fullname: Velázquez-Rodríguez, Camilo
  email: camilo.ernesto.velazquez.rodriguez@vub.be
  organization: Vrije Universiteit Brussel, Brussels, Belgium
– sequence: 2
  givenname: Dario
  surname: Di Nucci
  fullname: Di Nucci, Dario
  email: ddinucci@unisa.it
  organization: University of Salerno, Fisciano (SA), Italy
– sequence: 3
  givenname: Coen
  surname: De Roover
  fullname: De Roover, Coen
  email: coen.de.roover@vub.be
  organization: Vrije Universiteit Brussel, Brussels, Belgium
ContentType Journal Article
Copyright 2023 Elsevier B.V.
DOI 10.1016/j.scico.2023.102941
Discipline Computer Science
EISSN 1872-7964
ExternalDocumentID 10_1016_j_scico_2023_102941
S0167642323000230
ISICitedReferencesCount 3
ISSN 0167-6423
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords Fully qualified name resolution
Stack Overflow
Machine learning
Text classification
Language English
ORCID 0000-0002-8360-1519
OpenAccessLink https://www.sciencedirect.com/science/article/pii/S0167642323000230
PublicationCentury 2000
PublicationDate April 2023
2023-04-00
PublicationDateYYYYMMDD 2023-04-01
PublicationDate_xml – month: 04
  year: 2023
  text: April 2023
PublicationDecade 2020
PublicationTitle Science of computer programming
PublicationYear 2023
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
  article-title: Live API documentation
– volume: 4
  start-page: 5
  issue: 1
  year: 2001
  ident: 10.1016/j.scico.2023.102941_br0270
  article-title: Text categorization based on regularized linear classification methods
  publication-title: Inf. Retr.
  doi: 10.1023/A:1011441423217
– year: 1991
  ident: 10.1016/j.scico.2023.102941_br0200
  article-title: Nearest neighbor (NN) norms: NN pattern classification techniques
– start-page: 121
  year: 2020
  ident: 10.1016/j.scico.2023.102941_br0650
  article-title: Precise semantic program embeddings
– start-page: 1
  year: 2018
  ident: 10.1016/j.scico.2023.102941_br0330
  article-title: 50K-C: a dataset of compilable, and compiled, Java projects
– ident: 10.1016/j.scico.2023.102941_br0160
– start-page: 391
  year: 2016
  ident: 10.1016/j.scico.2023.102941_br0540
  article-title: From query to usable code: an analysis of stack overflow code snippets
– volume: 5
  start-page: 1
  issue: 1
  year: 2020
  ident: 10.1016/j.scico.2023.102941_br0230
  article-title: A comparative analysis of logistic regression, random forest and KNN models for the text classification
  publication-title: Augment. Hum. Res.
  doi: 10.1007/s41133-020-00032-0
– start-page: 47
  year: 2012
  ident: 10.1016/j.scico.2023.102941_br0450
  article-title: Recovering traceability links between an API and its learning resources
– start-page: 756
  year: 2016
  ident: 10.1016/j.scico.2023.102941_br0140
  article-title: Mapping API elements for code migration with vector representations
– start-page: 13
  year: 2001
  ident: 10.1016/j.scico.2023.102941_br0320
  article-title: Generating robust parsers using island grammars
– start-page: 83
  year: 2014
  ident: 10.1016/j.scico.2023.102941_br0510
  article-title: How do API changes trigger stack overflow discussions? A study on the Android SDK
– start-page: 118
  year: 2016
  ident: 10.1016/j.scico.2023.102941_br0550
  article-title: Automated synthesis of compilable code snippets from Q&A sites
– year: 2013
  ident: 10.1016/j.scico.2023.102941_br0120
  article-title: Distributed representations of words and phrases and their compositionality
– start-page: 392
  year: 2016
  ident: 10.1016/j.scico.2023.102941_br0010
  article-title: Augmenting API documentation with insights from stack overflow
– volume: 45
  start-page: 5
  issue: 1
  year: 2001
  ident: 10.1016/j.scico.2023.102941_br0250
  article-title: Random forests
  publication-title: Mach. Learn.
  doi: 10.1023/A:1010933404324
– volume: 45
  start-page: 464
  issue: 5
  year: 2019
  ident: 10.1016/j.scico.2023.102941_br0480
  article-title: Automatic identification and classification of software development video tutorial fragments
  publication-title: IEEE Trans. Softw. Eng.
  doi: 10.1109/TSE.2017.2779479
– volume: 127
  year: 2020
  ident: 10.1016/j.scico.2023.102941_br0500
  article-title: PostFinder: mining stack overflow posts to support software developers
  publication-title: Inf. Softw. Technol.
  doi: 10.1016/j.infsof.2020.106367
– start-page: 612
  year: 2018
  ident: 10.1016/j.scico.2023.102941_br0420
  article-title: Detecting code smells using machine learning techniques: are we there yet?
– volume: 28
  start-page: 11
  issue: 1
  year: 1972
  ident: 10.1016/j.scico.2023.102941_br0080
  article-title: A statistical interpretation of term specificity and its application in retrieval
  publication-title: J. Doc.
  doi: 10.1108/eb026526
– volume: 50
  start-page: 111
  issue: 1
  year: 2015
  ident: 10.1016/j.scico.2023.102941_br0570
  article-title: Predicting program properties from “big code”
  publication-title: ACM SIGPLAN Not.
  doi: 10.1145/2775051.2677009
– start-page: 152
  year: 2018
  ident: 10.1016/j.scico.2023.102941_br0590
  article-title: Deep learning type inference
– start-page: 225
  year: 2014
  ident: 10.1016/j.scico.2023.102941_br0390
  article-title: Optimal thresholding of classifiers to maximize F1 measure
– ident: 10.1016/j.scico.2023.102941_br0520
– start-page: 404
  year: 2016
  ident: 10.1016/j.scico.2023.102941_br0440
  article-title: From word embeddings to document similarities for improved information retrieval in software engineering
– start-page: 102
  year: 2014
  ident: 10.1016/j.scico.2023.102941_br0460
  article-title: Mining StackOverflow to turn the IDE into a self-confident programming prompter
– volume: 7
  start-page: 61
  issue: 1
  year: 2014
  ident: 10.1016/j.scico.2023.102941_br0220
  article-title: KNN based machine learning approach for text and document mining
  publication-title: Int. J. Database Theory Appl.
  doi: 10.14257/ijdta.2014.7.1.06
– volume: 23
  issue: 5
  year: 2018
  ident: 10.1016/j.scico.2023.102941_br0470
  article-title: Augmenting and structuring user queries to support efficient free-form code search
  publication-title: Empir. Softw. Eng. (EMSE)
– ident: 10.1016/j.scico.2023.102941_br0130
– volume: 59
  issue: 2
  year: 2022
  ident: 10.1016/j.scico.2023.102941_br0260
  article-title: A comparative study of automated legal text classification using random forests and deep learning
  publication-title: Inf. Process. Manag.
  doi: 10.1016/j.ipm.2021.102798
– year: 2002
  ident: 10.1016/j.scico.2023.102941_br0300
– start-page: 1075
  year: 2019
  ident: 10.1016/j.scico.2023.102941_br0490
  article-title: BIKER: a tool for bi-information source based API method recommendation
– volume: 38
  start-page: 1276
  issue: 6
  year: 2011
  ident: 10.1016/j.scico.2023.102941_br0430
  article-title: A systematic literature review on fault prediction performance in software engineering
  publication-title: IEEE Trans. Softw. Eng.
  doi: 10.1109/TSE.2011.103
StartPage 102941
SubjectTerms Fully qualified name resolution
Machine learning
Stack Overflow
Text classification
Title A text classification approach to API type resolution for incomplete code snippets
URI https://dx.doi.org/10.1016/j.scico.2023.102941
Volume 227