A text classification approach to API type resolution for incomplete code snippets
The Stack Overflow Q&A platform boasts an active community of users who often include code snippets in their questions and answers. Several development tools rely on these code snippets as a source of information. Although code snippets are intended as examples for humans, they may not form comp...
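As a rough illustration of the idea named in the title, the sketch below treats type resolution as text classification: the simple name of an API type and the tokens around it are the "text", and the fully-qualified name is the label. This is not RESICO's implementation. The training samples, the feature encoding (`featurize`), and the choice of a small multinomial Naive Bayes classifier are assumptions made for this example, since this record does not specify RESICO's encoding or its classifiers.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (simple name, context tokens, fully-qualified name).
# In a real setting these would come from compilable code whose types a
# compiler has already resolved.
TRAINING = [
    ("List", "new ArrayList add get size", "java.util.List"),
    ("List", "stream collect toList forEach", "java.util.List"),
    ("List", "setLayout Frame addItem getSelectedItem", "java.awt.List"),
    ("Date", "SimpleDateFormat format getTime Calendar", "java.util.Date"),
    ("Date", "ResultSet getDate PreparedStatement setDate", "java.sql.Date"),
]


def featurize(simple_name, context):
    """Encode an API reference and its context as a bag of tokens."""
    return ["name=" + simple_name] + context.split()


def train(samples):
    """Count label and token frequencies for a multinomial Naive Bayes model."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for simple_name, context, fqn in samples:
        tokens = featurize(simple_name, context)
        label_counts[fqn] += 1
        token_counts[fqn].update(tokens)
        vocabulary.update(tokens)
    return label_counts, token_counts, vocabulary


def resolve(model, simple_name, context):
    """Return the fully-qualified name with the highest log-probability."""
    label_counts, token_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for fqn, count in label_counts.items():
        score = math.log(count / total)  # log prior of the label
        # Laplace (add-one) smoothing over the training vocabulary.
        denom = sum(token_counts[fqn].values()) + len(vocabulary)
        for token in featurize(simple_name, context):
            score += math.log((token_counts[fqn][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = fqn, score
    return best_label


MODEL = train(TRAINING)
```

On this toy data, `resolve(MODEL, "Date", "PreparedStatement setDate ResultSet")` selects `java.sql.Date`, because the context tokens co-occur with that label in training. Two types that are used in near-identical contexts would be hard to separate with such features, whether or not their simple names collide.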
| Published in: | Science of computer programming, vol. 227, p. 102941 |
|---|---|
| Main authors: | Camilo Velázquez-Rodríguez, Dario Di Nucci, Coen De Roover |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 1 April 2023 |
| Keywords: | Fully qualified name resolution; Stack overflow; Machine learning; Text classification |
| ISSN: | 0167-6423, 1872-7964 |
| Online access: | Full text |
| Abstract | The Stack Overflow Q&A platform boasts an active community of users who often include code snippets in their questions and answers. Several development tools rely on these code snippets as a source of information. Although code snippets are intended as examples for humans, they may not form compilation units. For instance, snippets illustrating how to use an API might lack the import statements for the corresponding API types. Thus, it becomes essential to determine the fully-qualified name of API types in incomplete snippets.
We present RESICO, a machine learning-based text classification approach to resolving the simple name of API types to their fully-qualified names. RESICO is trained on a corpus of Java programs for which a compiler can determine the fully-qualified names. For four machine learning classifiers, we evaluate the type resolution accuracy of the resulting models on the original and an extended version of datasets of snippets previously used to evaluate the current state-of-the-art approach based on information retrieval. Results show that our approach outperforms the state-of-the-art one, although the training phase is slightly slower. We observe that most of the incorrect type resolutions are not due to ambiguities among the simple names for API types but due to similarities among the contexts in which these types are used, representing a future research challenge.
• Stack Overflow code snippets might lack information about referenced API types. • RESICO is an approach to resolve simple API names to their fully qualified versions. • RESICO encodes API references and their contexts to train a classification algorithm. • Our approach outperforms the state-of-the-art COSTER in an in-depth evaluation. • Mispredictions by the models are mainly due to similar contexts around API usages. |
|---|---|
| ArticleNumber | 102941 |
| Author | Di Nucci, Dario; Velázquez-Rodríguez, Camilo; De Roover, Coen |
| Author_xml | – sequence: 1 givenname: Camilo orcidid: 0000-0002-8360-1519 surname: Velázquez-Rodríguez fullname: Velázquez-Rodríguez, Camilo email: camilo.ernesto.velazquez.rodriguez@vub.be organization: Vrije Universiteit Brussel, Brussels, Belgium – sequence: 2 givenname: Dario surname: Di Nucci fullname: Di Nucci, Dario email: ddinucci@unisa.it organization: University of Salerno, Fisciano (SA), Italy – sequence: 3 givenname: Coen surname: De Roover fullname: De Roover, Coen email: coen.de.roover@vub.be organization: Vrije Universiteit Brussel, Brussels, Belgium |
| CitedBy_id | crossref_primary_10_1145_3715724 crossref_primary_10_1007_s11042_023_16812_w |
| Cites_doi | 10.1111/j.2517-6161.1974.tb00994.x 10.4304/jcp.7.12.2913-2920 10.1162/tacl_a_00051 10.1023/A:1022627411411 10.1080/00437956.1954.11659520 10.1145/3290353 10.1073/pnas.92.22.9977 10.1023/A:1011441423217 10.1007/s41133-020-00032-0 10.1023/A:1010933404324 10.1109/TSE.2017.2779479 10.1016/j.infsof.2020.106367 10.1108/eb026526 10.1145/2775051.2677009 10.14257/ijdta.2014.7.1.06 10.1016/j.ipm.2021.102798 10.1109/TSE.2011.103 |
| ContentType | Journal Article |
| Copyright | 2023 Elsevier B.V. |
| Copyright_xml | – notice: 2023 Elsevier B.V. |
| DBID | AAYXX CITATION |
| DOI | 10.1016/j.scico.2023.102941 |
| DatabaseName | CrossRef |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 1872-7964 |
| ExternalDocumentID | 10_1016_j_scico_2023_102941 S0167642323000230 |
| ID | FETCH-LOGICAL-c348t-eb37b7b44bb28f4bc66ecc5b4e838dfb15b1dbd8c0d8ccf80a7f9455f0c324ae3 |
| ISICitedReferencesCount | 3 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000958767800001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0167-6423 |
| IngestDate | Sat Nov 29 07:22:13 EST 2025 Tue Nov 18 21:51:37 EST 2025 Fri Feb 23 02:38:22 EST 2024 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Fully qualified name resolution; Stack overflow; Machine learning; Text classification |
| Language | English |
| LinkModel | OpenURL |
| MergedId | FETCHMERGED-LOGICAL-c348t-eb37b7b44bb28f4bc66ecc5b4e838dfb15b1dbd8c0d8ccf80a7f9455f0c324ae3 |
| ORCID | 0000-0002-8360-1519 |
| OpenAccessLink | https://www.sciencedirect.com/science/article/pii/S0167642323000230 |
| ParticipantIDs | crossref_citationtrail_10_1016_j_scico_2023_102941 crossref_primary_10_1016_j_scico_2023_102941 elsevier_sciencedirect_doi_10_1016_j_scico_2023_102941 |
| PublicationCentury | 2000 |
| PublicationDate | April 2023 2023-04-00 |
| PublicationDateYYYYMMDD | 2023-04-01 |
| PublicationDate_xml | – month: 04 year: 2023 text: April 2023 |
| PublicationDecade | 2020 |
| PublicationTitle | Science of computer programming |
| PublicationYear | 2023 |
| Publisher | Elsevier B.V |
| Publisher_xml | – name: Elsevier B.V |
doi: 10.1145/2775051.2677009 – start-page: 152 year: 2018 ident: 10.1016/j.scico.2023.102941_br0590 article-title: Deep learning type inference – start-page: 225 year: 2014 ident: 10.1016/j.scico.2023.102941_br0390 article-title: Optimal thresholding of classifiers to maximize F1 measure – ident: 10.1016/j.scico.2023.102941_br0520 – start-page: 404 year: 2016 ident: 10.1016/j.scico.2023.102941_br0440 article-title: From word embeddings to document similarities for improved information retrieval in software engineering – start-page: 102 year: 2014 ident: 10.1016/j.scico.2023.102941_br0460 article-title: Mining StackOverflow to turn the IDE into a self-confident programming prompter – volume: 7 start-page: 61 issue: 1 year: 2014 ident: 10.1016/j.scico.2023.102941_br0220 article-title: KNN based machine learning approach for text and document mining publication-title: Int. J. Database Theory Appl. doi: 10.14257/ijdta.2014.7.1.06 – volume: 23 issue: 5 year: 2018 ident: 10.1016/j.scico.2023.102941_br0470 article-title: Augmenting and structuring user queries to support efficient free-form code search publication-title: Empir. Softw. Eng. (EMSE) – ident: 10.1016/j.scico.2023.102941_br0130 – volume: 59 issue: 2 year: 2022 ident: 10.1016/j.scico.2023.102941_br0260 article-title: A comparative study of automated legal text classification using random forests and deep learning publication-title: Inf. Process. Manag. doi: 10.1016/j.ipm.2021.102798 – year: 2002 ident: 10.1016/j.scico.2023.102941_br0300 – start-page: 1075 year: 2019 ident: 10.1016/j.scico.2023.102941_br0490 article-title: BIKER: a tool for bi-information source based API method recommendation – volume: 38 start-page: 1276 issue: 6 year: 2011 ident: 10.1016/j.scico.2023.102941_br0430 article-title: A systematic literature review on fault prediction performance in software engineering publication-title: IEEE Trans. Softw. Eng. doi: 10.1109/TSE.2011.103 |
| StartPage | 102941 |
| SubjectTerms | Fully qualified name resolution; Machine learning; Stack Overflow; Text classification |
| Title | A text classification approach to API type resolution for incomplete code snippets |
| URI | https://dx.doi.org/10.1016/j.scico.2023.102941 |
| Volume | 227 |