Classification of APIs by Hierarchical Clustering
APIs can be classified according to the programming domains (e.g., GUIs, databases, collections, or security) that they address. Such classification is vital in searching repositories (e.g., the Maven Central Repository for Java) and for understanding the technology stack used in software projects....
| Published in: | 2018 IEEE/ACM 26th International Conference on Program Comprehension (ICPC), pp. 233-243 |
|---|---|
| Main authors: | Hartel, Johannes; Aksu, Hakan; Lammel, Ralf |
| Format: | Conference proceedings |
| Language: | English |
| Published: | ACM, 1 May 2018 |
| ISSN: | 2643-7171 |
| Online access: | Full text |
| Abstract | APIs can be classified according to the programming domains (e.g., GUIs, databases, collections, or security) that they address. Such classification is vital in searching repositories (e.g., the Maven Central Repository for Java) and for understanding the technology stack used in software projects. We apply hierarchical clustering to a curated suite of Java APIs to compare the computed API clusters with preexisting API classifications. Clustering entails various parameters (e.g., the choice of IDF versus LSI versus LDA). We describe the corresponding variability in terms of a feature model. We exercise all possible configurations to determine the maximum correlation with respect to two baselines: i) a smaller suite of APIs manually classified in previous research; ii) a larger suite of APIs from the Maven Central Repository, thereby taking advantage of crowd-sourced classification while relying on a threshold-based approach for identifying important APIs and versions thereof, subject to an API dependency analysis on GitHub. We discuss the configurations found in this way and we examine the influence of particular features on the correlation between computed clusters and baselines. To this end, we also leverage interactive exploration of the parameter space and the resulting dendrograms. In this manner, we can also identify issues with the use of classifiers (e.g., missing classifiers) in the baselines and limitations of the clustering approach. |
|---|---|
| Author | Hartel, Johannes; Lammel, Ralf; Aksu, Hakan |
| Author_xml | 1. Johannes Hartel (University of Koblenz-Landau); 2. Hakan Aksu (University of Koblenz-Landau); 3. Ralf Lammel (University of Koblenz-Landau) |
| ContentType | Conference Proceeding |
| DOI | 10.1145/3196321.3196344 |
| Discipline | Computer Science |
| EISBN | 1450357148 9781450357142 |
| EISSN | 2643-7171 |
| EndPage | 243 |
| ExternalDocumentID | 8973031 |
| Genre | orig-research |
| ISICitedReferencesCount | 7 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 11 |
| PublicationDate | 2018-05-01 |
| PublicationTitle | 2018 IEEE/ACM 26th International Conference on Program Comprehension (ICPC) |
| PublicationTitleAbbrev | ICPC |
| PublicationYear | 2018 |
| Publisher | ACM |
| StartPage | 233 |
| SubjectTerms | APIs; Clustering exploration; Computational modeling; Correlation; GitHub; Hierarchical clustering; Feature modeling; Java; Large scale integration; Maven Central Repository; Programming; Security; Software; Software development management |
| Title | Classification of APIs by Hierarchical Clustering |
| URI | https://ieeexplore.ieee.org/document/8973031 |
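
The abstract describes hierarchical clustering of APIs with IDF as one of several weighting options. The following is a minimal sketch of that general idea, not the authors' implementation: the API names and term lists in `API_TERMS` are hypothetical toy data, and the choices of cosine distance and average linkage are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy data (not from the paper): API name -> extracted terms.
API_TERMS = {
    "swing": ["button", "window", "panel", "event", "paint"],
    "awt": ["window", "paint", "event", "graphics", "button"],
    "jdbc": ["connection", "query", "statement", "result", "driver"],
    "hibernate": ["session", "query", "entity", "connection", "statement"],
}


def tfidf_vectors(docs):
    """Return a sparse TF-IDF vector (dict term -> weight) per document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for terms in docs.values() for term in set(terms))
    return {
        name: {t: c * math.log(n / df[t]) for t, c in Counter(terms).items()}
        for name, terms in docs.items()
    }


def cosine_distance(u, v):
    """1 - cosine similarity of two sparse vectors; 1.0 if either is zero."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def agglomerate(vectors):
    """Average-linkage agglomerative clustering (assumed linkage).

    Returns the merges in order as (cluster_a, cluster_b, distance);
    read bottom-up, this list encodes a dendrogram.
    """
    def dist(a, b):
        total = sum(cosine_distance(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    clusters = [frozenset([name]) for name in sorted(vectors)]
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


merges = agglomerate(tfidf_vectors(API_TERMS))
for a, b, d in merges:
    print(sorted(a), "+", sorted(b), f"at distance {d:.3f}")
```

On this toy input the two GUI-like APIs merge first and the two database-like APIs merge second, which is the kind of computed grouping one would then compare against a preexisting classification. Swapping IDF for another representation (the abstract mentions LSI and LDA) would change only how the vectors are built.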