Categorizing npm Packages by Analyzing the Text Information in Software Repositories

Bibliographic Details
Published in: Proceedings / Asia Pacific Software Engineering Conference, pp. 53-60
Main Authors: Wang, Yu; Liu, Huaxiao; Gao, Shanquan; Li, Shujia
Format: Conference Proceeding
Language: English
Published: IEEE 01.12.2021
Subjects:
ISSN: 2640-0715
Abstract To prevent JavaScript developers from reinventing the wheel, the npm ecosystem provides numerous third-party libraries that implement commonly needed functionality. npm displays the tags that creators provide for these packages to help developers find suitable ones. However, not all creators tag their packages, so for many packages npm has no tag information that would help developers understand what the package does. Considering that many tags are unrelated to the functionality of packages, we propose a method that identifies the tags that are important for distinguishing the functionality categories of packages and assigns them to untagged packages, assisting developers in retrieving packages. First, we analyze the attributes of existing tags in npm to establish category tags (functionality categories). Then, we mine the READMEs of tagged packages to generate keywords for each category tag. Finally, our method identifies category tags for untagged packages by measuring the similarity between their READMEs and the keywords of the category tags. The evaluation demonstrates that our approach performs well in assigning category tags to untagged packages.
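The abstract does not specify how keywords are weighted or which similarity measure is used, so the following is only an illustrative sketch of the last two steps, not the authors' implementation. It assumes a TF-IDF-style weighting across categories and cosine similarity, and it assigns a single best category even though the record's subject terms mention multi-label classification. The function names, the `top_k` and `threshold` parameters, and the sample READMEs are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def category_keywords(tagged_readmes, top_k=5):
    """Pick the top_k highest-weighted README terms for each category tag.

    tagged_readmes maps a category tag to the README texts of packages that
    already carry that tag (an assumed input format).
    """
    term_counts = {
        category: Counter(tok for readme in readmes for tok in tokenize(readme))
        for category, readmes in tagged_readmes.items()
    }
    n_categories = len(term_counts)
    # Number of categories in which each term occurs.
    category_freq = Counter(t for counts in term_counts.values() for t in counts)

    keywords = {}
    for category, counts in term_counts.items():
        weights = {
            term: count * math.log(n_categories / category_freq[term])
            for term, count in counts.items()
        }
        # Terms that occur in every category get weight 0 and are dropped.
        ranked = sorted(
            (item for item in weights.items() if item[1] > 0),
            key=lambda item: (-item[1], item[0]),
        )
        keywords[category] = dict(ranked[:top_k])
    return keywords


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_category(readme, keywords, threshold=0.0):
    """Return the most similar category tag, or None if nothing exceeds threshold."""
    counts = Counter(tokenize(readme))
    scores = {cat: cosine(kw, counts) for cat, kw in keywords.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None


# Made-up READMEs for two hypothetical category tags.
example_readmes = {
    "http": [
        "Send http requests and handle the response",
        "A tiny http client for requests",
    ],
    "testing": [
        "Write unit tests with assertions",
        "Run tests and report assertions and the coverage",
    ],
}
keywords = category_keywords(example_readmes)
```

Calling `assign_category(readme_text, keywords)` on an untagged package's README returns the closest category tag, or `None` when the README shares no keyword with any category. A multi-label variant could instead return every category whose score exceeds the threshold.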
Author Wang, Yu
Gao, Shanquan
Liu, Huaxiao
Li, Shujia
Author_xml – sequence: 1
  givenname: Yu
  surname: Wang
  fullname: Wang, Yu
  email: 1754449684@qq.com
  organization: College of Computer Science and Technology, Jilin University, Changchun, China
– sequence: 2
  givenname: Huaxiao
  surname: Liu
  fullname: Liu, Huaxiao
  email: liuhuaxiao@jlu.edu.cn
  organization: College of Computer Science and Technology, Jilin University, Changchun, China
– sequence: 3
  givenname: Shanquan
  surname: Gao
  fullname: Gao, Shanquan
  organization: College of Computer Science and Technology, Jilin University, Changchun, China
– sequence: 4
  givenname: Shujia
  surname: Li
  fullname: Li, Shujia
  organization: College of Computer Science and Technology, Jilin University, Changchun, China
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/APSEC53868.2021.00013
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9781665437844
1665437847
EISSN 2640-0715
EndPage 60
ExternalDocumentID 9711977
Genre orig-research
GroupedDBID 29O
6IE
6IF
6IK
6IL
6IN
AAJGR
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
IPLJI
M43
OCL
RIE
RIL
RNS
ID FETCH-LOGICAL-i133t-60dd5cef92336b2d7513b024a6d91bbc7c73f1643b01eb18cba9d9a0e8c485b13
IEDL.DBID RIE
ISICitedReferencesCount 1
IngestDate Wed Aug 27 02:24:18 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i133t-60dd5cef92336b2d7513b024a6d91bbc7c73f1643b01eb18cba9d9a0e8c485b13
PageCount 8
ParticipantIDs ieee_primary_9711977
PublicationCentury 2000
PublicationDate 2021-Dec.
PublicationDateYYYYMMDD 2021-12-01
PublicationDate_xml – month: 12
  year: 2021
  text: 2021-Dec.
PublicationDecade 2020
PublicationTitle Proceedings / Asia Pacific Software Engineering Conference
PublicationTitleAbbrev APSEC
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0020405
Score 2.1720774
SourceID ieee
SourceType Publisher
StartPage 53
SubjectTerms Codes
Correlation
Ecosystems
JavaScript
Libraries
Multi-label Classification
npm
Software
Tag
Tagging
Wheels
Title Categorizing npm Packages by Analyzing the Text Information in Software Repositories
URI https://ieeexplore.ieee.org/document/9711977
hasFullText 1
inHoldings 1
isFullTextHit
isPrint