Research paper classification systems based on TF-IDF and LDA schemes
With the steady advance of computer and information technologies, numerous research papers have been published both online and offline, and as new research fields are continually created, users have considerable trouble finding and categorizing the research papers that interest them. In order...
| Published in: | Human-centric computing and information sciences, Vol. 9, no. 1, pp. 1-21 |
|---|---|
| Main authors: | Kim, Sang-Woon; Gil, Joon-Min |
| Format: | Journal Article |
| Language: | English |
| Published: | Berlin/Heidelberg: Springer Berlin Heidelberg; Korea Information Processing Society, Computer Software Research Group, 26 August 2019 |
| Subjects: | |
| ISSN: | 2192-1962 |
| Online access: | Get full text |
| Abstract | With the steady advance of computer and information technologies, numerous research papers have been published both online and offline, and as new research fields are continually created, users have considerable trouble finding and categorizing the research papers that interest them. To address this difficulty, this paper proposes a research paper classification system that clusters research papers into meaningful classes in which papers are very likely to share similar subjects. The proposed system extracts representative keywords from the abstract of each paper and derives topics with the Latent Dirichlet Allocation (LDA) scheme. The K-means clustering algorithm is then applied to group the papers into sets with similar subjects, based on the term frequency-inverse document frequency (TF-IDF) values of each paper. |
|---|---|
| ArticleNumber | 30 |
| Author | Kim, Sang-Woon; Gil, Joon-Min |
| Author affiliations | 1. Kim, Sang-Woon: Department of Police Administration, Daegu Catholic University. 2. Gil, Joon-Min: School of Information Technology Eng., Daegu Catholic University (email: jmgil@cu.ac.kr; ORCID: 0000-0001-6774-8476) |
| ContentType | Journal Article |
| Copyright | The Author(s) 2019 Human-centric Computing and Information Sciences is a copyright of Springer, (2019). All Rights Reserved. © 2019. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.1186/s13673-019-0192-7 |
| Discipline | Computer Science |
| EISSN | 2192-1962 |
| EndPage | 21 |
| ISICitedReferencesCount | 152 |
| ISSN | 2192-1962 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | LDA; K-means clustering; TF-IDF; Paper classification |
| Language | English |
| ORCID | 0000-0001-6774-8476 |
| OpenAccessLink | https://www.proquest.com/docview/2279647463?pq-origsite=%requestingapplication% |
| PageCount | 21 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-08-26 |
| PublicationDecade | 2010 |
| PublicationPlace | Berlin/Heidelberg |
| PublicationTitle | Human-centric computing and information sciences |
| PublicationTitleAbbrev | Hum. Cent. Comput. Inf. Sci |
| PublicationYear | 2019 |
| Publisher | Springer Berlin Heidelberg; Korea Information Processing Society, Computer Software Research Group |
| StartPage | 1 |
| SubjectTerms | Algorithms; Artificial Intelligence; Big data; Classification; Cloud Computing for Human-centric Computing; Cluster analysis; Clustering; Communications Engineering; Computer Science; Computer Systems Organization and Communication Networks; Dirichlet problem; Information Systems and Communication Service; Information Systems Applications (incl.Internet); IoT; Networks; Scientific papers; User Interfaces and Human Computer Interaction; Vector quantization |
| Title | Research paper classification systems based on TF-IDF and LDA schemes |
| URI | https://link.springer.com/article/10.1186/s13673-019-0192-7 https://www.proquest.com/docview/2279647463 |
| Volume | 9 |