Text Document Categorization using Modified K-Means Clustering Algorithm

Detailed bibliography
Title: Text Document Categorization using Modified K-Means Clustering Algorithm
Source: International Journal of Recent Technology and Engineering. 8:508-511
Publisher information: Blue Eyes Intelligence Engineering and Sciences Publication - BEIESP, 2019.
Publication year: 2019
Description: The volume of information that must be managed is increasing at an exponential pace, which raises the challenge of managing such large data effectively. The performance of a system that manages this data can be measured on many parameters, such as the time taken to retrieve data and the similarity of documents placed in the same cluster. The paper presents an approach to automatic document categorization using a modified k-means algorithm. The proposed methodology was tested on three different data sets. The experimental findings suggest that it is accurate and robust in grouping documents into clusters. It uses a cosine similarity measure and a fuzzy k-means clustering approach to produce results quickly and accurately.
Document type: Article
Language: English
ISSN: 2277-3878
DOI: 10.35940/ijrte.b1095.0782s719
Accession number: edsair.doi...........c702e5f00712e31292d35ef7e90c3dbd
Database: OpenAIRE
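
The abstract names two ingredients, a cosine similarity measure and fuzzy k-means clustering, but this record does not describe the paper's specific modification. The following is therefore only a minimal sketch of the generic textbook combination, not the authors' algorithm. The function names, the term-frequency weighting, the seeding rule, the fuzzifier `m = 2.0`, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def unit_tf_vector(text, vocab):
    """Term-frequency vector over `vocab`, scaled to unit length."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def fuzzy_kmeans(vectors, k, m=2.0, iters=30):
    """Generic fuzzy k-means with cosine distance d = 1 - cosine similarity.

    Requires m > 1. Returns (memberships, centroids); memberships[i][j] is
    the degree to which document i belongs to cluster j, and each row sums to 1.
    """
    # Assumed deterministic seeding: start from the first document, then keep
    # adding the document least similar to the centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = min(
            vectors,
            key=lambda v: max(cosine_similarity(v, c) for c in centroids),
        )
        centroids.append(list(nxt))

    dim = len(vectors[0])
    memberships = []
    for _ in range(iters):
        # Membership update: weight each cluster by d ** (-1 / (m - 1)),
        # treating d as a squared-distance analogue, then normalise the row.
        memberships = []
        for v in vectors:
            dists = [max(1.0 - cosine_similarity(v, c), 1e-9) for c in centroids]
            inv = [d ** (-1.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            memberships.append([x / total for x in inv])

        # Centroid update: mean of all documents weighted by membership ** m.
        for j in range(k):
            weights = [row[j] ** m for row in memberships]
            wsum = sum(weights)
            centroids[j] = [
                sum(w * v[d] for w, v in zip(weights, vectors)) / wsum
                for d in range(dim)
            ]
    return memberships, centroids


# Hypothetical toy corpus: two pet-related and two finance-related documents.
docs = [
    "cat dog pet animal",
    "dog pet puppy animal",
    "stock market trading price",
    "market price stock shares",
]
vocab = sorted({w for d in docs for w in d.lower().split()})
vectors = [unit_tf_vector(d, vocab) for d in docs]
memberships, _ = fuzzy_kmeans(vectors, k=2)

# Hard labels are taken as the cluster with the highest membership degree.
labels = [max(range(2), key=lambda j: row[j]) for row in memberships]
print(labels)
```

Unlike plain k-means, each document here receives a degree of membership in every cluster, so a document that mixes topics can be represented as partly belonging to several categories. Cosine similarity ignores vector length, which keeps long and short documents on the same topic comparable.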