Improvement of the Fast Clustering Algorithm Improved by K-Means in the Big Data
Clustering as a fundamental unsupervised learning is considered an important method of data analysis, and k-means is demonstrably the most popular clustering algorithm. In this paper, we consider clustering on feature space to solve the low efficiency caused in the Big Data clustering by k-means. Diff...
| Published in: | Applied mathematics and nonlinear sciences, vol. 5, no. 1, pp. 1-10 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Beirut, Sciendo, 01.01.2020, De Gruyter Brill Sp. z o.o., Paradigm Publishing Services |
| ISSN: | 2444-8656 |

| Summary: | Clustering, a fundamental unsupervised learning technique, is considered an important method of data analysis, and k-means is demonstrably the most popular clustering algorithm. In this paper, we consider clustering in feature space to address the low efficiency of k-means in Big Data clustering. Unlike traditional methods, the algorithm guarantees consistent clustering accuracy before and after dimensionality reduction, accelerates k-means when the cluster centers and distance functions satisfy certain conditions, fully matches the preprocessing step with the clustering step, and improves both efficiency and accuracy. Experimental results demonstrate the effectiveness of the proposed algorithm. |
|---|---|
| DOI: | 10.2478/amns.2020.1.00001 |