Optimization and Parallelization of Fuzzy Clustering Algorithm Based on the Improved Kmeans++ Clustering
In the field of big data, fuzzy clustering algorithm is one of the most widely used clustering algorithm. Although fuzzy c-means (FCM) algorithm performs well, it still is difficult to ascertain the number of clusters and is sensitive to initial clustering center. To solve these problems, an improve...
Uložené v:
| Vydané v: | IOP conference series. Materials Science and Engineering Ročník 768; číslo 7; s. 72106 - 72112 |
|---|---|
| Hlavní autori: | , |
| Médium: | Journal Article |
| Jazyk: | English |
| Vydavateľské údaje: |
Bristol
IOP Publishing
01.03.2020
|
| Predmet: | |
| ISSN: | 1757-8981, 1757-899X |
| On-line prístup: | Získať plný text |
| Tagy: |
Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
|
| Shrnutí: | In the field of big data, fuzzy clustering algorithm is one of the most widely used clustering algorithm. Although fuzzy c-means (FCM) algorithm performs well, it still is difficult to ascertain the number of clusters and is sensitive to initial clustering center. To solve these problems, an improved fuzzy clustering algorithm based on Spark is proposed in this article. The proposed algorithm integrates the L2 norm and uses the kmeans++ algorithm improved by the Canopy algorithm to initialize the cluster center. Experimental results show that the proposed algorithm performs well in clustering accuracy and computational performance. |
|---|---|
| Bibliografia: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 1757-8981 1757-899X |
| DOI: | 10.1088/1757-899X/768/7/072106 |