Optimization and Parallelization of Fuzzy Clustering Algorithm Based on the Improved Kmeans++ Clustering

Detailed bibliography
Published in:	IOP Conference Series: Materials Science and Engineering, Volume 768; Issue 7; pp. 72106 - 72112
Main authors:	Ma, Yu; Cheng, Wenjuan
Format:	Journal Article
Language:	English
Publication details:	Bristol: IOP Publishing, 01.03.2020
Subjects:	Algorithms; Clustering; Optimization; Parallel processing
ISSN:	1757-8981, 1757-899X

Description
Summary:	In the field of big data, fuzzy clustering is one of the most widely used families of clustering algorithms. Although the fuzzy c-means (FCM) algorithm performs well, the number of clusters is difficult to determine in advance and the algorithm is sensitive to the initial cluster centers. To solve these problems, an improved fuzzy clustering algorithm based on Spark is proposed in this article. The proposed algorithm integrates the L2 norm and uses the k-means++ algorithm, improved by the Canopy algorithm, to initialize the cluster centers. Experimental results show that the proposed algorithm performs well in clustering accuracy and computational performance.
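For orientation, the two building blocks named in the summary can be sketched in a few lines of NumPy. This is only a minimal single-machine illustration of textbook fuzzy c-means seeded by plain k-means++; it is not the authors' algorithm. The Canopy-based improvement, the L2 norm term, and the Spark parallelization are all omitted, and the function names, the fuzzifier `m=2.0`, and the stopping tolerance are assumptions chosen for the example.

```python
import numpy as np


def kmeanspp_init(X, c, rng):
    """Pick c initial centers with plain k-means++ seeding (no Canopy step)."""
    n = X.shape[0]
    centers = [X[rng.integers(n)]]  # first center: uniformly at random
    for _ in range(1, c):
        diffs = X[:, None, :] - np.array(centers)[None, :, :]
        # Squared distance from each point to its nearest chosen center.
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        # Sample the next center with probability proportional to d2.
        centers.append(X[rng.choice(n, p=d2 / d2.sum())])
    return np.array(centers)


def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Textbook FCM; returns (centers V, membership matrix U)."""
    rng = np.random.default_rng(seed)
    V = kmeanspp_init(X, c, rng)
    for _ in range(max_iter):
        # Squared Euclidean distances, shape (n, c); epsilon avoids 0 ** negative.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)); rows sum to 1.
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of the points weighted by u_ik^m.
        W = U ** m
        V_new = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < tol:  # stop once the centers barely move
            break
    return V, U


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two synthetic, well-separated blobs (illustrative data only).
    X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
    V, U = fuzzy_c_means(X, c=2)
    print(V.round(2))
```

Each row of `U` holds one point's degrees of membership in the `c` clusters, which is what distinguishes FCM from hard k-means. Seeding from spread-out points rather than uniformly at random is the usual way to reduce the sensitivity to initial centers that the summary mentions.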
DOI:	10.1088/1757-899X/768/7/072106