Optimization and Parallelization of Fuzzy Clustering Algorithm Based on the Improved Kmeans++ Clustering

In the field of big data, fuzzy clustering algorithm is one of the most widely used clustering algorithm. Although fuzzy c-means (FCM) algorithm performs well, it still is difficult to ascertain the number of clusters and is sensitive to initial clustering center. To solve these problems, an improve...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IOP conference series. Materials Science and Engineering Jg. 768; H. 7; S. 72106 - 72112
Hauptverfasser: Ma, Yu, Cheng, Wenjuan
Format: Journal Article
Sprache:Englisch
Veröffentlicht: Bristol IOP Publishing 01.03.2020
Schlagworte:
ISSN:1757-8981, 1757-899X
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In the field of big data, fuzzy clustering algorithm is one of the most widely used clustering algorithm. Although fuzzy c-means (FCM) algorithm performs well, it still is difficult to ascertain the number of clusters and is sensitive to initial clustering center. To solve these problems, an improved fuzzy clustering algorithm based on Spark is proposed in this article. The proposed algorithm integrates the L2 norm and uses the kmeans++ algorithm improved by the Canopy algorithm to initialize the cluster center. Experimental results show that the proposed algorithm performs well in clustering accuracy and computational performance.
Bibliographie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1757-8981
1757-899X
DOI:10.1088/1757-899X/768/7/072106