Research on K-means Clustering Algorithm Based on MapReduce Distributed Programming Framework

Bibliographic details
Published in:	Procedia Computer Science, Volume 228, pp. 262-270
Main author:	Zhang, Ling
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 2023
Subjects:	Algorithm Process; Clustering Algorithm; Distributed Programming Framework; K-Means; MapReduce
ISSN:	1877-0509

Description
Summary:	As a classical clustering algorithm, K-means has a well-established research background. In the era of big data, it becomes even more useful, since it can quickly group similar data into the same cluster. Combining the K-means algorithm with the MapReduce distributed computing framework and running it on the Hadoop big data platform can significantly improve the clustering effect. Based on the structure of the MapReduce framework, this paper studies the K-means model, including the K-means principle, distance calculation, the content validity index, and the external validity index. On this basis, it proposes a K-means clustering flow built on the MapReduce big data programming framework and describes the execution of that flow in detail, providing a guide for implementing the algorithm.
DOI:	10.1016/j.procs.2023.11.030
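
The summary refers to a K-means clustering flow on MapReduce, but this record does not reproduce it. As a rough illustration only, and not the paper's actual procedure, the commonly used formulation can be sketched in plain Python: a map step assigns each point to its nearest centroid, and a reduce step averages the points in each group to obtain new centroids. The function names `map_phase`, `reduce_phase`, and `kmeans` are hypothetical, Euclidean distance is an assumption, and both phases run in a single process here rather than on Hadoop.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every input point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce: average the points that share a centroid index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centroids = list(centroids)  # an empty cluster keeps its old centroid
    for idx, members in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centroids


def kmeans(points, centroids, max_iter=20, tol=1e-9):
    """Repeat map + reduce until no centroid moves more than tol."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift <= tol:
            break
    return centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(data, [(0, 0), (10, 10)]))
```

In a real Hadoop job, each iteration would typically be a separate MapReduce pass, with the current centroids distributed to every mapper and the grouping by centroid index handled by the shuffle stage instead of the in-memory dictionary used above.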