A comparative study of efficient initialization methods for the k-means clustering algorithm

Bibliographic details
Published in: Expert systems with applications, volume 40, issue 1, pp. 200-210
Main authors: Celebi, M. Emre; Kingravi, Hassan A.; Vela, Patricio A.
Format: Journal Article
Language: English
Published: Amsterdam, Elsevier Ltd, 01.01.2013
Elsevier
ISSN:0957-4174, 1873-6793
Description
Summary:
► K-means is the most widely used partitional clustering algorithm.
► k-means is highly sensitive to the selection of the initial centers.
► We present an overview of k-means initialization methods (IMs).
► We then compare eight commonly used linear time IMs.
► We demonstrate that popular IMs often perform poorly.

K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. In this paper, we first present an overview of these methods with an emphasis on their computational efficiency. We then compare eight commonly used linear time complexity initialization methods on a large and diverse collection of data sets using various performance criteria. Finally, we analyze the experimental results using non-parametric statistical tests and provide recommendations for practitioners. We demonstrate that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods.
DOI:10.1016/j.eswa.2012.07.021
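
The sensitivity to initial centers mentioned in the summary can be illustrated with a minimal sketch. It is not taken from the paper: the synthetic data, the function names (`lloyd`, `random_init`, `maximin_init`, `make_blobs`) and the choice of two initializers (uniform random sampling of data points and farthest-first "maximin" seeding) are assumptions made for illustration, and the summary does not say which eight methods the authors compare. The sketch runs batch k-means from several random starts and from one maximin start, then reports the final sum of squared errors (SSE); random starts may settle at a worse local optimum.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(points, centers, max_iter=100):
    """Batch k-means (Lloyd's iterations) from the given initial centers.

    Returns the final centers and the sum of squared errors (SSE).
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


def random_init(points, k, rng):
    """Pick k distinct data points uniformly at random as initial centers."""
    return rng.sample(points, k)


def maximin_init(points, k):
    """Farthest-first seeding: start from the first point, then repeatedly
    add the point that is farthest from its nearest already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def make_blobs(rng, per_cluster=30):
    """Hypothetical test data: three well-separated 2-D Gaussian blobs."""
    means = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [
        (rng.gauss(mx, 0.5), rng.gauss(my, 0.5))
        for mx, my in means
        for _ in range(per_cluster)
    ]


rng = random.Random(0)
points = make_blobs(rng)

# Same data and same algorithm; only the initial centers differ.
random_sses = [lloyd(points, random_init(points, 3, rng))[1] for _ in range(20)]
maximin_sse = lloyd(points, maximin_init(points, 3))[1]

print("random init  SSE: best %.2f, worst %.2f" % (min(random_sses), max(random_sses)))
print("maximin init SSE: %.2f" % maximin_sse)
```

Both initializers make a bounded number of passes over the data, which is the sense in which such methods are described as linear time. On real data sets neither is guaranteed to be the better choice, which is why an empirical comparison of the kind the summary describes is needed.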