GBK-means clustering algorithm: An improvement to the K-means algorithm based on the bargaining game

Due to its simplicity, versatility and the diversity of applications to which it can be applied, k-means is one of the well-known algorithms for clustering data. The foundation of this algorithm is based on the distance measure. However, the traditional k-means has some weaknesses that appear in som...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:Knowledge-based systems Ročník 213; s. 106672
Hlavní autoři: Jahangoshai Rezaee, Mustafa, Eshkevari, Milad, Saberi, Morteza, Hussain, Omar
Médium: Journal Article
Jazyk:angličtina
Vydáno: Amsterdam Elsevier B.V 15.02.2021
Elsevier Science Ltd
Témata:
ISSN:0950-7051, 1872-7409
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:Due to its simplicity, versatility and the diversity of applications to which it can be applied, k-means is one of the well-known algorithms for clustering data. The foundation of this algorithm is based on the distance measure. However, the traditional k-means has some weaknesses that appear in some data sets related to real applications, the most important of which is to consider only the distance criterion for clustering. Various studies have been conducted to address each of these weaknesses to achieve a balance between quality and efficiency. In this paper, a novel k-means variant of the original algorithm is proposed. This approach leverages the power of bargaining game modelling in the k-means algorithm for clustering data. In this novel setting, cluster centres compete with each other to attract the largest number of similar objectives or entities to their cluster. Thus, the centres keep changing their positions so that they have smaller distances with the maximum possible data than other cluster centres. We name this new algorithm the game-based k-means (GBK-means) algorithm. To show the superiority and efficiency of GBK-means over conventional clustering algorithms, namely, k-means and fuzzy k-means, we use the following syntactic and real-world data sets: (1) a series of two-dimensional syntactic data sets; and (2) ten benchmark data sets that are widely used in different clustering studies. The evaluation criteria show GBK-means is able to cluster data more accurately than classical algorithms based on eight evaluation metrics, namely F-measure, the Dunn index (DI), the rand index (RI), the Jaccard index (JI), normalized mutual information (NMI), normalized variation of information (NVI), the measure of concordance and error rate (ER).
Bibliografie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:0950-7051
1872-7409
DOI:10.1016/j.knosys.2020.106672