A new approach data processing: density-based spatial clustering of applications with noise (DBSCAN) clustering using game-theory

Detailed bibliography
Published in: Soft computing (Berlin, Germany), Volume 29, Issue 3, pp. 1331-1346
Main authors: Kazemi, Uranus; Soleimani, Seyfollah
Format: Journal Article
Language: English
Published: Heidelberg: Springer Nature B.V., 1 February 2025
ISSN:1432-7643, 1433-7479
Description
Summary: Due to the unpredictable growth of data in various fields, rapid clustering of big data is urgently needed to identify the hidden structure of data and discover the relationships between objects. Among clustering methods, density-based methods offer acceptable processing speed on high-dimensional big data. However, some of these methods use fixed parameters that are not optimal for every region of the data. In addition, their complexity depends strongly on the number of objects. This paper presents a clustering method based on game theory that aims to improve clustering performance and address parameter sensitivity. Using the concepts of Nash equilibrium and dense games, the method selects the optimal clustering parameter and distinguishes noise from cluster points. The method consists of (1) searching the grid with several spaces in which there is no cluster, (2) identifying the players through high-density data points in order to determine the parameters, (3) combining the clusters to form the game, and (4) merging nearby clusters. The performance of the proposed method was evaluated on four large synthetic datasets and eight real datasets, both labeled and unlabeled. The results indicate that the proposed method outperforms the SOM, K-means, DBSCAN, and SCGPSC methods in terms of accuracy, purity, and processing time.
DOI:10.1007/s00500-025-10405-5