A clustering algorithm based on grids for core data and adjacency relationships for edge data
| Published in: | Scientific Reports, vol. 15, no. 1, p. 18390 - 36 |
|---|---|
| Main author: | |
| Format: | Journal Article |
| Language: | English |
| Publication details: | London; Nature Publishing Group UK; 26.05.2025; Nature Publishing Group; Nature Portfolio |
| Subjects: | |
| ISSN: | 2045-2322 |
| Summary: | Grid-based clustering algorithms have become a crucial method in data mining because of their efficiency. However, they suffer from parameter sensitivity, poor adaptability to density variations, and misclassification of edge data. Existing research addresses these issues in three main directions: (1) adaptive selection of grid parameters, which struggles to handle variations in cluster density; (2) improved grid division methods (e.g., multi-granularity or dynamic grids), which have limited effectiveness on complex-shaped data; and (3) integration of other clustering techniques, which improves accuracy but increases algorithmic complexity. This paper proposes an improved grid-based clustering algorithm that determines core grids from the uniformity of the data distribution rather than from absolute density, and introduces a clustering strategy for non-core grids based on adjacency relationships. This approach identifies clusters of different densities and reduces dependence on the initial parameters (density threshold R and grid partition parameters M). The proposed algorithm integrates grid clustering, partitioning-based clustering, and grid splitting techniques. It employs a regional processing strategy (grid clustering for cluster core regions, combined grid and partitioning techniques for edge regions) to improve accuracy while maintaining efficiency. Experimental results show that the proposed algorithm outperforms six benchmark algorithms on datasets with complex shapes and uneven densities, balancing efficiency and accuracy. |
| DOI: | 10.1038/s41598-025-00532-2 |
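
The summary describes the approach only at a high level, so the following Python sketch is an illustration of the general idea, not the paper's algorithm. Everything specific in it is an assumption: the function name `grid_cluster`, the reading of `m` as the number of cells per axis and `r` as a uniformity threshold, the 2x2 quadrant test used as a stand-in for "distribution uniformity", 8-neighbour adjacency between cells, and the nearest-centroid rule used for points in non-core cells. The grid splitting step mentioned in the summary is not modelled.

```python
from collections import defaultdict, deque


def neighbours(cell):
    """Yield the 8 cells that share an edge or a corner with `cell`."""
    x, y = cell
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx or dy:
                yield (x + dx, y + dy)


def grid_cluster(points, m, r):
    """Label 2-D points with cluster ids; -1 marks points left unassigned."""
    lo = [min(p[d] for p in points) for d in range(2)]
    hi = [max(p[d] for p in points) for d in range(2)]
    # Cell width per axis; fall back to 1.0 if all points share a coordinate.
    width = [((hi[d] - lo[d]) / m) or 1.0 for d in range(2)]

    def position(p):
        # Fractional grid coordinates of point p.
        return [(p[d] - lo[d]) / width[d] for d in range(2)]

    cells = defaultdict(list)
    for i, p in enumerate(points):
        # Clamp so that points on the upper boundary land in the last cell.
        cells[tuple(min(int(g), m - 1) for g in position(p))].append(i)

    def is_core(cell, members):
        # Assumed uniformity test: count points per quadrant of the cell and
        # compare the emptiest quadrant with the fullest one.
        counts = [0, 0, 0, 0]
        for i in members:
            gx, gy = position(points[i])
            counts[2 * (gx - cell[0] >= 0.5) + (gy - cell[1] >= 0.5)] += 1
        return min(counts) / max(counts) >= r

    core = {c for c, members in cells.items() if is_core(c, members)}

    # Core regions: flood-fill adjacent core cells into one cluster each.
    cell_label = {}
    next_label = 0
    for start in sorted(core):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            for n in neighbours(queue.popleft()):
                if n in core and n not in cell_label:
                    cell_label[n] = next_label
                    queue.append(n)
        next_label += 1

    def centroid(cell):
        members = cells[cell]
        return [sum(points[i][d] for i in members) / len(members) for d in range(2)]

    # Edge regions: a point in a non-core cell joins the cluster of the
    # adjacent core cell whose centroid is nearest (partitioning-style step).
    labels = [-1] * len(points)
    for cell, members in cells.items():
        if cell in core:
            for i in members:
                labels[i] = cell_label[cell]
            continue
        near = [(n, centroid(n)) for n in neighbours(cell) if n in core]
        for i in members:
            if near:
                best = min(near, key=lambda nc: sum(
                    (points[i][d] - nc[1][d]) ** 2 for d in range(2)))
                labels[i] = cell_label[best[0]]
    return labels
```

Calling `grid_cluster(points, m=4, r=0.4)` on a list of `(x, y)` tuples returns one integer label per point. Because the core test compares quadrant counts within a cell rather than the cell's total count, a sparse but evenly filled cell can still qualify as core, which is the property the summary attributes to uniformity-based core detection.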