NaGB-DBSCAN: An improved DBSCAN clustering algorithm by natural neighbor and granular-ball
| Published in: | Information Sciences, Vol. 719, p. 122445 |
|---|---|
| Main authors: | , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Inc, 1 November 2025 |
| Subjects: | |
| ISSN: | 0020-0255 |
| Summary: | DBSCAN is a robust density-based clustering algorithm that performs well on noisy and irregular datasets. However, it depends on two parameters (ϵ and m), which are troublesome to tune. Moreover, it scans all data points one by one, resulting in a time complexity of O(N²). To solve these problems, we propose an improved DBSCAN algorithm, named NaGB-DBSCAN, which combines Natural Neighbor and Granular-Ball. Our method has the following advantages: (1) It requires only a single parameter, which makes it significantly easier to set up. (2) It reduces the data processing workload by combining natural neighbor with granular-ball to cover the original dataset, which lowers the time complexity to O(N log N). (3) It effectively handles datasets with heterogeneous density and weak connections, enhancing clustering efficiency and quality by optimizing the process and accurately identifying distinct cluster structures. Finally, we validate the effectiveness of our method on 17 synthetic datasets, 14 real datasets, and 1 immune cell dataset. The results show that NaGB-DBSCAN ranks first in average Purity and NMI scores, with both exceeding 90% on the immune cell dataset. Furthermore, pairwise t-tests confirm the statistical significance of these results.
• We propose a DBSCAN variant based on Natural Neighbor and Granular-Ball. • Replaces ϵ and m with threshold Rt to reduce parameter dependence. • Adaptive granular-ball generation is achieved using the Natural Neighbor. • Effectively handles datasets with complex shapes and varying densities. |
|---|---|
| DOI: | 10.1016/j.ins.2025.122445 |
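
For readers unfamiliar with the baseline that the summary criticizes, the following is a minimal sketch of classical DBSCAN, not of NaGB-DBSCAN itself. It illustrates the two parameters mentioned in the summary (ϵ and m, here called `eps` and `min_pts`) and why a point-by-point scan without a spatial index costs O(N²): each of the N points triggers a linear-time neighborhood query. The function names `region_query` and `dbscan` and the sample points are illustrative assumptions, not taken from the article.

```python
import math

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i].

    This is a plain linear scan, so N queries cost O(N^2) in total.
    """
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Classical DBSCAN; returns one cluster label per point (-1 = noise)."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1           # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue             # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Two small square-shaped groups plus one isolated point (made-up data).
sample = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(sample, eps=1.5, min_pts=3))
```

Both `eps` and `min_pts` must be chosen by hand here, and a poor choice either merges the groups or marks everything as noise; this is the tuning burden that, according to the summary, the single threshold Rt is meant to reduce.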