KR-DBSCAN: A density-based clustering algorithm based on reverse nearest neighbor and influence space
| Published in: | Expert Systems with Applications, Vol. 186, p. 115763 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Elsevier Ltd, 30.12.2021; Elsevier BV |
| Subjects: | |
| ISSN: | 0957-4174, 1873-6793 |
| Summary: | Density-based clustering is one of the most commonly used analysis methods in data mining and machine learning, with the advantage of locating non-ball-shaped clusters without specifying the number of clusters in advance. However, it has notable shortcomings, such as an inability to distinguish adjacent clusters of different densities. We propose a density-based clustering algorithm, KR-DBSCAN, which is based on the reverse nearest neighbor and influence space. The core objects are identified according to the reverse nearest neighborhood, and their influence spaces are determined by calculating the k-nearest neighborhood and reverse nearest neighborhood for each data object under the Euclidean distance metric. In particular, a new cluster expansion condition is defined using the reverse nearest neighborhood and its influence space: when core objects lie within the influence space of a cluster member, they are added to the cluster by breadth-first traversal. As a result, adjacent clusters with different densities are effectively distinguished, and the computational load is substantially reduced. Boundary objects and noise objects are also identified using k-nearest neighbors. KR-DBSCAN is experimentally validated on UCI datasets and some synthetic datasets. Highlights: • We define a new cluster expansion condition for density-based clustering. • We design a new noise removal approach for density-based clustering analysis. • We propose a density-based clustering algorithm, KR-DBSCAN. • KR-DBSCAN is evaluated through extensive experiments. |
|---|---|
| DOI: | 10.1016/j.eswa.2021.115763 |
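
The summary above names the main ingredients of the approach (k-nearest neighbors, reverse nearest neighbors, influence spaces, and breadth-first cluster expansion) without giving their formal definitions. The Python sketch below illustrates how these pieces could fit together. It is not the authors' implementation, and it rests on two assumptions that this record does not confirm: the influence space of a point is taken to be the intersection of its k-nearest and reverse k-nearest neighborhoods, and a point is treated as a core object when it has at least `k` reverse neighbors. The function names are hypothetical, and the boundary and noise handling mentioned in the summary is omitted, so every non-core point simply keeps the label `-1`.

```python
from collections import deque

import numpy as np


def knn_indices(X, k):
    """Row i holds the indices of the k nearest neighbors of point i (Euclidean)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1, kind="stable")[:, :k]


def reverse_knn(knn):
    """RkNN(p): the set of points q that list p among their k nearest neighbors."""
    rnn = [set() for _ in range(len(knn))]
    for q, neighbors in enumerate(knn):
        for p in neighbors:
            rnn[int(p)].add(q)
    return rnn


def cluster_sketch(X, k):
    """Label core objects by breadth-first expansion; non-core points stay -1."""
    n = len(X)
    knn = knn_indices(X, k)
    rnn = reverse_knn(knn)
    # ASSUMPTION: influence space = kNN(p) intersected with RkNN(p).
    influence = [set(int(j) for j in knn[i]) & rnn[i] for i in range(n)]
    # ASSUMPTION: a core object has at least k reverse nearest neighbors.
    core = [len(r) >= k for r in rnn]

    labels = np.full(n, -1)
    cluster_id = 0
    for seed in range(n):
        if not core[seed] or labels[seed] != -1:
            continue
        labels[seed] = cluster_id
        queue = deque([seed])
        while queue:
            p = queue.popleft()
            # Only unlabeled core objects inside p's influence space join.
            for q in influence[p]:
                if core[q] and labels[q] == -1:
                    labels[q] = cluster_id
                    queue.append(q)
        cluster_id += 1
    return labels, core
```

Calling `cluster_sketch(X, k)` on an `(n, d)` NumPy array returns one label per point plus a list of core flags. The pairwise distance matrix makes this sketch quadratic in memory, so it is only suitable for small illustrative datasets.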