KR-DBSCAN: A density-based clustering algorithm based on reverse nearest neighbor and influence space


Bibliographic details
Published in: Expert systems with applications, Vol. 186, p. 115763
Main authors: Hu, Lihua; Liu, Hongkai; Zhang, Jifu; Liu, Aiqin
Format: Journal Article
Language: English
Published: New York, Elsevier Ltd, 30.12.2021
Elsevier BV
ISSN: 0957-4174, 1873-6793
Description
Summary: Density-based clustering is one of the most commonly used analysis methods in data mining and machine learning, with the advantage of locating non-ball-shaped clusters without specifying the number of clusters in advance. However, it has notable shortcomings, such as an inability to distinguish adjacent clusters of different densities. We propose a density-based clustering algorithm, KR-DBSCAN, which is based on the reverse nearest neighbor and influence space. The core objects are identified according to the reverse nearest neighborhood, and their influence spaces are determined by calculating the k-nearest neighborhood and reverse nearest neighborhood for each data object under the Euclidean distance metric. In particular, a new cluster expansion condition is defined using the reverse nearest neighborhood and its influence space, and when the core objects are within their influence spaces, they are added to the cluster by breadth-first traversal. As a result, adjacent clusters with different densities are effectively distinguished, and the computational load is substantially reduced. Boundary objects and noise objects are identified, also using k-nearest neighbors. KR-DBSCAN is experimentally validated on the UCI dataset and some synthetic datasets.
• We define a new cluster expanding condition for the density-based clustering.
• We design a new noise removal approach in the density-based clustering analysis.
• We propose a density-based clustering algorithm KR-DBSCAN.
• KR-DBSCAN is evaluated through extensive experiments.
DOI: 10.1016/j.eswa.2021.115763
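
The summary names the building blocks (k-nearest neighborhood, reverse nearest neighborhood, influence space, breadth-first expansion over core objects) but does not give their exact definitions, so the following is only a rough sketch of the idea and not the authors' algorithm. It makes two assumptions that are not stated in this record: the influence space of an object is taken to be the intersection of its k-nearest and reverse k-nearest neighborhoods, and an object is treated as a core object when it has at least k reverse nearest neighbors. The function names (`knn`, `reverse_knn`, `cluster_sketch`) are hypothetical, and the boundary/noise step mentioned in the summary is omitted: non-core objects are simply left with label -1.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def knn(points, k):
    """Brute-force k-nearest neighbours: nn[i] is a set of k point indices."""
    result = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        result.append(set(others[:k]))
    return result


def reverse_knn(nn):
    """rnn[i] = indices of points that list i among their k nearest neighbours."""
    rnn = [set() for _ in nn]
    for i, neighbours in enumerate(nn):
        for j in neighbours:
            rnn[j].add(i)
    return rnn


def cluster_sketch(points, k):
    """Label core objects by breadth-first expansion; others stay at -1."""
    nn = knn(points, k)
    rnn = reverse_knn(nn)
    n = len(points)
    # ASSUMPTION: influence space = kNN intersected with reverse kNN.
    influence = [nn[i] & rnn[i] for i in range(n)]
    # ASSUMPTION: an object is core if it has at least k reverse neighbours.
    core = [len(rnn[i]) >= k for i in range(n)]

    labels = [-1] * n
    cluster_id = 0
    for seed in range(n):
        if not core[seed] or labels[seed] != -1:
            continue
        labels[seed] = cluster_id
        queue = deque([seed])
        while queue:
            current = queue.popleft()
            # Expand only to unlabelled core objects inside the influence space.
            for j in influence[current]:
                if core[j] and labels[j] == -1:
                    labels[j] = cluster_id
                    queue.append(j)
        cluster_id += 1
    return labels, core
```

For example, calling `cluster_sketch(points, 2)` on two well-separated 2x2 grids of points plus one distant point places each grid in its own cluster and leaves the distant point unlabeled, because no other point counts it among its nearest neighbors. The brute-force neighbor search is quadratic in the number of points and is used here only to keep the sketch self-contained.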