KR-DBSCAN: A density-based clustering algorithm based on reverse nearest neighbor and influence space

Bibliographic details
Published in: Expert systems with applications, Volume 186; p. 115763
Main authors: Hu, Lihua; Liu, Hongkai; Zhang, Jifu; Liu, Aiqin
Format: Journal Article
Language: English
Publication details: New York, Elsevier Ltd, 30.12.2021
Elsevier BV
ISSN: 0957-4174, 1873-6793
Description
Summary: Density-based clustering is one of the most commonly used analysis methods in data mining and machine learning, with the advantage of locating non-ball-shaped clusters without specifying the number of clusters in advance. However, it has notable shortcomings, such as an inability to distinguish adjacent clusters of different densities. We propose a density-based clustering algorithm, KR-DBSCAN, which is based on the reverse nearest neighbor and influence space. The core objects are identified according to the reverse nearest neighborhood, and their influence spaces are determined by calculating the k-nearest neighborhood and reverse nearest neighborhood for each data object under the Euclidean distance metric. In particular, a new cluster expansion condition is defined using the reverse nearest neighborhood and its influence space, and when the core objects are within their influence spaces, they are added to the cluster by breadth-first traversal. As a result, adjacent clusters with different densities are effectively distinguished, and the computational load is substantially reduced. Boundary objects and noise objects are also identified using k-nearest neighbors. KR-DBSCAN is experimentally validated on the UCI dataset and some synthetic datasets.
• We define a new cluster expanding condition for the density-based clustering.
• We design a new noise removal approach in the density-based clustering analysis.
• We propose a density-based clustering algorithm KR-DBSCAN.
• KR-DBSCAN is evaluated through extensive experiments.
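The summary names the building blocks (k-nearest neighborhood, reverse nearest neighborhood, influence space, breadth-first expansion over core objects) but does not give their exact definitions. The following is a minimal Python sketch of how such pieces could fit together, not the authors' implementation. It rests on two assumptions that are not stated in this record: an object is treated as core when it has at least k reverse nearest neighbors, and its influence space is taken as the intersection of its k-nearest and reverse nearest neighborhoods. The function names `knn_indices`, `reverse_knn`, and `cluster_sketch` are hypothetical, and the boundary/noise step mentioned in the summary is left out, so every non-core object simply keeps the label -1.

```python
from collections import deque

import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row (Euclidean, self excluded)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def reverse_knn(knn):
    """rnn[j] is the set of points that list j among their k nearest neighbors."""
    rnn = [set() for _ in range(len(knn))]
    for i, neighbors in enumerate(knn):
        for j in neighbors:
            rnn[int(j)].add(i)
    return rnn


def cluster_sketch(X, k):
    """Label core objects by breadth-first expansion; non-core objects stay -1."""
    X = np.asarray(X, dtype=float)
    knn = knn_indices(X, k)
    rnn = reverse_knn(knn)
    # Assumption: influence space = kNN(p) intersected with RkNN(p).
    influence = [set(int(j) for j in knn[i]) & rnn[i] for i in range(len(X))]
    # Assumption: p is a core object if it has at least k reverse nearest neighbors.
    core = [len(r) >= k for r in rnn]

    labels = np.full(len(X), -1)
    cluster = 0
    for seed in range(len(X)):
        if not core[seed] or labels[seed] != -1:
            continue
        labels[seed] = cluster
        queue = deque([seed])
        while queue:  # breadth-first traversal over core objects
            p = queue.popleft()
            for q in influence[p]:
                if core[q] and labels[q] == -1:
                    labels[q] = cluster
                    queue.append(q)
        cluster += 1
    return labels, core
```

Calling `cluster_sketch(X, k)` on an array of points returns one integer label per point plus the core flags. The pairwise distance matrix makes this sketch quadratic in memory, so it is only suitable for small illustrative datasets.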
DOI: 10.1016/j.eswa.2021.115763