A Graph Adaptive Density Peaks Clustering algorithm for automatic centroid selection and effective aggregation

Bibliographic Details
Published in:Expert systems with applications Vol. 195; p. 116539
Main Authors: Xu, Tengfei; Jiang, Jianhua
Format: Journal Article
Language:English
Published: New York Elsevier Ltd 01.06.2022
Elsevier BV
ISSN:0957-4174, 1873-6793
Description
Summary:Density Peaks Clustering (DPC) is a density-based clustering algorithm that is notably effective at finding density peaks. However, its centroid selection and aggregation steps are sensitive to differences in data shape and density distribution, which can easily lead to poorly chosen centroids and trigger a domino effect. To address this, a Graph Adaptive Density Peaks Clustering algorithm based on graph theory (called GADPC) is proposed to select centroids automatically and aggregate points more effectively. GADPC improves on DPC in two steps. First, the clustering centroids are selected automatically based on the turning angle θ and the graph connectivity of the centroids. Second, the remaining points are aggregated towards the corresponding clustering centroid: under the improved principle, each point is assigned to the closer point that has stronger graph connectivity and higher density. Theoretical analysis and experimental results indicate that GADPC is more feasible and effective than DBSCAN, K-means and DPC on data sets with varying density and non-spherical distributions, such as Jain and Spiral.
•A novel GA-DPC is proposed based on the DPC algorithm and graph theory.
•Centroids are detected automatically instead of being selected manually as in DPC.
•A new aggregation principle is proposed to eliminate the domino effect in DPC.
•Outliers and edge points can be detected easily in GA-DPC.
•The experimental results demonstrate the above advantages of the proposed GA-DPC.
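For orientation, the baseline DPC procedure that the summary says GADPC improves on can be sketched roughly as follows. This is a minimal illustration of standard DPC with a manually supplied cluster count, not the authors' GADPC. The function name `dpc` and the parameters `d_c` (cutoff distance) and `n_clusters` are assumptions made for this example, and the turning angle θ and graph-connectivity criteria are not implemented because this record does not define them.

```python
import math

def dpc(points, d_c, n_clusters):
    """Baseline Density Peaks Clustering sketch (illustrative, not GADPC)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from highest to lowest density (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta: distance to the nearest point that comes earlier in `order`
    # (i.e. has higher density); parent: the index of that point.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: by convention, use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Manual centroid selection: take the n_clusters largest rho * delta.
    centroids = sorted(range(n), key=lambda i: -(rho[i] * delta[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centroids):
        labels[i] = c

    # Aggregation: each remaining point inherits the label of its parent.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centroids
```

In this sketch every non-centroid point simply copies the label of its nearest higher-density neighbour, so one wrong link is passed on to every point chained behind it. That is one way to picture the domino effect mentioned in the summary, and the manual `n_clusters` argument corresponds to the manual centroid selection that the highlights say GA-DPC replaces.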
DOI:10.1016/j.eswa.2022.116539