Adaptive Density Subgraph Clustering

Bibliographic Details
Published in: IEEE Transactions on Computational Social Systems, Vol. 11, no. 4, pp. 5468-5482
Main Authors: Jia, Hongjie; Wu, Yuhao; Mao, Qirong; Li, Yang; Song, Heping
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.08.2024
ISSN: 2329-924X, 2373-7476
Description
Summary: Density peak clustering (DPC) has attracted growing interest over the past decade because it can identify clusters of diverse shapes and is robust to noisy data. Most DPC-based methods, however, have high computational complexity. One way to mitigate this is to use density subgraphs. Nevertheless, density subgraphs may restrict cluster sizes and can produce an excessive number of small clusters. Handling these small clusters effectively, whether by merging or by separating them, to obtain accurate results is a significant challenge, particularly when the number of clusters is unknown. To address these challenges, we propose an adaptive density subgraph clustering algorithm (ADSC). ADSC follows a three-step procedure. First, the high-density regions in the dataset are recognized as density subgraphs based on k-nearest neighbor (KNN) density. Second, an initial clustering is obtained by using an automated mechanism to identify the important density subgraphs and allocate outliers. Last, the initial clustering results are refined adaptively using the cluster self-ensemble technique, yielding the final clustering. The clustering performance of ADSC is evaluated on nineteen benchmark datasets. The experimental results show that ADSC can automatically determine the optimal number of clusters in data with intricate density structure while maintaining high clustering efficiency. Compared with other well-known density clustering algorithms that require the number of clusters to be known in advance, ADSC consistently achieves comparable or superior clustering results.
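Illustrative note: the first step described in the summary, recognizing high-density regions as density subgraphs from KNN density, can be sketched roughly as follows. This is not the authors' ADSC implementation, and the record does not give the paper's formulas. The density definition (inverse mean distance to the k nearest neighbors), the linking rule (each point attaches to its nearest denser neighbor among its k nearest neighbors), and the names knn_density and density_subgraphs are all assumptions made for illustration.

```python
import math


def knn_density(points, k):
    """Return (densities, neighbors) for each point.

    Assumed definition: density = 1 / mean distance to the k nearest
    neighbors. The paper may define KNN density differently.
    """
    n = len(points)
    densities, neighbors = [], []
    for i in range(n):
        # Brute-force distances to every other point, nearest first.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        nearest = dists[:k]
        mean_dist = sum(d for d, _ in nearest) / len(nearest)
        densities.append(1.0 / (mean_dist + 1e-12))
        neighbors.append([j for _, j in nearest])
    return densities, neighbors


def density_subgraphs(points, k):
    """Label each point with the index of its local density peak.

    Assumed rule: a point links to its nearest neighbor (among its k
    nearest) that has strictly higher density; points with no denser
    neighbor are peaks. Links always go uphill in density, so following
    them terminates at a peak.
    """
    densities, neighbors = knn_density(points, k)
    parent = list(range(len(points)))
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:  # neighbors are ordered nearest first
            if densities[j] > densities[i]:
                parent[i] = j
                break

    def peak(i):
        while parent[i] != i:
            i = parent[i]
        return i

    return [peak(i) for i in range(len(points))]


# Two well-separated square blobs, each with a dense center point.
blob_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
blob_b = [(10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels = density_subgraphs(blob_a + blob_b, k=3)
```

With a small k, a rule of this kind tends to split data into many small subgraphs, which is the side effect the summary says ADSC's second and third steps are meant to handle.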
DOI: 10.1109/TCSS.2024.3370669