Adaptive gravitational clustering algorithm integrated with noise detection

Bibliographic Details
Published in: Expert Systems with Applications, Vol. 263, p. 125733
Main Authors: Yang, Juntao; Yang, Lijun; Wang, Wentong; Liu, Tao; Tang, Dongming
Format: Journal Article
Language: English
Published: Elsevier Ltd 05.03.2025
ISSN: 0957-4174
Description
Summary: Clustering analysis is widely used in data mining, image processing, artificial intelligence, and related fields. Traditional approaches rely heavily on manually configured parameters, whose initial values strongly influence the clustering outcome. In addition, they usually consider only the relationship between two individual samples when calculating distances, neglecting the overall structure of the dataset, which can degrade clustering performance. At the same time, many contemporary algorithms are tailored to specific datasets, which makes it difficult to achieve optimal clustering performance on intricate, noisy datasets. To address these limitations, we propose an Adaptive Gravitational Clustering Algorithm Integrated with Noise Detection, called GCIND. Inspired by the law of gravitation, GCIND takes into account the natural neighborhood structure of the entire dataset, adaptively computing the gravitation between data points from their shared neighbors and Euclidean distances. The algorithm first identifies and removes outliers and edge points from the dataset. It then uses gravitation to cluster the remaining core data autonomously. Finally, the removed points are reallocated to their respective clusters. GCIND has four notable advantages: (1) it uses gravitation to build the neighborhood graph, reflecting the overall dataset structure; (2) it is more robust on noisy datasets; (3) it uses adaptive gravitational neighborhood graph clustering, removing manual parameter tuning; (4) it adapts to complex manifold-structured datasets, offering broad applicability. Experiments show that GCIND, without requiring any parameter settings, performs slightly better than the algorithms it is compared against in the study, especially on complex manifold datasets.
• Uses gravitation to build neighborhood graph, reflecting dataset structure.
• Demonstrates strong robustness in handling noisy datasets.
• Employs adaptive clustering, removing the need for manual parameter tuning.
• Adapts to manifold-structured datasets, offering broad applicability.
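
The three stages named in the summary (set aside outliers or edge points, cluster the remaining core data by gravitation, then reassign the removed points) can be illustrated with a toy sketch. This is not the GCIND implementation: the record gives no formulas, so everything specific below is an assumption made for illustration. That includes the gravitation itself (shared-neighbor count divided by squared Euclidean distance), the fixed k-nearest-neighbor graph standing in for the natural neighborhood, the mutual-neighbor linking rule, and the hypothetical parameters `k` and `noise_fraction`.

```python
import math


def pairwise_distances(points):
    """Euclidean distance between every pair of points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def k_nearest(dist, k):
    """For each point, the indices of its k nearest other points."""
    n = len(dist)
    return [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]


def toy_gravitational_clustering(points, k=5, noise_fraction=0.1):
    """Toy three-stage pipeline; k and noise_fraction are hypothetical knobs."""
    n = len(points)
    dist = pairwise_distances(points)
    nbrs = k_nearest(dist, k)
    nbr_sets = [set(row) for row in nbrs]

    def gravitation(i, j):
        # ASSUMED form: more shared neighbors and a shorter distance
        # give a stronger pull. The paper's actual formula may differ.
        shared = len(nbr_sets[i] & nbr_sets[j])
        return (shared + 1) / (dist[i][j] ** 2 + 1e-12)

    # Total pull each point receives from its own neighbors.
    pull = [sum(gravitation(i, j) for j in nbrs[i]) for i in range(n)]

    # Stage 1: set aside the weakest-pull points as outliers/edge points.
    n_removed = int(noise_fraction * n)
    removed = set(sorted(range(n), key=lambda i: pull[i])[:n_removed])

    # Stage 2: cluster core points by linking mutual neighbors
    # (connected components of the mutual k-NN graph on core points).
    labels = [-1] * n
    next_label = 0
    for seed in range(n):
        if seed in removed or labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in nbrs[i]:
                if j not in removed and labels[j] == -1 and i in nbr_sets[j]:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1

    # Stage 3: give each removed point the label of its nearest core point.
    core = [i for i in range(n) if i not in removed]
    for i in removed:
        nearest = min(core, key=lambda c: dist[i][c])
        labels[i] = labels[nearest]

    return labels, removed
```

Calling `toy_gravitational_clustering(points)` on a list of coordinate tuples returns one label per point plus the set of indices that were set aside in the first stage. Note that the summary describes GCIND as needing no parameter settings; the two knobs here exist only because this sketch replaces the adaptive neighborhood with a fixed one.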
DOI: 10.1016/j.eswa.2024.125733