Adaptive gravitational clustering algorithm integrated with noise detection


Bibliographic details
Published in: Expert Systems with Applications, vol. 263, p. 125733
Main authors: Yang, Juntao; Yang, Lijun; Wang, Wentong; Liu, Tao; Tang, Dongming
Format: Journal Article
Language: English
Published: Elsevier Ltd, 5 March 2025
ISSN:0957-4174
Description
Abstract: Clustering analysis is frequently used in data mining, image processing, artificial intelligence, and so on. Traditional approaches heavily rely on manually configured parameters, of which the initial selection exerts a profound influence on the clustering outcomes. In addition, they usually only consider the relationship between two individual samples when calculating distances, neglecting the overall structure of the dataset, which can negatively affect clustering performance. At the same time, many contemporary algorithms are tailored to specific datasets, posing challenges in achieving optimal clustering performance for intricate, noisy datasets. To address these limitations, we propose an Adaptive Gravitational Clustering Algorithm Integrated with Noise Detection called GCIND. Inspired by the law of gravitation, GCIND takes into account the natural neighborhood structure of the entire dataset, adaptively computing the gravitation between data points by leveraging shared neighbors and Euclidean distance relationships. Our algorithm initially identifies and eliminates outliers or edge points in the dataset. It subsequently uses gravitation to autonomously cluster the remaining core data. Finally, the removed data are reallocated to their respective clusters. GCIND has four notable advantages: (1) it uses gravitation to build the neighborhood graph, reflecting the overall dataset structure; (2) it demonstrates stronger robustness in handling noisy datasets; (3) it uses adaptive gravitational neighborhood graph clustering, removing manual parameter tuning; (4) it adapts to complex manifold-structured datasets, offering broad applicability. Experiments have shown that GCIND, without requiring any parameter settings, demonstrates slightly better performance than the algorithms compared in the study, especially when dealing with complex manifold datasets.
• Uses gravitation to build neighborhood graph, reflecting dataset structure.
• Demonstrates strong robustness in handling noisy datasets.
• Employs adaptive clustering, removing the need for manual parameter tuning.
• Adapts to manifold-structured datasets, offering broad applicability.
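
The three stages named in the abstract (remove outliers, cluster the core points by gravitation, reassign the removed points) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives no formulas, so everything specific here is an assumption. The function name `sketch_gravity_clustering`, the fixed neighbour count `k` (the paper derives neighbourhoods adaptively from natural neighbours instead), the noise rule based on mean neighbour distance and `noise_factor`, and the force formula (shared neighbours divided by squared Euclidean distance) are all hypothetical stand-ins chosen only to make the pipeline concrete.

```python
import numpy as np


def sketch_gravity_clustering(X, k=4, noise_factor=2.0):
    """Three-stage sketch: flag noise, cluster core points, reassign noise.

    ASSUMPTIONS (not taken from the paper): the fixed k, the noise rule,
    and the force formula shared / distance**2 are illustrative only.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)

    # Pairwise Euclidean distances; a point is never its own neighbour.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    nbr_sets = [set(row) for row in nn.tolist()]

    # Stage 1 (hypothetical rule): a point is noise when its mean distance
    # to its k neighbours is far above the dataset median.
    mean_knn = np.take_along_axis(d, nn, axis=1).mean(axis=1)
    is_core = mean_knn <= noise_factor * np.median(mean_knn)

    # Stage 2: link core points that attract each other (union-find).
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        if not is_core[i]:
            continue
        for j in nbr_sets[i]:
            if not is_core[j]:
                continue
            shared = len(nbr_sets[i] & nbr_sets[j])
            # Hypothetical gravitation: shared neighbours act as "mass".
            force = shared / (d[i, j] ** 2 + 1e-12)
            if force > 0:
                parent[find(i)] = find(j)

    core_idx = np.flatnonzero(is_core).tolist()
    labels = np.full(n, -1)
    root_to_label = {}
    for i in core_idx:
        labels[i] = root_to_label.setdefault(find(i), len(root_to_label))

    # Stage 3: give each removed point the label of its nearest core point.
    for i in np.flatnonzero(~is_core).tolist():
        nearest = core_idx[int(np.argmin(d[i, core_idx]))]
        labels[i] = labels[nearest]
    return labels, is_core
```

Calling `sketch_gravity_clustering(X)` on an array of shape `(n_samples, n_features)` returns one cluster label per point and a boolean mask of the points treated as core. Unlike GCIND as described, this sketch still exposes `k` and `noise_factor`, so it does not reproduce the parameter-free behaviour claimed in the abstract.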
DOI:10.1016/j.eswa.2024.125733