Adaptive gravitational clustering algorithm integrated with noise detection

Bibliographic details
Published in: Expert Systems with Applications, vol. 263, p. 125733
Main authors: Yang, Juntao; Yang, Lijun; Wang, Wentong; Liu, Tao; Tang, Dongming
Format: Journal Article
Language: English
Published: Elsevier Ltd, 5 March 2025
ISSN:0957-4174
Description
Summary: Clustering analysis is frequently used in data mining, image processing, artificial intelligence, and so on. Traditional approaches heavily rely on manually configured parameters, of which the initial selection exerts a profound influence on the clustering outcomes. In addition, they usually only consider the relationship between two individual samples when calculating distances, neglecting the overall structure of the dataset, which can negatively affect clustering performance. At the same time, many contemporary algorithms are tailored to specific datasets, posing challenges in achieving optimal clustering performance for intricate, noisy datasets. To address these limitations, we propose an Adaptive Gravitational Clustering Algorithm Integrated with Noise Detection called GCIND. Inspired by the law of gravitation, GCIND takes into account the natural neighborhood structure of the entire dataset, adaptively computing the gravitation between data points by leveraging shared neighbors and Euclidean distance relationships. Our algorithm initially identifies and eliminates outliers or edge points in the dataset. It subsequently uses gravitation to autonomously cluster the remaining core data. Finally, the removed data are reallocated to their respective clusters. GCIND has four notable advantages: (1) it uses gravitation to build the neighborhood graph, reflecting the overall dataset structure; (2) it demonstrates stronger robustness in handling noisy datasets; (3) it uses adaptive gravitational neighborhood graph clustering, removing manual parameter tuning; (4) it adapts to complex manifold-structured datasets, offering broad applicability. Experiments have shown that GCIND, without requiring any parameter settings, demonstrates slightly better performance than the algorithms compared in the study, especially when dealing with complex manifold datasets.
• Uses gravitation to build neighborhood graph, reflecting dataset structure.
• Demonstrates strong robustness in handling noisy datasets.
• Employs adaptive clustering, removing the need for manual parameter tuning.
• Adapts to manifold-structured datasets, offering broad applicability.
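
The three stages described in the summary (set aside outliers or edge points, cluster the remaining core data by gravitation, then reallocate the removed points) can be illustrated with a minimal sketch. This is not the authors' GCIND implementation, and the record does not give the actual formulas. Everything specific here is an assumption made for illustration: a fixed `k` stands in for the parameter-free natural neighbor search, the attraction is taken as `(1 + shared neighbors) / distance²`, the weakest `noise_quantile` of points are treated as non-core, core points are grouped by connected components, and all function names are hypothetical.

```python
import numpy as np


def knn(X, k):
    """Pairwise Euclidean distances and the k nearest neighbors of each point."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbor
    return d, np.argsort(d, axis=1)[:, :k]


def gravitation(d, nbrs):
    """Assumed attraction: (1 + number of shared neighbors) / squared distance."""
    n, k = nbrs.shape
    sets = [set(row.tolist()) for row in nbrs]
    G = np.zeros((n, k))
    for i in range(n):
        for c, j in enumerate(nbrs[i]):
            shared = len(sets[i] & sets[j])
            G[i, c] = (1 + shared) / (d[i, j] ** 2 + 1e-12)
    return G


def three_stage_sketch(X, k=5, noise_quantile=0.1):
    """Illustrative three-stage pipeline; k and noise_quantile are assumptions."""
    n = len(X)
    d, nbrs = knn(X, k)
    pull = gravitation(d, nbrs).sum(axis=1)

    # Stage 1: set aside weakly attracted points (outliers or edge points).
    core = pull > np.quantile(pull, noise_quantile)

    # Stage 2: link each core point to its core neighbors (union-find),
    # so every connected component of core points becomes one cluster.
    parent = np.arange(n)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    core_idx = np.flatnonzero(core)
    for i in core_idx:
        for j in nbrs[i]:
            if core[j]:
                parent[find(i)] = find(j)

    labels = np.full(n, -1)
    roots = {}
    for i in core_idx:
        labels[i] = roots.setdefault(find(i), len(roots))

    # Stage 3: give each removed point the label of its nearest core point.
    for i in np.flatnonzero(~core):
        labels[i] = labels[core_idx[np.argmin(d[i, core_idx])]]
    return labels, core
```

On a toy input with two well-separated groups and one distant point, the distant point has by far the weakest pull, so it is set aside in stage 1 and attached to the nearer group in stage 3. The fixed `k` and quantile cut are exactly the kind of manual settings the paper claims to avoid, so the sketch shows only the control flow, not the adaptive behavior.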
DOI:10.1016/j.eswa.2024.125733