An Adaptive Clustering Algorithm Based on Local-Density Peaks for Imbalanced Data Without Parameters
Imbalanced data clustering is a challenging problem in machine learning. The main difficulty is caused by the imbalance in both cluster size and data density distribution. To address this problem, we propose a novel clustering algorithm called LDPI based on local-density peaks in this study. First,...
Uloženo v:
| Vydáno v: | IEEE transactions on knowledge and data engineering Ročník 35; číslo 4; s. 3419 - 3432 |
|---|---|
| Hlavní autoři: | , , |
| Médium: | Journal Article |
| Jazyk: | angličtina |
| Vydáno: |
New York
IEEE
01.04.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Témata: | |
| ISSN: | 1041-4347, 1558-2191 |
| On-line přístup: | Získat plný text |
| Tagy: |
Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
|
| Shrnutí: | Imbalanced data clustering is a challenging problem in machine learning. The main difficulty is caused by the imbalance in both cluster size and data density distribution. To address this problem, we propose a novel clustering algorithm called LDPI based on local-density peaks in this study. First, an initial sub-cluster construction scheme is designed based on a 3-dimensional (3-D) decision graph that can easily detect the initial sub-cluster centers and identify the noise points. Second, a sub-cluster updating strategy is designed, which can automatically identify the false sub-cluster centers and update the initial sub-clusters. Third, a sub-cluster merging scheme is designed, which merges the updated initial sub-clusters into final clusters. Consequently, the proposed algorithm has three advantages: 1) It does not require any input parameters; 2) It can automatically determine the cluster centers and number of clusters; 3) It is suitable for imbalanced datasets and datasets with arbitrary shapes and distributions. The effectiveness of LDPI is demonstrated experimentally and the superiority of LDPI is identified by comparison with 5 state-of-the-art algorithms. |
|---|---|
| Bibliografie: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 1041-4347 1558-2191 |
| DOI: | 10.1109/TKDE.2021.3138962 |