An Efficient Spectral Clustering Algorithm Based on Granular-Ball
| Published in: | IEEE Transactions on Knowledge and Data Engineering, Vol. 35, No. 9, pp. 9743-9753 |
|---|---|
| Main authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE, 1 September 2023. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 1041-4347, 1558-2191 |
| Summary: | Traditional spectral clustering is time- and resource-intensive on large-scale data, which leads to poor clustering results or even failure to cluster. To address this problem, this paper proposes a spectral clustering algorithm based on granular-balls (GBSC). The algorithm changes how the similarity matrix is constructed: building it on granular-balls greatly reduces the size of the matrix and makes its construction more reasonable. Experimental results show that the proposed algorithm achieves a better speedup ratio, lower memory consumption, and stronger noise robustness, while producing clustering results similar to those of the traditional spectral clustering algorithm. Let $m$ be the number of granular-balls and $n$ the number of points in the dataset, with $m \ll n$; the time complexity of GBSC is then $O(m^{3})$. It is proved that GBSC adapts well to large-scale datasets. All code has been released at https://github.com/xjnine/GBSC. |
|---|---|
| DOI: | 10.1109/TKDE.2023.3249475 |
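
The complexity claim in the summary can be made concrete with a rough back-of-the-envelope sketch. It assumes a dense float64 similarity matrix and the usual cubic cost of a dense eigendecomposition for traditional spectral clustering; the sizes `n_points` and `m_balls` and both helper functions are hypothetical illustrations, not values or code from the paper.

```python
def dense_matrix_bytes(size: int, bytes_per_entry: int = 8) -> int:
    """Memory needed for a dense size x size similarity matrix (float64 by default)."""
    return size * size * bytes_per_entry


def cubic_cost_ratio(n: int, m: int) -> float:
    """Ratio n^3 / m^3 between the cubic cost terms for n points and m granular-balls."""
    return (n / m) ** 3


# Hypothetical sizes, chosen only for illustration (not taken from the paper).
n_points = 100_000
m_balls = 1_000

full_bytes = dense_matrix_bytes(n_points)  # n x n matrix over all points
ball_bytes = dense_matrix_bytes(m_balls)   # m x m matrix over granular-balls
ratio = cubic_cost_ratio(n_points, m_balls)

print(f"n x n similarity matrix: {full_bytes / 1e9:.1f} GB")
print(f"m x m similarity matrix: {ball_bytes / 1e6:.1f} MB")
print(f"cubic cost ratio n^3 / m^3: {ratio:.0f}")
```

Under these assumed sizes, the similarity matrix shrinks from tens of gigabytes to a few megabytes, which is consistent with the summary's claim that working on granular-balls with $m \ll n$ reduces both memory use and running time.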