An Efficient Spectral Clustering Algorithm Based on Granular-Ball
| Published in: | IEEE Transactions on Knowledge and Data Engineering, Vol. 35, no. 9, pp. 9743-9753 |
|---|---|
| Main Authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York; IEEE; 01.09.2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN: | 1041-4347, 1558-2191 |
| Summary: | Traditional spectral clustering is time-consuming and resource-intensive on large-scale data, which leads to poor clustering results or even makes clustering infeasible. To address this problem, this paper proposes a spectral clustering algorithm based on granular-balls (GBSC). The algorithm changes how the similarity matrix is constructed: building it on granular-balls greatly reduces its size and makes its construction more reasonable. Experimental results show that the proposed algorithm achieves a better speedup ratio, lower memory consumption, and stronger noise robustness while producing clustering results similar to those of the traditional spectral clustering algorithm. Let $m$ be the number of granular-balls and $n$ the number of points in the dataset, with $m \ll n$; the time complexity of GBSC is then $O(m^{3})$. This indicates that GBSC adapts well to large-scale datasets. All code has been released at https://github.com/xjnine/GBSC. |
| DOI: | 10.1109/TKDE.2023.3249475 |
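
The summary states that GBSC builds the similarity matrix over $m$ granular-balls instead of $n$ points, so the eigendecomposition runs on an $m \times m$ matrix. The following is a minimal sketch of that coarsen-then-cluster idea, not the authors' implementation. It rests on several assumptions that are not in the record: plain k-means centroids serve as a hypothetical stand-in for granular-ball generation (the actual procedure is in the repository linked in the summary), the similarity is a Gaussian kernel with an assumed bandwidth `sigma`, and the data are split into only two clusters by the sign of the Fiedler vector. The function names `coarsen` and `spectral_bipartition` are illustrative.

```python
import numpy as np


def coarsen(points, m, n_iter=10, seed=0):
    """Summarize n points by m representatives.

    ASSUMPTION: Lloyd's k-means is used here only as a stand-in for
    granular-ball generation, which this record does not describe.
    """
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=m, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distances between every point and every center: shape (n, m).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for j in range(m):
            members = points[assign == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)


def spectral_bipartition(reps, sigma=1.0):
    """Two-way spectral split of the m representatives.

    The similarity matrix is m x m, so the dense eigendecomposition
    below costs O(m^3) rather than O(n^3).
    """
    d2 = ((reps[:, None, :] - reps[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))  # assumed Gaussian similarity
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(reps)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]  # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)


# Toy data: two Gaussian blobs, n = 400 points summarized by m = 20 representatives.
rng = np.random.default_rng(42)
n_half, m = 200, 20
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(n_half, 2))
blob_b = rng.normal(loc=[5.0, 0.0], scale=0.5, size=(n_half, 2))
points = np.vstack([blob_a, blob_b])

reps, assign = coarsen(points, m)
rep_labels = spectral_bipartition(reps)
# Each original point inherits the label of its representative.
point_labels = rep_labels[assign]
```

In this sketch the only step that touches all $n$ points is the coarsening; the spectral step sees just the $m$ representatives, which is where the reduction in time and memory described in the summary would come from.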