An Efficient Spectral Clustering Algorithm Based on Granular-Ball

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 35, no. 9, pp. 9743-9753
Main Authors: Xie, Jiang; Kong, Weiyu; Xia, Shuyin; Wang, Guoyin; Gao, Xinbo
Format: Journal Article
Language: English
Published: New York IEEE 01.09.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1041-4347, 1558-2191
Description
Summary: Traditional spectral clustering is time-consuming and memory-intensive when applied to large-scale data, which degrades clustering quality or makes clustering infeasible altogether. To address this problem, this paper proposes a spectral clustering algorithm based on granular-balls (GBSC). The algorithm changes how the similarity matrix is constructed: building it on granular-balls greatly reduces the size of the matrix and makes its construction more reasonable. Experimental results show that the proposed algorithm achieves a better speedup ratio, lower memory consumption, and stronger noise robustness while producing clustering results similar to those of the traditional spectral clustering algorithm. Let $m$ be the number of granular-balls and $n$ the number of points in the dataset, with $m \ll n$; the time complexity of GBSC is then $O(m^{3})$. This shows that GBSC adapts well to large-scale datasets. All code has been released at https://github.com/xjnine/GBSC.
DOI: 10.1109/TKDE.2023.3249475
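
The scaling argument in the summary can be illustrated with a small back-of-the-envelope sketch. It is not taken from the paper or its released code. It assumes a dense similarity matrix stored as 8-byte floats and a cubic eigendecomposition that dominates the running time, so that a point-level matrix costs on the order of n^3 and a granular-ball-level matrix on the order of m^3. The function names and the values of n and m are hypothetical and chosen only for illustration.

```python
def dense_matrix_bytes(size: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed for a dense size x size similarity matrix.

    Assumption: every entry is stored, 8 bytes each (float64).
    """
    return size * size * bytes_per_entry


def cubic_cost_ratio(n: int, m: int) -> float:
    """Ratio n^3 / m^3.

    Assumption: the cubic eigendecomposition step dominates, so this
    approximates how much cheaper an m x m problem is than an n x n one.
    """
    return (n / m) ** 3


# Hypothetical sizes, not figures from the paper.
n = 100_000  # number of data points
m = 1_000    # number of granular-balls, with m << n

point_level_gb = dense_matrix_bytes(n) / 1e9  # similarity matrix over points
ball_level_mb = dense_matrix_bytes(m) / 1e6   # similarity matrix over granular-balls

print(f"n x n matrix: {point_level_gb:.1f} GB")
print(f"m x m matrix: {ball_level_mb:.1f} MB")
print(f"cubic cost ratio: {cubic_cost_ratio(n, m):.0f}")
```

Under these assumptions the point-level matrix needs 80 GB while the granular-ball-level matrix needs 8 MB, which is why the reduction from n to m matters more than any constant-factor optimization. Actual savings depend on how many granular-balls the algorithm generates for a given dataset, which the summary does not specify.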