Weighted Graph Cuts without Eigenvectors: A Multilevel Approach

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 11, pp. 1944-1957
Main authors: Dhillon, I.S.; Guan, Yuqiang; Kulis, B.
Format: Journal Article
Language: English
Publication details: Los Alamitos, CA: IEEE, 1 November 2007
IEEE Computer Society
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0162-8828, 1939-3539
Description
Summary: A variety of clustering algorithms have recently been proposed to handle data that is not linearly separable; spectral clustering and kernel k-means are two of the main methods. In this paper, we discuss an equivalence between the objective functions used in these seemingly different methods. In particular, a general weighted kernel k-means objective is mathematically equivalent to a weighted graph clustering objective. We exploit this equivalence to develop a fast, high-quality multilevel algorithm that directly optimizes various weighted graph clustering objectives, such as the popular ratio cut, normalized cut, and ratio association criteria. This eliminates the need for any eigenvector computation in graph clustering problems, a cost that can be prohibitive for very large graphs. Previous multilevel graph partitioning methods such as Metis have suffered from the restriction of equal-sized clusters; our multilevel algorithm removes this restriction by using kernel k-means to optimize weighted graph cuts. Experimental results show that our multilevel algorithm outperforms a state-of-the-art spectral clustering algorithm in terms of speed, memory usage, and quality. We demonstrate that our algorithm is applicable to large-scale clustering tasks such as image segmentation, social network analysis, and gene network analysis.
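The equivalence mentioned in the summary can be checked numerically on a small graph. The following is a minimal sketch, not the paper's implementation. It assumes the normalized-cut instance in which node weights are the node degrees and the kernel matrix is K = sigma * D^-1 + D^-1 A D^-1 (A the adjacency matrix, D the diagonal degree matrix). That construction, the shift `sigma`, the toy graph, and all function names are illustrative assumptions that do not appear in this record. Under these assumptions the weighted kernel k-means objective J and the normalized cut should differ only by a constant for a fixed number of clusters k: J = NCut + sigma * (n - k) - k.

```python
from itertools import product

# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
N = 6


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix as a list of lists."""
    a = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    return a


def normalized_cut(a, labels, k):
    """Sum over clusters of cut(V_c, rest of graph) / degree(V_c)."""
    n = len(a)
    total = 0.0
    for c in range(k):
        members = [i for i in range(n) if labels[i] == c]
        degree = sum(a[i][j] for i in members for j in range(n))
        inside = sum(a[i][j] for i in members for j in members)
        total += (degree - inside) / degree
    return total


def ncut_kernel(a, sigma):
    """Assumed kernel K = sigma * D^-1 + D^-1 A D^-1 and node weights w = degrees."""
    n = len(a)
    d = [sum(row) for row in a]
    kern = [[a[i][j] / (d[i] * d[j]) + (sigma / d[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    return kern, d


def weighted_kernel_kmeans_objective(kern, w, labels, k):
    """Sum_c sum_{i in c} w_i * ||phi(i) - m_c||^2, using kernel entries only."""
    n = len(w)
    total = 0.0
    for c in range(k):
        members = [i for i in range(n) if labels[i] == c]
        s = sum(w[i] for i in members)
        diag = sum(w[i] * kern[i][i] for i in members)
        cross = sum(w[i] * w[j] * kern[i][j] for i in members for j in members)
        total += diag - cross / s
    return total


def max_gap(sigma=1.0, k=2):
    """Largest deviation from J = NCut + sigma * (n - k) - k over all k-way partitions."""
    a = adjacency(N, EDGES)
    kern, w = ncut_kernel(a, sigma)
    gap = 0.0
    for labels in product(range(k), repeat=N):
        if len(set(labels)) < k:  # skip labelings that leave a cluster empty
            continue
        j = weighted_kernel_kmeans_objective(kern, w, labels, k)
        ncut = normalized_cut(a, labels, k)
        gap = max(gap, abs(j - (ncut + sigma * (N - k) - k)))
    return gap
```

Calling `max_gap()` enumerates every 2-way partition of the toy graph and should return a value at floating-point noise level. Since the two objectives then differ only by a constant, any partition that lowers one lowers the other, which is consistent with the summary's point that kernel k-means updates can optimize a graph cut without an eigenvector computation.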
DOI: 10.1109/TPAMI.2007.1115