Entropy-Rate Clustering: Cluster Analysis via Maximizing a Submodular Function Subject to a Matroid Constraint

We propose a new objective function for clustering. This objective function consists of two components: the entropy rate of a random walk on a graph and a balancing term. The entropy rate favors formation of compact and homogeneous clusters, while the balancing function encourages clusters with simi...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:IEEE transactions on pattern analysis and machine intelligence Ročník 36; číslo 1; s. 99 - 112
Hlavní autoři: Ming-Yu Liu, Tuzel, Oncel, Ramalingam, Srikumar, Chellappa, Rama
Médium: Journal Article
Jazyk:angličtina
Vydáno: Los Alamitos, CA IEEE 01.01.2014
IEEE Computer Society
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Témata:
ISSN:0162-8828, 1939-3539, 2160-9292, 1939-3539
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:We propose a new objective function for clustering. This objective function consists of two components: the entropy rate of a random walk on a graph and a balancing term. The entropy rate favors formation of compact and homogeneous clusters, while the balancing function encourages clusters with similar sizes and penalizes larger clusters that aggressively group samples. We present a novel graph construction for the graph associated with the data and show that this construction induces a matroid--a combinatorial structure that generalizes the concept of linear independence in vector spaces. The clustering result is given by the graph topology that maximizes the objective function under the matroid constraint. By exploiting the submodular and monotonic properties of the objective function, we develop an efficient greedy algorithm. Furthermore, we prove an approximation bound of 1/2 for the optimality of the greedy solution. We validate the proposed algorithm on various benchmarks and show its competitive performances with respect to popular clustering algorithms. We further apply it for the task of superpixel segmentation. Experiments on the Berkeley segmentation data set reveal its superior performances over the state-of-the-art superpixel segmentation algorithms in all the standard evaluation metrics.
Bibliografie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ObjectType-Article-2
ObjectType-Feature-1
ISSN:0162-8828
1939-3539
2160-9292
1939-3539
DOI:10.1109/TPAMI.2013.107