k-CEVCLUS: Constrained evidential clustering of large dissimilarity data

Bibliographic details
Published in:	Knowledge-Based Systems, Vol. 142, pp. 29-44
Main authors:	Li, Feng; Li, Shoumei; Denœux, Thierry
Format:	Journal Article
Language:	English
Publication details:	Amsterdam: Elsevier B.V., 15.02.2018; Elsevier Science Ltd; Elsevier
Subjects:	Algorithms; Artificial Intelligence; Belief functions; Cluster analysis; Clustering; Computer Science; Constrained clustering; Constraints; Credal partition; Datasets; Dempster–Shafer theory; Evidence theory; Evidentiality; Expert systems; Instance-level constraints; Membership; Relational data; Relational data bases; Storage; Uncertainty
ISSN:	0950-7051, 1872-7409

Description
Summary:	In evidential clustering, cluster-membership uncertainty is represented by Dempster–Shafer mass functions. The EVCLUS algorithm is an evidential clustering procedure for dissimilarity data, based on the assumption that similar objects should be assigned mass functions with a low degree of conflict. CEVCLUS is a version of EVCLUS that allows one to use prior information on cluster membership, in the form of pairwise must-link and cannot-link constraints. The original CEVCLUS algorithm was shown to perform very well, but it was quite slow and limited to small datasets. In this paper, we introduce a much faster and more efficient version of CEVCLUS, called k-CEVCLUS, which is both several orders of magnitude faster than EVCLUS and has storage and computational complexity linear in the number of objects, making it applicable to large datasets (around 10^4 objects). We also propose a new constraint expansion strategy, yielding drastic improvements in clustering results when only a few constraints are given.
DOI:	10.1016/j.knosys.2017.11.023
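
Illustration:	The summary relies on the notion of a degree of conflict between two mass functions. The sketch below is not taken from the paper; it assumes the standard Dempster–Shafer definition, in which the conflict is the sum of m1(A) * m2(B) over all pairs of focal sets A and B with empty intersection. The function name `degree_of_conflict`, the dictionary representation of mass functions, and the example values are all hypothetical choices made for illustration.

```python
from itertools import product


def degree_of_conflict(m1, m2):
    """Sum m1(A) * m2(B) over all pairs of focal sets with empty intersection.

    Each mass function is a dict mapping a frozenset of cluster labels
    to its mass (an assumed representation, not the paper's).
    """
    return sum(
        mass_a * mass_b
        for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items())
        if not (a & b)
    )


# Two clusters, labelled 1 and 2; omega is the whole frame (total ignorance).
omega = frozenset({1, 2})

# Hypothetical mass functions for three objects i, j, k.
m_i = {frozenset({1}): 0.8, omega: 0.2}
m_j = {frozenset({2}): 0.6, omega: 0.4}
m_k = {frozenset({1}): 0.9, omega: 0.1}

# Objects i and j mostly support different clusters: conflict is 0.8 * 0.6.
conflict_ij = degree_of_conflict(m_i, m_j)

# Objects i and k support the same cluster: no pair of focal sets is disjoint.
conflict_ik = degree_of_conflict(m_i, m_k)

print(conflict_ij, conflict_ik)
```

Under the assumption stated in the summary, a similar pair of objects would be expected to look like i and k (low conflict), while a dissimilar pair, or one joined by a cannot-link constraint, would look more like i and j.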