CURE-NS: a hierarchical clustering algorithm with new shrinking scheme

Bibliographic details
Published in: International Conference on Machine Learning and Cybernetics 2002, Vol. 2, pp. 895-899
Main authors: Yun-Tao Qian, Qing-Song Shi, Qi Wang
Format: Conference paper
Language: English
Publisher: IEEE, 2002
ISBN: 9780780375086, 0780375084
Description
Summary: CURE (clustering using representatives) is an efficient clustering algorithm for large databases. It is more robust to outliers than other clustering methods and identifies clusters with non-spherical shapes and wide variances in size. CURE describes each cluster with a fixed number of representative points; the set of representative points is first chosen randomly and then shrunk toward the mean of the cluster. The shrinking operation plays a key role in CURE because it weakens the effect of outliers. However, we found that CURE's shrinking scheme depends on a hidden assumption that clusters are spherical, so CURE has difficulty with databases whose clusters have certain specific shapes. In this paper, CURE-NS (CURE with new shrinking scheme) is proposed to overcome this problem; it uses the difference between the density values of the representative points to determine the direction and distance of shrinking. The new shrinking scheme does not depend on the shape of the cluster. A range of experiments demonstrates that CURE-NS achieves better clustering performance than CURE.
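
The shrinking step of the original CURE scheme, as the summary describes it, can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the shrink factor `alpha`, and the number of representatives `c` are assumptions introduced here. The density-based rule of CURE-NS is not specified in the summary, so it is deliberately not shown.

```python
import numpy as np

def pick_representatives(points, c, seed=0):
    """Choose c representative points at random (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(points), size=c, replace=False)
    return points[idx]

def shrink_toward_mean(reps, points, alpha):
    """Move each representative a fraction alpha of the way to the cluster mean.

    alpha is an assumed shrink factor in [0, 1]; alpha = 0 leaves the
    representatives in place and alpha = 1 collapses them onto the mean.
    """
    centroid = points.mean(axis=0)
    return reps + alpha * (centroid - reps)

# Toy cluster: the four corners of a square, whose mean is (2, 2).
points = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]])

# Fixed representatives, so the result is easy to check by hand.
reps = np.array([[0.0, 0.0], [4.0, 4.0]])
shrunk = shrink_toward_mean(reps, points, alpha=0.5)

# Randomly chosen representatives, as in the summary's description.
random_reps = pick_representatives(points, c=2)
random_shrunk = shrink_toward_mean(random_reps, points, alpha=0.5)
```

Every representative moves along a straight line toward the same mean, regardless of how the cluster is laid out, which is consistent with the summary's remark that this scheme carries a hidden assumption of spherical clusters.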
DOI: 10.1109/ICMLC.2002.1174512