CURE-NS: a hierarchical clustering algorithm with new shrinking scheme

Bibliographic details
Published in: International Conference on Machine Learning and Cybernetics 2002, Volume 2, pp. 895-899
Main authors: Yun-Tao Qian, Qing-Song Shi, Qi Wang
Format: Conference paper
Language: English
Published: IEEE 2002
ISBN: 9780780375086, 0780375084
Description
Summary: CURE (clustering using representatives) is an efficient clustering algorithm for large databases. It is more robust to outliers than other clustering methods and identifies clusters that have non-spherical shapes and wide variances in size. CURE employs a fixed number of representative points to describe each cluster; the representative points are first chosen randomly and are then shrunk toward the mean of the cluster. The shrinking operation plays a key role in CURE because it weakens the effect of outliers. However, we found that the shrinking scheme of CURE depends on a hidden assumption that the cluster is spherical, so CURE has difficulty dealing with databases whose clusters have specific shapes. In this paper, CURE-NS (CURE with new shrinking scheme) is proposed to overcome this problem. It uses the difference of the density values of the representative points to determine the direction and distance of shrinking, so the shrinking scheme is independent of the shape of the cluster. A range of experiments demonstrates that CURE-NS has better clustering performance than CURE.
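
The shrinking step that the summary attributes to the original CURE can be illustrated with a minimal Python sketch. It is not the authors' code, and it does not attempt the density-based scheme of CURE-NS, which this record does not specify. The function names, the shrink factor `alpha`, and the ring-shaped toy data are assumptions made for illustration; the representatives are picked deterministically here only so that the result is reproducible.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of equal-length coordinate tuples."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def shrink_toward_mean(representatives, cluster_points, alpha):
    """Move each representative a fraction alpha of the way to the cluster mean.

    alpha is an assumed name for the shrink factor:
    0 leaves the points in place, 1 collapses them onto the mean.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    mean = centroid(cluster_points)
    return [
        tuple(r + alpha * (m - r) for r, m in zip(rep, mean))
        for rep in representatives
    ]


# Toy ring-shaped cluster (assumed data): 12 points on the unit circle.
ring = [
    (math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
    for k in range(12)
]

# Four representatives taken from the ring itself.
reps = ring[::3]

shrunk = shrink_toward_mean(reps, ring, alpha=0.5)

# Distance of each shrunk representative from the ring's center.
radii = [math.hypot(x, y) for x, y in shrunk]
```

In this toy case the mean of the ring is its center, so every shrunk representative ends up at roughly half the ring's radius, inside the empty region where no data points lie. This is one way to picture the shape dependence that the summary describes.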
DOI: 10.1109/ICMLC.2002.1174512