CURE-NS: a hierarchical clustering algorithm with new shrinking scheme

Bibliographic Details
Published in: International Conference on Machine Learning and Cybernetics 2002, Vol. 2, pp. 895-899
Main Authors: Yun-Tao Qian, Qing-Song Shi, Qi Wang
Format: Conference Proceeding
Language: English
Published: IEEE 2002
ISBN:9780780375086, 0780375084
Description
Summary: CURE (clustering using representatives) is an efficient clustering algorithm for large databases. It is more robust to outliers than other clustering methods and identifies clusters with non-spherical shapes and wide variances in size. CURE describes each cluster with a fixed number of representative points, which are first chosen randomly and then shrunk toward the cluster mean. The shrinking operation plays a key role in CURE because it weakens the effect of outliers. However, we found that CURE's shrinking scheme depends on a hidden assumption that clusters are spherical, so CURE has difficulty with databases containing clusters of specific shapes. In this paper, CURE-NS (CURE with new shrinking scheme) is proposed to overcome this problem. It uses the differences in density values among the representative points to determine the direction and distance of shrinking, so the scheme does not depend on the shape of the cluster. A range of experiments demonstrates that CURE-NS has better clustering performance than CURE.
DOI:10.1109/ICMLC.2002.1174512
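
Illustration: the summary describes CURE's shrinking step as moving each representative point toward the cluster mean. A minimal sketch of that step follows, assuming a single shrink factor `alpha` between 0 and 1; the function names and the sample data are hypothetical and are not taken from the paper. Because every point is pulled along the straight line to the mean by the same fraction, the step treats the mean as the natural center of the cluster, which is the spherical-shape assumption the summary refers to. The density-based scheme of CURE-NS is not specified in this record, so it is not shown.

```python
from typing import List, Sequence

Point = Sequence[float]


def centroid(points: Sequence[Point]) -> List[float]:
    """Return the coordinate-wise mean of a non-empty set of points."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]


def shrink_representatives(
    reps: Sequence[Point], mean: Point, alpha: float
) -> List[List[float]]:
    """Move each representative a fraction `alpha` of the way toward `mean`.

    alpha = 0 leaves the points unchanged; alpha = 1 collapses them onto the mean.
    (`alpha` is an assumed parameter name for the shrink factor.)
    """
    return [
        [r[d] + alpha * (mean[d] - r[d]) for d in range(len(mean))]
        for r in reps
    ]


if __name__ == "__main__":
    # Hypothetical toy cluster: the four corners of a square.
    cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
    reps = [(0.0, 0.0), (4.0, 4.0)]  # two representatives picked for illustration
    mean = centroid(cluster)
    print(shrink_representatives(reps, mean, alpha=0.5))
```

With a larger `alpha` the representatives sit closer to the mean, which dampens outliers but also pulls points away from the boundary of an elongated or irregular cluster.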