SVStream: A Support Vector-Based Algorithm for Clustering Data Streams


Bibliographic details
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 25, No. 6, pp. 1410-1424
Main authors: Wang, Chang-Dong; Lai, Jian-Huang; Huang, Dong; Zheng, Wei-Shi
Format: Journal Article
Language: English
Published: IEEE, 1 June 2013
ISSN:1041-4347
Description
Summary: In this paper, we propose a novel data stream clustering algorithm, termed SVStream, which is based on support vector domain description and support vector clustering. In the proposed algorithm, the data elements of a stream are mapped into a kernel space, and the support vectors are used as the summary information of the historical elements to construct cluster boundaries of arbitrary shape. To adapt to both dramatic and gradual changes, multiple spheres are dynamically maintained, each describing the corresponding data domain presented in the data stream. By allowing for bounded support vectors (BSVs), the proposed SVStream algorithm is capable of identifying overlapping clusters. A BSV decaying mechanism is designed to automatically detect and remove outliers (noise). We perform experiments over synthetic and real data streams, with the overlapping, evolving, and noise situations taken into consideration. Comparison results with state-of-the-art data stream clustering methods demonstrate the effectiveness and efficiency of the proposed method.
DOI:10.1109/TKDE.2011.263
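
Illustrative note: the summary describes a sphere in kernel space whose center is summarized by support vectors. The sketch below is not the SVStream algorithm itself; it only shows the standard support vector domain description distance from a point to such a sphere center. The Gaussian kernel, its width `q`, the example support vectors `sv`, and the coefficients `beta` are all assumptions made for illustration; in practice the coefficients would come from solving the underlying optimization problem, which is omitted here.

```python
import numpy as np

def gaussian_kernel(a, b, q=1.0):
    """Gaussian kernel K(a, b) = exp(-q * ||a - b||^2) for all row pairs.
    The kernel choice and the width q are assumptions for this sketch."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-q * sq_dist)

def squared_kernel_distance(x, sv, beta, q=1.0):
    """Squared kernel-space distance from each row of x to the sphere
    center a = sum_j beta_j * phi(sv_j):
        K(x, x) - 2 * sum_j beta_j K(sv_j, x) + sum_ij beta_i beta_j K(sv_i, sv_j)
    """
    k_xx = 1.0                          # K(x, x) = 1 for the Gaussian kernel
    k_xs = gaussian_kernel(x, sv, q)    # shape (n_points, n_sv)
    k_ss = gaussian_kernel(sv, sv, q)   # shape (n_sv, n_sv)
    return k_xx - 2.0 * (k_xs @ beta) + beta @ k_ss @ beta

# Hypothetical summary: two support vectors with equal weights (sum to 1).
sv = np.array([[0.0, 0.0], [1.0, 0.0]])
beta = np.array([0.5, 0.5])

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.0], [10.0, 10.0]])
dist2 = squared_kernel_distance(points, sv, beta)

# Points whose distance exceeds the sphere radius lie outside the boundary.
# Here the radius is taken as the distance of a support vector to the center.
radius2 = dist2[0]
inside = dist2 <= radius2 + 1e-12
print(inside)
```

In this toy setup the two support vectors sit on the boundary, the midpoint between them falls inside, and the distant point falls outside, which is the sense in which support vectors alone can summarize a cluster boundary.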