Reusable components for partitioning clustering algorithms

Bibliographic Details
Published in:	The Artificial intelligence review, Vol. 32, No. 1-4, pp. 59-75
Main Authors:	Delibašić, Boris; Kirchner, Kathrin; Ruhland, Johannes; Jovanović, Miloš; Vukićević, Milan
Format:	Journal Article
Language:	English
Published:	Dordrecht: Springer Netherlands, 1 December 2009; Springer Nature B.V
Subjects:	Algorithms; Artificial Intelligence; Clustering; Components; Computer Science; Data mining; Design; Literature reviews; Machine learning; Partitioning; Principal components analysis; Reusable components; Software engineering; K-means; Kohonen SOM; Partitioning clustering; X-means; Generic; MPCK-means; Cluster algorithm; Reusable component
ISSN:	0269-2821, 1573-7462

Description
Summary:	Clustering algorithms are well-established and widely used for solving data-mining tasks. Every clustering algorithm is composed of several solutions for specific sub-problems in the clustering process. These solutions are linked together in a clustering algorithm, and they define the process and the structure of the algorithm. Frequently, many of these solutions occur in more than one clustering algorithm. Mostly, new clustering algorithms include frequently occurring solutions to typical sub-problems from clustering, as well as from other machine-learning algorithms. The problem is that these solutions are usually integrated in their algorithms, and that the original algorithms are not designed to share solutions to sub-problems easily outside the original algorithm. We propose a way of designing cluster algorithms, and of improving existing ones, based on reusable components. Reusable components are well-documented, frequently occurring solutions to specific sub-problems in a specific area. Thus we identify reusable components, first, as solutions to characteristic sub-problems in partitioning cluster algorithms, and, further, identify a generic structure for the design of partitioning cluster algorithms. We analyze some partitioning algorithms (K-means, X-means, MPCK-means, and Kohonen SOM), and identify reusable components in them. We give examples of how new cluster algorithms can be designed based on them.
DOI:	10.1007/s10462-009-9133-6
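
The idea in the summary, a generic partitioning structure whose sub-problems are solved by interchangeable components, can be illustrated with a minimal sketch. This is not the authors' design: the component roles chosen here (initialization, distance measure, representative update) and every function name are assumptions made for illustration only, and the stopping rule is simply "assignments no longer change".

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]

# --- Hypothetical reusable components (names are illustrative only) ---

def euclidean(a: Point, b: Point) -> float:
    """Distance component: straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a: Point, b: Point) -> float:
    """Distance component: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def random_init(data: List[Point], k: int, rng: random.Random) -> List[List[float]]:
    """Initialization component: pick k distinct data points at random."""
    return [list(p) for p in rng.sample(list(data), k)]

def first_k_init(data: List[Point], k: int, rng: random.Random) -> List[List[float]]:
    """Initialization component: take the first k points (deterministic)."""
    return [list(p) for p in data[:k]]

def mean_update(members: List[Point], old: List[float]) -> List[float]:
    """Update component: coordinate-wise mean; keep the old center if empty."""
    if not members:
        return list(old)
    n = len(members)
    return [sum(col) / n for col in zip(*members)]

# --- Generic structure that links the components together ---

def partition(
    data: List[Point],
    k: int,
    init: Callable = random_init,
    distance: Callable[[Point, Point], float] = euclidean,
    update: Callable = mean_update,
    max_iter: int = 100,
    seed: int = 0,
) -> Tuple[List[int], List[List[float]]]:
    rng = random.Random(seed)
    centers = init(data, k, rng)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assign every point to its nearest center under the chosen distance.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centers[j])) for p in data
        ]
        if new_labels == labels:  # stop when assignments no longer change
            break
        labels = new_labels
        # Recompute each center from its current members.
        centers = [
            update([p for p, l in zip(data, labels) if l == j], centers[j])
            for j in range(k)
        ]
    return labels, centers

if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    # Swapping a component changes the algorithm without touching the loop.
    print(partition(points, 2, init=first_k_init, distance=manhattan))
```

With mean updates and Euclidean distance the loop behaves like a plain K-means; passing a different `init`, `distance`, or `update` function yields a different variant from the same generic structure, which is the kind of reuse the summary describes.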