Automated variable weighting in k-means type clustering

Bibliographic details
Published in:	IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 5, pp. 657-668
Main authors:	Huang, J.Z., Ng, M.K., Hongqiang Rong, Zichen Li
Format:	Journal Article
Language:	English
Publication details:	Los Alamitos, CA: IEEE, 1 May 2005; IEEE Computer Society; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Additives | Algorithms | Applied sciences | Artificial Intelligence | Cluster Analysis | Clustering | Clustering algorithms | Clustering methods | Clusters | Computer science; control theory; systems | Computer Simulation | Cost function | Data mining | Exact sciences and technology | Feature evaluation and selection | Information Storage and Retrieval - methods | Input variables | Intelligence | Iterative algorithms | K-means algorithm | Mathematical analysis | Mining methods and algorithms | Models, Statistical | Noise reduction | Numerical Analysis, Computer-Assisted | Partitioning algorithms | Pattern analysis | Pattern Recognition, Automated - methods | Pattern recognition. Digital image processing. Computational geometry | Recovering | Reproducibility of Results | Sensitivity and Specificity | Signal Processing, Computer-Assisted
ISSN:	0162-8828, 1939-3539

Description
Summary:	This paper proposes a k-means type clustering algorithm that can automatically calculate variable weights. A new step is introduced to the k-means clustering process to iteratively update variable weights based on the current partition of data, and a formula for weight calculation is proposed. The convergence theorem of the new clustering process is given. The variable weights produced by the algorithm measure the importance of variables in clustering and can be used in variable selection in data mining applications where large and complex real data are often involved. Experimental results on both synthetic and real data have shown that the new algorithm outperformed the standard k-means type algorithms in recovering clusters in data.
DOI:	10.1109/TPAMI.2005.95
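
Illustrative sketch (not part of the catalog record): the summary describes a k-means loop with one extra step that recomputes variable weights from the current partition. The record itself does not state the weight formula, so the rule used below is an assumption: each weight is inversely related to that variable's total within-cluster dispersion D_j, via w_j = 1 / sum_t (D_j / D_t)^(1/(beta-1)), with w_j = 0 when D_j = 0. The names `weighted_kmeans`, `update_weights`, `beta`, and `init_centers` are hypothetical and chosen only for this example.

```python
import numpy as np


def update_weights(D, beta):
    """Assumed weight rule: small within-cluster dispersion -> large weight.

    D[j] is the summed squared deviation of variable j from its cluster
    centers. Requires beta > 1. Variables with D[j] == 0 get weight 0.
    """
    D = np.asarray(D, dtype=float)
    weights = np.zeros_like(D)
    nz = D > 0
    if not nz.any():
        # Degenerate case: no dispersion anywhere, fall back to equal weights.
        return np.full(D.shape, 1.0 / D.size)
    ratios = (D[nz][:, None] / D[nz][None, :]) ** (1.0 / (beta - 1.0))
    weights[nz] = 1.0 / ratios.sum(axis=1)
    return weights


def weighted_kmeans(X, k, beta=2.0, n_iter=50, seed=0, init_centers=None):
    """Sketch of k-means with an added variable-weight update step."""
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    rng = np.random.default_rng(seed)
    if init_centers is None:
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init_centers, dtype=float).copy()
    weights = np.full(m, 1.0 / m)  # start with equal weights
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Step 1: assign points using the weighted squared distance.
        sq = (X[:, None, :] - centers[None, :, :]) ** 2  # shape (n, k, m)
        dist = (sq * weights ** beta).sum(axis=2)         # shape (n, k)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition unchanged: stop
        labels = new_labels

        # Step 2: recompute centers (an empty cluster keeps its old center).
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)

        # Step 3 (the added step): recompute variable weights
        # from the per-variable within-cluster dispersion.
        D = ((X - centers[labels]) ** 2).sum(axis=0)
        weights = update_weights(D, beta)

    return labels, centers, weights
```

Under this assumed rule the weights of variables with non-zero dispersion sum to 1, so they can be read as relative importances. Like plain k-means, the sketch is sensitive to initialization and to variable scaling, so results on unnormalized data should be interpreted with care.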