Geometric double-entity model for recognizing far-near relations of clusters
When solving many practical problems, we not only need sample labels given by a clustering algorithm, but also rely on the recognition of far-near relations of clusters. Under the difficult condition of many clusters in a high-dimensional data set, the clustering visualization methods based on dimen...
Gespeichert in:
| Veröffentlicht in: | Science China. Information sciences Jg. 54; H. 10; S. 2040 - 2050 |
|---|---|
| Hauptverfasser: | , , |
| Format: | Journal Article |
| Sprache: | Englisch |
| Veröffentlicht: |
Heidelberg
SP Science China Press
01.10.2011
Springer Nature B.V |
| Schlagworte: | |
| ISSN: | 1674-733X, 1869-1919 |
| Online-Zugang: | Volltext |
| Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
| Zusammenfassung: | When solving many practical problems, we not only need sample labels given by a clustering algorithm, but also rely on the recognition of far-near relations of clusters. Under the difficult condition of many clusters in a high-dimensional data set, the clustering visualization methods based on dimension reductions usually produce the phenomena, e.g., some clusters are overlapping, interlacing, or pushed away; as a result, the far-near relations of some clusters are displayed wrongly or cannot be distinguished. The existing inter-cluster distance methods cannot determine whether two clusters are far away or near. The geometric double-entity model method (GDEM) is proposed to describe far-near relations of clusters, and the methods such as the relative border distance, absolute border distance and region dense degree are designed to measure far-near degrees between clusters. GDEM pays attention to both the absolute distance between nearest sample sets and the dense degrees of border regions of two clusters, and it is able to uncover accurately far-near relations of clusters in a high-dimensional space, especially under the difficult condition mentioned above. The experimental results on four real data sets show that the proposed method can effectively recognize far-near relations of clusters, while the conventional methods cannot. |
|---|---|
| Bibliographie: | 11-5847/TP geometric double-entity model, far-near relations of clusters, distance between clusters, partitionalclustering algorithms When solving many practical problems, we not only need sample labels given by a clustering algorithm, but also rely on the recognition of far-near relations of clusters. Under the difficult condition of many clusters in a high-dimensional data set, the clustering visualization methods based on dimension reductions usually produce the phenomena, e.g., some clusters are overlapping, interlacing, or pushed away; as a result, the far-near relations of some clusters are displayed wrongly or cannot be distinguished. The existing inter-cluster distance methods cannot determine whether two clusters are far away or near. The geometric double-entity model method (GDEM) is proposed to describe far-near relations of clusters, and the methods such as the relative border distance, absolute border distance and region dense degree are designed to measure far-near degrees between clusters. GDEM pays attention to both the absolute distance between nearest sample sets and the dense degrees of border regions of two clusters, and it is able to uncover accurately far-near relations of clusters in a high-dimensional space, especially under the difficult condition mentioned above. The experimental results on four real data sets show that the proposed method can effectively recognize far-near relations of clusters, while the conventional methods cannot. ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 |
| ISSN: | 1674-733X 1869-1919 |
| DOI: | 10.1007/s11432-011-4386-5 |