Improved K‐means algorithm for clustering non‐spherical data
| Published in: | Expert Systems, Volume 39, Issue 9 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Oxford: Blackwell Publishing Ltd, 1 November 2022 |
| Subjects: | |
| ISSN: | 0266-4720, 1468-0394 |
| Summary: | K‐means is one of the most commonly used data mining algorithms. It clusters quickly but is less effective on non‐spherical data. An improved K‐means algorithm (IK‐means) is proposed to enhance clustering efficiency for non‐spherical data. The original dataset is first clustered into a relatively large number of high‐density sub‐clusters, and the final result is obtained by merging connected sub‐clusters. The connectivity among sub‐clusters is evaluated from the sub‐cluster density and the nearest distance class between sub‐clusters. Tests on University of California, Irvine (UCI) datasets and several artificial simulation datasets compare the proposed IK‐means algorithm against DBSCAN and KGFCM and show its clustering capability for data of arbitrary shape. On a dataset of size 72,000, the Adjusted Rand Index (ARI) of IK‐means is 24% higher than that of DBSCAN and 95.2% higher than that of KGFCM. For larger datasets, the IK‐means algorithm is also faster than DBSCAN and KGFCM. |
| Bibliography: | Funding information: Guangdong Edu Science Project Plan (Project No: 2021GXJK513); Lianyungang High‐tech Zone Science and Technology Project Plan (Project No: ZD201915); Lianyungang Technical College Project Plan (Project No: XZD202001); Shenzhen Edu Science Project Plan (Project No: DWZZ19002) |
| DOI: | 10.1111/exsy.13062 |
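
The summary describes a two-stage idea: over-cluster with K‐means into many sub‐clusters, then merge the sub‐clusters that are connected. The following is a minimal sketch of that idea, not the authors' IK‐means implementation. It makes several simplifying assumptions that are not in the record: connectivity is judged only by the smallest point-to-point gap between two sub‐clusters (the density criterion is omitted), and the names `kmeans`, `merge_subclusters`, `adjusted_rand_index`, `ring`, and the parameters `k=12` and `link_dist=0.5` are hypothetical choices for a toy dataset of two concentric rings.

```python
import math
from collections import Counter
from itertools import combinations


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means. The evenly strided initialisation is an
    assumption made only to keep this sketch deterministic."""
    centers = points[:: len(points) // k][:k]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a sub-cluster empties
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def merge_subclusters(points, sub_labels, link_dist):
    """Merge sub-clusters whose closest pair of points is nearer than
    link_dist (a stand-in for the paper's connectivity test)."""
    groups = {}
    for p, lab in zip(points, sub_labels):
        groups.setdefault(lab, []).append(p)
    parent = {lab: lab for lab in groups}

    def find(x):  # union-find root lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(groups, 2):
        gap = min(math.dist(p, q) for p in groups[a] for q in groups[b])
        if gap < link_dist:
            parent[find(a)] = find(b)
    return [find(lab) for lab in sub_labels]


def adjusted_rand_index(truth, pred):
    """Adjusted Rand Index computed from pair counts."""
    def pairs(counts):
        return sum(math.comb(c, 2) for c in counts.values())

    n_pairs = math.comb(len(truth), 2)
    same_both = pairs(Counter(zip(truth, pred)))
    same_truth, same_pred = pairs(Counter(truth)), pairs(Counter(pred))
    expected = same_truth * same_pred / n_pairs
    return (same_both - expected) / (0.5 * (same_truth + same_pred) - expected)


def ring(radius, n):
    """n evenly spaced points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]


# Toy non-spherical data: two concentric rings (hypothetical example).
points = ring(1.0, 100) + ring(4.0, 200)
truth = [0] * 100 + [1] * 200

sub_labels = kmeans(points, k=12)                       # stage 1: over-cluster
final_labels = merge_subclusters(points, sub_labels, link_dist=0.5)  # stage 2
n_clusters = len(set(final_labels))
ari = adjusted_rand_index(truth, final_labels)
print(n_clusters, ari)
```

On this toy input the twelve sub‐clusters should collapse into the two rings, a shape that K‐means with `k=2` cannot separate on its own. The pairwise gap search is quadratic in the number of points, so it is suitable only for illustration. The record gives no detail on how the published method evaluates density or nearest distance.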