A novel Move-Split-Merge based Fuzzy C-Means algorithm for clustering time series

Bibliographic details
Published in: Evolving Systems, Vol. 15, No. 6, pp. 2273-2295
Main authors: Ba, Wei; Gu, Zongquan
Format: Journal Article
Language: English
Publication details: Berlin/Heidelberg: Springer Berlin Heidelberg, 01.12.2024
Springer Nature B.V
ISSN:1868-6478, 1868-6486
Description
Summary: Clustering algorithms face significant challenges on noisy time series data, including noise interference, temporal distortions, and irregular data patterns. To cope with noisy time series and improve clustering performance, a Move-Split-Merge based Fuzzy C-Means algorithm (MSMFCM) is proposed. First, dynamic wavelet basis functions and the Median Absolute Deviation (MAD) are used to optimize the wavelets, reducing noise and highlighting the actual patterns in the original data. Second, a similarity matrix, constructed using the Move-Split-Merge (MSM) edit distance metric, quantifies the similarity between each pair of time series. Third, to improve clustering efficiency, K-means++ is used to optimize the initial centers of the Fuzzy C-Means algorithm. On twenty datasets, MSMFCM is compared with K-means, K-medoids, Fuzzy C-Means, K-shape, and algorithms incorporating Dynamic Time Warping and the Longest Common Subsequence. Simulation results show that MSMFCM significantly outperforms its closest competitors on the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI), with an average improvement of 26.09% for ARI and 18.86% for NMI. These results indicate that MSMFCM clusters noisy time series more effectively, which extends the range of data to which clustering can be applied.
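For readers unfamiliar with the MSM metric named in the summary, the following is a minimal sketch of the standard Move-Split-Merge dynamic program and a pairwise matrix built from it. It is not the authors' implementation: the function names, the pure-Python layout, and the default split/merge cost `c=1.0` are illustrative assumptions, and the record does not state which cost the paper uses.

```python
def _split_merge_cost(new_point, x, y, c):
    """Cost of inserting or removing new_point between neighbours x and y."""
    # If new_point lies between x and y, only the fixed cost c is charged.
    if min(x, y) <= new_point <= max(x, y):
        return c
    # Otherwise add the distance to the nearer of the two neighbours.
    return c + min(abs(new_point - x), abs(new_point - y))


def msm_distance(a, b, c=1.0):
    """Move-Split-Merge distance between two numeric sequences.

    c is the split/merge cost; 1.0 is an assumed default, not a value
    taken from the paper.
    """
    m, n = len(a), len(b)
    if m == 0 or n == 0:
        raise ValueError("both sequences must be non-empty")

    cost = [[0.0] * n for _ in range(m)]
    cost[0][0] = abs(a[0] - b[0])  # a single move operation

    # First column: extra points of a are merged away.
    for i in range(1, m):
        cost[i][0] = cost[i - 1][0] + _split_merge_cost(a[i], a[i - 1], b[0], c)
    # First row: extra points of b are created by splits.
    for j in range(1, n):
        cost[0][j] = cost[0][j - 1] + _split_merge_cost(b[j], a[0], b[j - 1], c)

    for i in range(1, m):
        for j in range(1, n):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(a[i] - b[j]),  # move
                cost[i - 1][j] + _split_merge_cost(a[i], a[i - 1], b[j], c),
                cost[i][j - 1] + _split_merge_cost(b[j], a[i], b[j - 1], c),
            )
    return cost[m - 1][n - 1]


def msm_distance_matrix(series, c=1.0):
    """Symmetric matrix of pairwise MSM distances (hypothetical helper)."""
    k = len(series)
    dist = [[0.0] * k for _ in range(k)]
    for p in range(k):
        for q in range(p + 1, k):
            dist[p][q] = dist[q][p] = msm_distance(series[p], series[q], c)
    return dist
```

A matrix like this could stand in for the Euclidean distances in a Fuzzy C-Means style update. How the paper converts MSM distances into similarities and membership weights is not described in this record.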
DOI:10.1007/s12530-024-09610-8