Mixed Fuzzy Clustering for Misaligned Time Series

Bibliographic details
Published in: IEEE Transactions on Fuzzy Systems, Vol. 25, No. 6, pp. 1777-1794
Main authors: Salgado, Catia M.; Ferreira, Marta C.; Vieira, Susana M.
Format: Journal Article
Language: English
Published: New York: IEEE, 1 December 2017
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1063-6706, 1941-0034
Description
Summary: Data mining in medical databases often involves the comparison of time series, which represent the evolution of a physiological variable. Temporal misalignment of physiological variables can conceal the discovery of patterns and trends shared between different patients. To address this problem, this paper proposes the mixed fuzzy clustering (MFC) algorithm with the dynamic time-warping (DTW) distance. We developed the MFC algorithm by 1) incorporating the DTW distance into the standard fuzzy c-means to handle misaligned time series; 2) introducing a new dimension into the spatiotemporal clustering algorithm to handle P time-variant features; and 3) incorporating unsupervised learning of cluster-dependent attribute weights. The algorithm is designed to simultaneously cluster time-variant and time-invariant data. We demonstrate the advantages of the proposed algorithm in four synthetic datasets and in two real-world applications in intensive care units. The first application is the classification of patients who will need the administration of vasopressors, and the second is the classification of patients with a high risk of mortality. Time-variant features consist of physiological variables collected with different sampling rates at different points in time. Time-invariant features consist of patients' demographics and score records. The performance is evaluated using cluster validity measures, showing that the proposed algorithm outperforms fuzzy c-means.
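The first ingredient named in the summary, replacing the Euclidean distance in fuzzy c-means with the DTW distance, can be illustrated with a minimal Python sketch. This is not the authors' MFC algorithm: it handles a single time-variant feature, keeps the prototypes fixed, and omits the attribute weights and the time-invariant data. The function names `dtw_distance` and `fcm_memberships`, the absolute-difference local cost, the fuzzifier `m=2.0`, and the `eps` guard are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    Assumption: the local cost is the absolute difference between samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches the previous b sample
                cost[i][j - 1],      # b[j-1] also matches the previous a sample
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def fcm_memberships(series, prototypes, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership update with DTW as the distance.

    u[k][i] = 1 / sum_j (d(k, i) / d(k, j)) ** (2 / (m - 1))
    Distances are floored at eps (an assumption) to avoid division by zero.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for s in series:
        dists = [max(dtw_distance(s, p), eps) for p in prototypes]
        row = [
            1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
            for d_i in dists
        ]
        memberships.append(row)
    return memberships


# Hypothetical toy data: two shifted copies of one peak, and one flat series.
series = [
    [0, 0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0, 0],
    [2, 2, 2, 2, 2, 2],
]
prototypes = [[0, 1, 2, 1, 0], [2, 2, 2, 2]]
memberships = fcm_memberships(series, prototypes)
```

In this toy setup the two shifted peaks differ sample by sample, yet DTW aligns both with the first prototype at zero cost, so they receive almost full membership in the same cluster. This is the misalignment effect the summary describes.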
DOI: 10.1109/TFUZZ.2016.2633375