Mixed Fuzzy Clustering for Misaligned Time Series
| Published in: | IEEE Transactions on Fuzzy Systems, Vol. 25, No. 6, pp. 1777-1794 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE, 1 December 2017. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 1063-6706, 1941-0034 |
| Summary: | Data mining in medical databases often involves the comparison of time series, which represent the evolution of a physiological variable. Temporal misalignment of physiological variables can conceal the discovery of patterns and trends shared between different patients. To address this problem, this paper proposes the mixed fuzzy clustering (MFC) algorithm with the dynamic time-warping (DTW) distance. We developed the MFC algorithm by 1) incorporating the DTW distance into the standard fuzzy c-means to handle misaligned time series; 2) introducing a new dimension into the spatiotemporal clustering algorithm to handle P time-variant features; and 3) incorporating unsupervised learning of cluster-dependent attribute weights. The algorithm is designed to simultaneously cluster time-variant and time-invariant data. We demonstrate the advantages of the proposed algorithm in four synthetic datasets and in two real-world applications in intensive care units. The first application is the classification of patients who will need the administration of vasopressors, and the second is the classification of patients with a high risk of mortality. Time-variant features consist of physiological variables collected with different sampling rates at different points in time. Time-invariant features consist of patients' demographics and score records. The performance is evaluated using cluster validity measures, showing that the proposed algorithm outperforms fuzzy c-means. |
| DOI: | 10.1109/TFUZZ.2016.2633375 |
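
The summary names two standard ingredients: the dynamic time-warping (DTW) distance and the fuzzy c-means membership update. The sketch below illustrates only those two textbook pieces and is not the paper's MFC algorithm: it omits the extra dimension for P time-variant features, the cluster-dependent attribute weights, and the time-invariant data. The function names `dtw_distance` and `fuzzy_memberships`, the squared-difference local cost, and the fuzzifier default `m=2.0` are assumptions made for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences (lengths may differ).

    Assumption: squared difference as the local cost, square root of the
    accumulated cost as the final distance.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))


def fuzzy_memberships(dist, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership update.

    dist: array of shape (n_series, n_clusters) holding the distance of each
    series to each cluster prototype. Each returned row sums to 1.
    """
    d = np.maximum(np.asarray(dist, dtype=float), eps)  # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


# Hypothetical data: the second series is the first one delayed by one sample.
series_a = [0, 1, 2, 1, 0]
series_b = [0, 0, 1, 2, 1, 0]
shift_cost = dtw_distance(series_a, series_b)

# Memberships for one series whose distances to two prototypes are 1 and 3.
u = fuzzy_memberships([[1.0, 3.0]])
```

In this toy case the delayed copy warps onto the original at zero cost, which is the property that makes DTW attractive for misaligned series; the membership update then turns any such distance matrix into soft cluster assignments.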