Mixed Fuzzy Clustering for Misaligned Time Series
| Published in: | IEEE Transactions on Fuzzy Systems, Vol. 25, no. 6, pp. 1777-1794 |
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.12.2017 |
| ISSN: | 1063-6706, 1941-0034 |
| Summary: | Data mining in medical databases often involves comparing time series that represent the evolution of a physiological variable. Temporal misalignment of physiological variables can obscure patterns and trends shared between different patients. To address this problem, this paper proposes the mixed fuzzy clustering (MFC) algorithm with the dynamic time-warping (DTW) distance. We developed the MFC algorithm by 1) incorporating the DTW distance into the standard fuzzy c-means to handle misaligned time series; 2) introducing a new dimension into the spatiotemporal clustering algorithm to handle P time-variant features; and 3) incorporating unsupervised learning of cluster-dependent attribute weights. The algorithm is designed to cluster time-variant and time-invariant data simultaneously. We demonstrate the advantages of the proposed algorithm on four synthetic datasets and in two real-world applications in intensive care units. The first application is the classification of patients who will need vasopressors, and the second is the classification of patients at high risk of mortality. Time-variant features consist of physiological variables collected with different sampling rates at different points in time. Time-invariant features consist of patients' demographics and score records. Performance is evaluated using cluster validity measures, which show that the proposed algorithm outperforms fuzzy c-means. |
| DOI: | 10.1109/TFUZZ.2016.2633375 |
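
The summary's first ingredient, using the DTW distance inside fuzzy c-means, can be illustrated with a minimal sketch. It is not the paper's MFC algorithm: it omits the time-invariant features, the P-feature extension, and the learned attribute weights. The function names `dtw_distance` and `fuzzy_memberships`, the absolute-difference local cost, and the use of the raw DTW value in the textbook fuzzy c-means membership formula are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping between two 1-D sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched an earlier b
                cost[i][j - 1],      # b[j-1] also matched an earlier a
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


def fuzzy_memberships(distances, m=2.0):
    """Textbook fuzzy c-means membership of one sample in each cluster.

    `distances` holds the sample's distance to every cluster prototype and
    `m` > 1 is the fuzzifier. Assumption: DTW values are plugged in directly.
    """
    zeros = [d == 0 for d in distances]
    if any(zeros):
        # The sample coincides with one or more prototypes: split membership
        # equally among them to avoid dividing by zero.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Usage: a series that lags the first prototype by one step.
series = [0, 0, 1, 2, 1, 0]
prototypes = [[0, 1, 2, 1, 0, 0], [2, 2, 2, 2, 2, 2]]
dists = [dtw_distance(series, p) for p in prototypes]
memberships = fuzzy_memberships(dists)
```

Because DTW lets one sample stretch or compress in time before it is compared, the lagged series above stays close to the first prototype even though the two differ point by point.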