Clustering of unevenly sampled gene expression time-series data

Bibliographic details
Published in: Fuzzy Sets and Systems, Volume 152, Issue 1, pp. 49-66
Main authors: Möller-Levet, C.S., Klawonn, F., Cho, K.-H., Yin, H., Wolkenhauer, O.
Format: Journal Article
Language: English
Publication details: Elsevier B.V., 16 May 2005
ISSN: 0165-0114, 1872-6801
Description
Summary: Time course measurements are becoming a common type of experiment in the use of microarrays. The temporal order of the data and the varying length of sampling intervals are important and should be considered in clustering time-series. However, the shortness of gene expression time-series data limits the use of conventional statistical models and techniques for time-series analysis. To address this problem, this paper proposes the fuzzy short time-series (FSTS) clustering algorithm, which clusters profiles based on the similarity of their relative change of expression level and the corresponding temporal information. One of the major advantages of fuzzy clustering is that genes can belong to more than one group, revealing distinctive features of each gene's function and regulation. Several examples are provided to illustrate the performance of the proposed algorithm. In addition, we present the validation of the algorithm by clustering the genes which define the model profiles in Chu et al. (Science, 282 (1998) 699). The fuzzy c-means, k-means, average linkage hierarchical algorithm and random clustering are compared to the proposed FSTS algorithm. The performance is evaluated with a well-established cluster validity measure proving that the FSTS algorithm has a better performance than the compared algorithms in clustering similar rates of change of expression in successive unevenly distributed time points. Moreover, the FSTS algorithm was able to cluster in a biologically meaningful way the genes defining the model profiles.
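The summary describes clustering by "rates of change of expression in successive unevenly distributed time points" with fuzzy (multi-cluster) membership. The following Python sketch illustrates one plausible reading of that idea. It is an assumption made for illustration, not the authors' published implementation: the function names (`slopes`, `sts_distance_sq`, `fuzzy_memberships`), the squared slope-difference distance, and the fuzzifier `m` are all hypothetical choices, the membership rule is the standard fuzzy c-means formula, and the prototype update step of a full clustering loop is omitted.

```python
import numpy as np


def slopes(x, t):
    """Rate of change between successive, possibly unevenly spaced, time points."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    return np.diff(x) / np.diff(t)


def sts_distance_sq(x, v, t):
    """Squared distance between two profiles, measured on their slopes.

    Assumed form: the sum of squared slope differences over all intervals.
    Profiles with the same shape but a different offset get distance 0.
    """
    return float(np.sum((slopes(x, t) - slopes(v, t)) ** 2))


def fuzzy_memberships(X, V, t, m=2.0, eps=1e-12):
    """Standard fuzzy c-means memberships using the slope-based distance.

    X: profiles (n_genes x n_times), V: prototypes (n_clusters x n_times).
    Returns an (n_genes x n_clusters) matrix whose rows sum to 1.
    eps avoids division by zero when a profile coincides with a prototype.
    """
    d2 = np.array([[sts_distance_sq(x, v, t) for v in V] for x in X]) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


# Unevenly spaced sampling times (hypothetical values).
t = [0, 1, 2, 5, 10]
rising = [0, 1, 2, 5, 10]           # slope +1 on every interval
rising_shifted = [3, 4, 5, 8, 13]   # same shape, offset by 3
falling = [10, 9, 8, 5, 0]          # slope -1 on every interval

U = fuzzy_memberships([rising, falling], [rising_shifted, falling], t)
```

Because the distance is computed on slopes over the actual interval lengths, `rising` and `rising_shifted` are treated as identical, while `rising` and `falling` differ on all four intervals. Each row of `U` gives one profile's degree of membership in every cluster, which is how a gene can belong to more than one group.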
DOI: 10.1016/j.fss.2004.10.014