Should We Abandon the t-Test in the Analysis of Gene Expression Microarray Data: A Comparison of Variance Modeling Strategies

High-throughput post-genomic studies are now routinely and promisingly investigated in biological and biomedical research. The main statistical approach to select genes differentially expressed between two groups is to apply a t-test, which is subject of criticism in the literature. Numerous alterna...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:PloS one Ročník 5; číslo 9; s. e12336
Hlavní autoři: Jeanmougin, Marine, de Reynies, Aurelien, Marisa, Laetitia, Paccard, Caroline, Nuel, Gregory, Guedj, Mickael
Médium: Journal Article
Jazyk:angličtina
Vydáno: United States Public Library of Science 03.09.2010
Public Library of Science (PLoS)
Témata:
ISSN:1932-6203, 1932-6203
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:High-throughput post-genomic studies are now routinely and promisingly investigated in biological and biomedical research. The main statistical approach to select genes differentially expressed between two groups is to apply a t-test, which is subject of criticism in the literature. Numerous alternatives have been developed based on different and innovative variance modeling strategies. However, a critical issue is that selecting a different test usually leads to a different gene list. In this context and given the current tendency to apply the t-test, identifying the most efficient approach in practice remains crucial. To provide elements to answer, we conduct a comparison of eight tests representative of variance modeling strategies in gene expression data: Welch's t-test, ANOVA [1], Wilcoxon's test, SAM [2], RVM [3], limma [4], VarMixt [5] and SMVar [6]. Our comparison process relies on four steps (gene list analysis, simulations, spike-in data and re-sampling) to formulate comprehensive and robust conclusions about test performance, in terms of statistical power, false-positive rate, execution time and ease of use. Our results raise concerns about the ability of some methods to control the expected number of false positives at a desirable level. Besides, two tests (limma and VarMixt) show significant improvement compared to the t-test, in particular to deal with small sample sizes. In addition limma presents several practical advantages, so we advocate its application to analyze gene expression data.
Bibliografie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ObjectType-Article-2
ObjectType-Feature-1
Conceived and designed the experiments: MJ GN MG. Performed the experiments: MJ. Analyzed the data: MJ MG. Wrote the paper: MJ MG. Implemented tools in R: MJ, AdR. Significantly contributed to the paper: AdR, LM, CP, GN.
ISSN:1932-6203
1932-6203
DOI:10.1371/journal.pone.0012336