Simultaneous Testing of Grouped Hypotheses: Finding Needles in Multiple Haystacks


Bibliographic details
Published in: Journal of the American Statistical Association, Vol. 104, No. 488, pp. 1467-1481
Main authors: Cai, T. Tony; Sun, Wenguang
Format: Journal Article
Language: English
Published: Alexandria, VA, Taylor & Francis, 1 December 2009
American Statistical Association
Assoc
Taylor & Francis Ltd
ISSN: 0162-1459, 1537-274X
Description
Summary: In large-scale multiple testing problems, data are often collected from heterogeneous sources and hypotheses form into groups that exhibit different characteristics. Conventional approaches, including the pooled and separate analyses, fail to efficiently utilize the external grouping information. We develop a compound decision theoretic framework for testing grouped hypotheses and introduce an oracle procedure that minimizes the false nondiscovery rate subject to a constraint on the false discovery rate. It is shown that both the pooled and separate analyses can be uniformly improved by the oracle procedure. We then propose a data-driven procedure that is shown to be asymptotically optimal. Simulation studies show that our procedures enjoy superior performance and yield the most accurate results in comparison with both the pooled and separate procedures. A real-data example with grouped hypotheses is studied in detail using different methods. Both theoretical and numerical results demonstrate that exploiting external information of the sample can greatly improve the efficiency of a multiple testing procedure. The results also provide insights on how the grouping information is incorporated for optimal simultaneous inference.
DOI: 10.1198/jasa.2009.tm08415