Legacy Data Confound Genomics Studies

Bibliographic Details
Published in: Molecular Biology and Evolution, Vol. 37, No. 1, pp. 2-10
Main Authors: Anderson-Trocmé, Luke; Farouni, Rick; Bourgey, Mathieu; Kamatani, Yoichiro; Higasa, Koichiro; Seo, Jeong-Sun; Kim, Changhoon; Matsuda, Fumihiko; Gravel, Simon
Format: Journal Article
Language: English
Published: United States, Oxford University Press, 1 January 2020
ISSN: 0737-4038, 1537-1719
Description
Summary: Recent reports have identified differences in the mutational spectra across human populations. Although some of these reports have been replicated in other cohorts, most have been reported only in the 1000 Genomes Project (1kGP) data. While investigating an intriguing putative population stratification within the Japanese population, we identified a previously unreported batch effect leading to spurious mutation calls in the 1kGP data and to the apparent population stratification. Because the 1kGP data are used extensively, we find that the batch effects also lead to incorrect imputation by leading imputation servers and a small number of suspicious GWAS associations. Lower quality data from the early phases of the 1kGP thus continue to contaminate modern studies in hidden ways. It may be time to retire or upgrade such legacy sequencing data.
DOI: 10.1093/molbev/msz201