Legacy Data Confound Genomics Studies

Bibliographic details
Published in: Molecular biology and evolution, Volume 37; Issue 1; pp. 2-10
Main authors: Anderson-Trocmé, Luke; Farouni, Rick; Bourgey, Mathieu; Kamatani, Yoichiro; Higasa, Koichiro; Seo, Jeong-Sun; Kim, Changhoon; Matsuda, Fumihiko; Gravel, Simon
Format: Journal Article
Language: English
Published: United States, Oxford University Press, 2020-01-01
ISSN: 0737-4038, 1537-1719
Abstract: Recent reports have identified differences in the mutational spectra across human populations. Although some of these reports have been replicated in other cohorts, most have been reported only in the 1000 Genomes Project (1kGP) data. While investigating an intriguing putative population stratification within the Japanese population, we identified a previously unreported batch effect leading to spurious mutation calls in the 1kGP data and to the apparent population stratification. Because the 1kGP data are used extensively, we find that the batch effects also lead to incorrect imputation by leading imputation servers and a small number of suspicious GWAS associations. Lower quality data from the early phases of the 1kGP thus continue to contaminate modern studies in hidden ways. It may be time to retire or upgrade such legacy sequencing data.
Author_xml – sequence: 1
  givenname: Luke
  surname: Anderson-Trocmé
  fullname: Anderson-Trocmé, Luke
  organization: Department of Human Genetics, McGill University, Montreal, QC, Canada
– sequence: 2
  givenname: Rick
  surname: Farouni
  fullname: Farouni, Rick
  organization: Department of Human Genetics, McGill University, Montreal, QC, Canada
– sequence: 3
  givenname: Mathieu
  surname: Bourgey
  fullname: Bourgey, Mathieu
  organization: Department of Human Genetics, McGill University, Montreal, QC, Canada
– sequence: 4
  givenname: Yoichiro
  surname: Kamatani
  fullname: Kamatani, Yoichiro
  organization: Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan
– sequence: 5
  givenname: Koichiro
  surname: Higasa
  fullname: Higasa, Koichiro
  organization: Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan
– sequence: 6
  givenname: Jeong-Sun
  surname: Seo
  fullname: Seo, Jeong-Sun
  organization: Bioinformatics Institute, Macrogen Inc, Seoul, Republic of Korea
– sequence: 7
  givenname: Changhoon
  surname: Kim
  fullname: Kim, Changhoon
  organization: Bioinformatics Institute, Macrogen Inc, Seoul, Republic of Korea
– sequence: 8
  givenname: Fumihiko
  surname: Matsuda
  fullname: Matsuda, Fumihiko
  organization: Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan
– sequence: 9
  givenname: Simon
  surname: Gravel
  fullname: Gravel, Simon
  email: simon.gravel@mcgill.ca
  organization: Department of Human Genetics, McGill University, Montreal, QC, Canada
BackLink https://www.ncbi.nlm.nih.gov/pubmed/31504792 (view this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Discipline Biology
EISSN 1537-1719
EndPage 10
ExternalDocumentID 31504792
10_1093_molbev_msz201
10.1093/molbev/msz201
Genre Journal Article
Review
GeographicLocations Japan
GeographicLocations_xml – name: Japan
ISICitedReferencesCount 20
ISSN 0737-4038
1537-1719
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords batch effect
population genetics
statistical genetics
imputation
reference cohorts
mutational signature
Language English
License This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
PMID 31504792
PageCount 9
PublicationCentury 2000
PublicationDate 20200101
2020-01-01
2020-Jan-01
PublicationDateYYYYMMDD 2020-01-01
PublicationDate_xml – month: 01
  year: 2020
  text: 20200101
  day: 01
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: Oxford
PublicationTitle Molecular biology and evolution
PublicationTitleAlternate Mol Biol Evol
PublicationYear 2020
Publisher Oxford University Press
Publisher_xml – name: Oxford University Press
StartPage 2
SubjectTerms Consortia
Genetics
Genome-Wide Association Study
Genomes
Genomics
Human Genome Project
Human populations
Humans
Japan
Mutation
Stratification
Title Legacy Data Confound Genomics Studies
URI https://www.ncbi.nlm.nih.gov/pubmed/31504792
https://www.proquest.com/docview/3171298437
https://www.proquest.com/docview/2288722155
Volume 37