An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data

Published in: Genome Research, Volume 25, Issue 6, p. 918
Main authors: Jun, Goo; Wing, Mary Kate; Abecasis, Gonçalo R; Kang, Hyun Min
Format: Journal Article
Language: English
Published: United States, 1 June 2015
ISSN: 1549-5469
Abstract The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires less computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies.
Author Jun, Goo
Kang, Hyun Min
Wing, Mary Kate
Abecasis, Gonçalo R
Author_xml – sequence: 1
  givenname: Goo
  surname: Jun
  fullname: Jun, Goo
  organization: Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA; Center for Statistical Genetics and Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan 48109, USA
– sequence: 2
  givenname: Mary Kate
  surname: Wing
  fullname: Wing, Mary Kate
  organization: Center for Statistical Genetics and Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan 48109, USA
– sequence: 3
  givenname: Gonçalo R
  surname: Abecasis
  fullname: Abecasis, Gonçalo R
  organization: Center for Statistical Genetics and Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan 48109, USA
– sequence: 4
  givenname: Hyun Min
  surname: Kang
  fullname: Kang, Hyun Min
  organization: Center for Statistical Genetics and Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan 48109, USA
BackLink https://www.ncbi.nlm.nih.gov/pubmed/25883319 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright 2015 Jun et al.; Published by Cold Spring Harbor Laboratory Press.
DOI 10.1101/gr.176552.114
Discipline Anatomy & Physiology
Chemistry
Biology
EISSN 1549-5469
ExternalDocumentID 25883319
Genre Journal Article
Technical Report
Research Support, N.I.H., Extramural
GrantInformation_xml – fundername: NHGRI NIH HHS
  grantid: U01 HG006513
– fundername: NHGRI NIH HHS
  grantid: R01 HG007022
ISICitedReferencesCount 301
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
OpenAccessLink https://pubmed.ncbi.nlm.nih.gov/PMC4448687
PMID 25883319
PublicationDate 2015-06-01
PublicationDateYYYYMMDD 2015-06-01
PublicationDate_xml – month: 06
  year: 2015
  text: 2015-06-01
  day: 01
PublicationDecade 2010
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle Genome research
PublicationTitleAlternate Genome Res
PublicationYear 2015
StartPage 918
SubjectTerms Computational Biology
Databases, Genetic
Exome
Genetics, Population - methods
Genome, Human
Haplotypes
High-Throughput Nucleotide Sequencing
Humans
Polymorphism, Single Nucleotide
Sequence Alignment
Sequence Analysis, DNA - methods
Software
URI https://www.ncbi.nlm.nih.gov/pubmed/25883319
https://www.proquest.com/docview/1685753355
Volume 25