GBC: a parallel toolkit based on highly addressable byte-encoding blocks for extremely large-scale genotypes of species

Whole-genome sequencing projects of millions of subjects contain enormous genotypes, entailing a huge memory burden and time for computation. Here, we present GBC, a toolkit for rapidly compressing large-scale genotypes into highly addressable byte-encoding blocks under an optimized parallel framew...

Full description

Bibliographic details
Published in: Genome Biology, Vol. 24, No. 1, p. 76
Main authors: Zhang, Liubin; Yuan, Yangyang; Peng, Wenjie; Tang, Bin; Li, Mulin Jun; Gui, Hongsheng; Wang, Qiang; Li, Miaoxin
Format: Journal Article
Language: English
Published: London: BioMed Central, 17 April 2023
Springer Nature B.V
BMC
ISSN: 1474-760X, 1474-7596
Online access: Full text
Abstract Whole-genome sequencing projects of millions of subjects contain enormous genotypes, entailing a huge memory burden and time for computation. Here, we present GBC, a toolkit for rapidly compressing large-scale genotypes into highly addressable byte-encoding blocks under an optimized parallel framework. We demonstrate that GBC is up to 1000 times faster than state-of-the-art methods to access and manage compressed large-scale genotypes while maintaining a competitive compression ratio. We also show that conventional analysis would be substantially sped up if built on GBC to access genotypes of a large population. GBC’s data structure and algorithms are valuable for accelerating large-scale genomic research.
ArticleNumber 76
Author Peng, Wenjie
Li, Mulin Jun
Yuan, Yangyang
Li, Miaoxin
Wang, Qiang
Zhang, Liubin
Gui, Hongsheng
Tang, Bin
Author_xml – sequence: 1
  givenname: Liubin
  surname: Zhang
  fullname: Zhang, Liubin
  organization: Program in Bioinformatics, Zhongshan School of Medicine and The Fifth Affiliated Hospital, Sun Yat-Sen University, Center for Precision Medicine, Sun Yat-Sen University, Center for Disease Genome Research, Sun Yat-Sen University
– sequence: 2
  givenname: Yangyang
  surname: Yuan
  fullname: Yuan, Yangyang
  organization: Program in Bioinformatics, Zhongshan School of Medicine and The Fifth Affiliated Hospital, Sun Yat-Sen University, Center for Precision Medicine, Sun Yat-Sen University, Center for Disease Genome Research, Sun Yat-Sen University, School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University
– sequence: 3
  givenname: Wenjie
  surname: Peng
  fullname: Peng, Wenjie
  organization: Program in Bioinformatics, Zhongshan School of Medicine and The Fifth Affiliated Hospital, Sun Yat-Sen University, Center for Precision Medicine, Sun Yat-Sen University, Center for Disease Genome Research, Sun Yat-Sen University
– sequence: 4
  givenname: Bin
  surname: Tang
  fullname: Tang, Bin
  organization: Program in Bioinformatics, Zhongshan School of Medicine and The Fifth Affiliated Hospital, Sun Yat-Sen University, Center for Precision Medicine, Sun Yat-Sen University, Center for Disease Genome Research, Sun Yat-Sen University
– sequence: 5
  givenname: Mulin Jun
  surname: Li
  fullname: Li, Mulin Jun
  organization: The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University
– sequence: 6
  givenname: Hongsheng
  surname: Gui
  fullname: Gui, Hongsheng
  organization: Behavioral Health Services, Henry Ford Health, Center for Health Policy & Health Services Research, Henry Ford Health
– sequence: 7
  givenname: Qiang
  surname: Wang
  fullname: Wang, Qiang
  organization: Mental Health Center, West China Hospital, Sichuan University
– sequence: 8
  givenname: Miaoxin
  orcidid: 0000-0002-4733-0109
  surname: Li
  fullname: Li, Miaoxin
  email: limiaoxin@mail.sysu.edu.cn
  organization: Program in Bioinformatics, Zhongshan School of Medicine and The Fifth Affiliated Hospital, Sun Yat-Sen University, Center for Precision Medicine, Sun Yat-Sen University, Center for Disease Genome Research, Sun Yat-Sen University, Key Laboratory of Tropical Disease Control (SYSU), Ministry of Education, Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University
BackLink https://www.ncbi.nlm.nih.gov/pubmed/37069653$$D View this record in MEDLINE/PubMed
CitedBy_id crossref_primary_10_1093_gigascience_giae046
ContentType Journal Article
Copyright The Author(s) 2023
2023. The Author(s).
2023. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1186/s13059-023-02906-z
DatabaseName Springer Nature OA Free Journals
CrossRef
Medline
MEDLINE
MEDLINE (Ovid)
MEDLINE
MEDLINE
PubMed
ProQuest Central (Corporate)
Health & Medical Collection
ProQuest Central (purchase pre-March 2016)
Medical Database (Alumni Edition)
ProQuest SciTech Collection
ProQuest Natural Science Collection
Hospital Premium Collection
Hospital Premium Collection (Alumni Edition)
ProQuest Central (Alumni) (purchase pre-March 2016)
ProQuest Central (Alumni)
ProQuest Central UK/Ireland
ProQuest Central Essentials
Biological Science Collection
ProQuest Central
Natural Science Collection
ProQuest One
ProQuest Central
Proquest Health Research Premium Collection
Health Research Premium Collection (Alumni)
ProQuest Central Student
SciTech Premium Collection
ProQuest Health & Medical Complete (Alumni)
ProQuest Biological Science Collection
ProQuest Health & Medical Collection
Medical Database
Biological Science Database
ProQuest Central Premium
ProQuest One Academic (New)
Publicly Available Content Database
ProQuest Health & Medical Research Collection
ProQuest One Academic Middle East (New)
ProQuest One Health & Nursing
ProQuest One Academic Eastern Edition (DO NOT USE)
ProQuest One Applied & Life Sciences
ProQuest One Academic (retired)
ProQuest One Academic UKI Edition
ProQuest Central China
MEDLINE - Academic
AGRICOLA
AGRICOLA - Academic
PubMed Central (Full Participant titles)
DOAJ Directory of Open Access Journals
DatabaseTitle CrossRef
MEDLINE
Medline Complete
MEDLINE with Full Text
PubMed
MEDLINE (Ovid)
Publicly Available Content Database
ProQuest Central Student
ProQuest One Academic Middle East (New)
ProQuest Central Essentials
ProQuest Health & Medical Complete (Alumni)
ProQuest Central (Alumni Edition)
SciTech Premium Collection
ProQuest One Community College
ProQuest One Health & Nursing
ProQuest Natural Science Collection
ProQuest Central China
ProQuest Central
ProQuest One Applied & Life Sciences
ProQuest Health & Medical Research Collection
Health Research Premium Collection
Health and Medicine Complete (Alumni Edition)
Natural Science Collection
ProQuest Central Korea
Health & Medical Research Collection
Biological Science Collection
ProQuest Central (New)
ProQuest Medical Library (Alumni)
ProQuest Biological Science Collection
ProQuest One Academic Eastern Edition
ProQuest Hospital Collection
Health Research Premium Collection (Alumni)
Biological Science Database
ProQuest SciTech Collection
ProQuest Hospital Collection (Alumni)
ProQuest Health & Medical Complete
ProQuest Medical Library
ProQuest One Academic UKI Edition
ProQuest One Academic
ProQuest One Academic (New)
ProQuest Central (Alumni)
MEDLINE - Academic
AGRICOLA
AGRICOLA - Academic
DatabaseTitleList CrossRef
Publicly Available Content Database
MEDLINE
AGRICOLA
MEDLINE - Academic
Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
– sequence: 2
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 3
  dbid: PIMPY
  name: Publicly Available Content Database
  url: http://search.proquest.com/publiccontent
  sourceTypes: Aggregation Database
DeliveryMethod fulltext_linktorsrc
Discipline Biology
EISSN 1474-760X
EndPage 76
ExternalDocumentID oai_doaj_org_article_d4f6f3be3aab4bcba4e0de1f78462441
PMC10108510
37069653
10_1186_s13059_023_02906_z
Genre Research Support, Non-U.S. Gov't
Journal Article
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 31771401; 31970650
  funderid: http://dx.doi.org/10.13039/501100001809
ID FETCH-LOGICAL-c709t-e7633dd7f2a54a984707a0ca45e9f037b85f1a3f5b6ef25a1a3307a196221b853
IEDL.DBID DOA
ISICitedReferencesCount 2
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000984385600004&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1474-760X
1474-7596
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Parallelization algorithm
Cloud computation
Genotype compression
Highly addressable genotype blocks
Genotype management
Byte-encoding genotypes
Large-scale genotypes
Language English
License 2023. The Author(s).
Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c709t-e7633dd7f2a54a984707a0ca45e9f037b85f1a3f5b6ef25a1a3307a196221b853
ORCID 0000-0002-4733-0109
OpenAccessLink https://doaj.org/article/d4f6f3be3aab4bcba4e0de1f78462441
PMID 37069653
PQID 2803040114
PQPubID 2040232
PageCount 1
ParticipantIDs doaj_primary_oai_doaj_org_article_d4f6f3be3aab4bcba4e0de1f78462441
pubmedcentral_primary_oai_pubmedcentral_nih_gov_10108510
proquest_miscellaneous_3153205679
proquest_miscellaneous_2802885081
proquest_journals_2803040114
pubmed_primary_37069653
crossref_primary_10_1186_s13059_023_02906_z
crossref_citationtrail_10_1186_s13059_023_02906_z
springer_journals_10_1186_s13059_023_02906_z
PublicationCentury 2000
PublicationDate 2023-04-17
PublicationDateYYYYMMDD 2023-04-17
PublicationDate_xml – month: 04
  year: 2023
  text: 2023-04-17
  day: 17
PublicationDecade 2020
PublicationPlace London
PublicationPlace_xml – name: London
– name: England
PublicationTitle Genome Biology
PublicationTitleAbbrev Genome Biol
PublicationTitleAlternate Genome Biol
PublicationYear 2023
Publisher BioMed Central
Springer Nature B.V
BMC
Publisher_xml – name: BioMed Central
– name: Springer Nature B.V
– name: BMC
SSID ssj0019426
ssj0017866
Score 2.4325047
Snippet Whole-genome sequencing projects of millions of subjects contain enormous genotypes, entailing a huge memory burden and time for computation. Here, we present...
SourceID doaj
pubmedcentral
proquest
pubmed
crossref
springer
SourceType Open Website
Open Access Repository
Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 76
SubjectTerms Algorithms
Animal Genetics and Genomics
Bioinformatics
Biomedical and Life Sciences
Byte-encoding genotypes
Chromosomes
Compression
Data Compression - methods
Datasets
Design
Evolutionary Biology
genome
Genomes
Genomics
Genomics - methods
Genotype
Genotype & phenotype
Genotype compression
Genotype management
Genotypes
Highly addressable genotype blocks
Human Genetics
Humans
Large-scale genotypes
Life Sciences
Localization
memory
Method
Microbial Genetics and Genomics
Parallelization algorithm
Plant Genetics and Genomics
Software
species
Whole genome sequencing
Title GBC: a parallel toolkit based on highly addressable byte-encoding blocks for extremely large-scale genotypes of species
URI https://link.springer.com/article/10.1186/s13059-023-02906-z
https://www.ncbi.nlm.nih.gov/pubmed/37069653
https://www.proquest.com/docview/2803040114
https://www.proquest.com/docview/2802885081
https://www.proquest.com/docview/3153205679
https://pubmed.ncbi.nlm.nih.gov/PMC10108510
https://doaj.org/article/d4f6f3be3aab4bcba4e0de1f78462441
Volume 24
WOSCitedRecordID wos000984385600004&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVADU
  databaseName: BioMedCentral
  customDbUrl:
  eissn: 1474-760X
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0017866
  issn: 1474-760X
  databaseCode: RBZ
  dateStart: 20000101
  isFulltext: true
  titleUrlDefault: https://www.biomedcentral.com/search/
  providerName: BioMedCentral
– providerCode: PRVAON
  databaseName: DOAJ Directory of Open Access Journals
  customDbUrl:
  eissn: 1474-760X
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0019426
  issn: 1474-760X
  databaseCode: DOA
  dateStart: 20000101
  isFulltext: true
  titleUrlDefault: https://www.doaj.org/
  providerName: Directory of Open Access Journals
– providerCode: PRVPQU
  databaseName: Biological Science Database
  customDbUrl:
  eissn: 1474-760X
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0019426
  issn: 1474-760X
  databaseCode: M7P
  dateStart: 20150101
  isFulltext: true
  titleUrlDefault: http://search.proquest.com/biologicalscijournals
  providerName: ProQuest
– providerCode: PRVPQU
  databaseName: Health & Medical Collection
  customDbUrl:
  eissn: 1474-760X
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0019426
  issn: 1474-760X
  databaseCode: 7X7
  dateStart: 20150101
  isFulltext: true
  titleUrlDefault: https://search.proquest.com/healthcomplete
  providerName: ProQuest
– providerCode: PRVPQU
  databaseName: ProQuest Central
  customDbUrl:
  eissn: 1474-760X
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0019426
  issn: 1474-760X
  databaseCode: BENPR
  dateStart: 20150101
  isFulltext: true
  titleUrlDefault: https://www.proquest.com/central
  providerName: ProQuest
– providerCode: PRVPQU
  databaseName: Publicly Available Content Database
  customDbUrl:
  eissn: 1474-760X
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0019426
  issn: 1474-760X
  databaseCode: PIMPY
  dateStart: 20150101
  isFulltext: true
  titleUrlDefault: http://search.proquest.com/publiccontent
  providerName: ProQuest
– providerCode: PRVAVX
  databaseName: Springer Journals
  customDbUrl:
  eissn: 1474-760X
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0019426
  issn: 1474-760X
  databaseCode: RSV
  dateStart: 20000201
  isFulltext: true
  titleUrlDefault: https://link.springer.com/search?facet-content-type=%22Journal%22
  providerName: Springer Nature
linkProvider Directory of Open Access Journals
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=GBC%3A+a+parallel+toolkit+based+on+highly+addressable+byte-encoding+blocks+for+extremely+large-scale+genotypes+of+species&rft.jtitle=Genome+biology&rft.au=Liubin+Zhang&rft.au=Yangyang+Yuan&rft.au=Wenjie+Peng&rft.au=Bin+Tang&rft.date=2023-04-17&rft.pub=BMC&rft.eissn=1474-760X&rft.volume=24&rft.issue=1&rft.spage=1&rft.epage=22&rft_id=info:doi/10.1186%2Fs13059-023-02906-z&rft.externalDBID=DOA&rft.externalDocID=oai_doaj_org_article_d4f6f3be3aab4bcba4e0de1f78462441