MSC: a metagenomic sequence classification algorithm

Metagenomics is the study of genetic materials directly sampled from natural habitats. It has the potential to reveal previously hidden diversity of microscopic life largely due to the existence of highly parallel and low-cost next-generation sequencing technology. Conventional approaches align meta...

Bibliographic details
Published in: Bioinformatics (Oxford, England), Vol. 35, No. 17, pp. 2932-2940
Main authors: Saha, Subrata; Johnson, Jethro; Pal, Soumitra; Weinstock, George M; Rajasekaran, Sanguthevar
Format: Journal Article
Language: English
Published: England, Oxford University Press, 1 September 2019
ISSN: 1367-4803, 1367-4811
Abstract
Motivation: Metagenomics is the study of genetic materials directly sampled from natural habitats. It has the potential to reveal previously hidden diversity of microscopic life largely due to the existence of highly parallel and low-cost next-generation sequencing technology. Conventional approaches align metagenomic reads onto known reference genomes to identify microbes in the sample. Since such a collection of reference genomes is very large, the approach often needs high-end computing machines with large memory which is not often available to researchers. Alternative approaches follow an alignment-free methodology where the presence of a microbe is predicted using the information about the unique k-mers present in the microbial genomes. However, such approaches suffer from high false positives due to trading off the value of k with the computational resources. In this article, we propose a highly efficient metagenomic sequence classification (MSC) algorithm that is a hybrid of both approaches. Instead of aligning reads to the full genomes, MSC aligns reads onto a set of carefully chosen, shorter and highly discriminating model sequences built from the unique k-mers of each of the reference sequences.
Results: Microbiome researchers are generally interested in two objectives of a taxonomic classifier: (i) to detect prevalence, i.e. the taxa present in a sample, and (ii) to estimate their relative abundances. MSC is primarily designed to detect prevalence and experimental results show that MSC is indeed a more effective and efficient algorithm compared to the other state-of-the-art algorithms in terms of accuracy, memory and runtime. Moreover, MSC outputs an approximate estimate of the abundances.
Availability and implementation: The implementations are freely available for non-commercial purposes. They can be downloaded from https://drive.google.com/open?id=1XirkAamkQ3ltWvI1W1igYQFusp9DHtVl.
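Illustrative note: the abstract describes building short, discriminating model sequences from the k-mers unique to each reference and then matching reads against those models. The Python sketch below is a toy illustration of that idea only and is not the MSC implementation. All function names are hypothetical, the way runs of unique k-mers are merged into "model sequences" is an assumption, and the read-matching step replaces MSC's alignment with simple exact k-mer sharing.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_kmers(references, k):
    """Map each reference name to the k-mers found in no other reference."""
    owners = defaultdict(set)
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    unique = {name: set() for name in references}
    for kmer, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].add(kmer)
    return unique


def model_sequences(seq, unique, k):
    """Merge runs of consecutive unique k-mer positions into longer stretches.

    Assumption for illustration: a "model sequence" is a maximal stretch of
    the reference covered by consecutive unique k-mers.
    """
    models, start, end = [], None, None
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] in unique:
            if start is None:
                start = i
            end = i + k
        elif start is not None:
            models.append(seq[start:end])
            start = None
    if start is not None:
        models.append(seq[start:end])
    return models


def classify(read, models_by_ref, k):
    """Assign a read to the reference whose models share the most k-mers.

    Exact k-mer sharing stands in for read alignment here; returns None
    when the read shares no k-mer with any model.
    """
    read_kmers = kmers(read, k)
    scores = {
        name: sum(len(read_kmers & kmers(m, k)) for m in models)
        for name, models in models_by_ref.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Two made-up reference sequences that share only the k-mer "ACGT".
refs = {"A": "ACGTACGGA", "B": "ACGTTTGCA"}
K = 4
uniq = unique_kmers(refs, K)
models = {name: model_sequences(seq, uniq[name], K) for name, seq in refs.items()}
print(models)
print(classify("GTACGG", models, K))
```

In this toy setting a read made only of shared k-mers (for example "ACGT") is left unclassified, which mirrors the motivation for restricting the models to discriminating sequence.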
Author Rajasekaran, Sanguthevar
Weinstock, George M
Pal, Soumitra
Johnson, Jethro
Saha, Subrata
AuthorAffiliation bty1071-aff1 Healthcare and Life Sciences Division, IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA
bty1071-aff2 The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
bty1071-aff3 National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, USA
bty1071-aff4 Computer Science and Engineering Department, University of Connecticut, Storrs, CT, USA
ContentType Journal Article
Copyright The Author(s) 2019. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
DOI 10.1093/bioinformatics/bty1071
Discipline Biology
EISSN 1367-4811
EndPage 2940
ExternalDocumentID PMC6931357
30649204
10_1093_bioinformatics_bty1071
Genre Research Support, U.S. Gov't, Non-P.H.S
Journal Article
Research Support, N.I.H., Extramural
GrantInformation_xml – fundername: NCI NIH HHS
  grantid: P30 CA034196
– fundername: ; ; ;
– fundername: ; ; ;
  grantid: 1447711; 1743418
ISICitedReferencesCount 5
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 17
Language English
License https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
The Author(s) 2019. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
OpenAccessLink https://academic.oup.com/bioinformatics/article-pdf/35/17/2932/31617943/bty1071.pdf
PMID 30649204
PageCount 9
PublicationCentury 2000
PublicationDate 2019-09-01
PublicationDateYYYYMMDD 2019-09-01
PublicationDate_xml – month: 09
  year: 2019
  text: 2019-09-01
  day: 01
PublicationDecade 2010
PublicationPlace England
PublicationPlace_xml – name: England
PublicationTitle Bioinformatics (Oxford, England)
PublicationTitleAlternate Bioinformatics
PublicationYear 2019
Publisher Oxford University Press
Publisher_xml – name: Oxford University Press
StartPage 2932
SubjectTerms Original Papers
Title MSC: a metagenomic sequence classification algorithm
URI https://www.ncbi.nlm.nih.gov/pubmed/30649204
https://www.proquest.com/docview/2179383825
https://pubmed.ncbi.nlm.nih.gov/PMC6931357
Volume 35