To Petabytes and beyond: recent advances in probabilistic and signal processing algorithms and their application to metagenomics

Bibliographic details
Published in: Nucleic Acids Research, Volume 48, Issue 10, pp. 5217-5234
Main authors: Elworth, R A Leo; Wang, Qi; Kota, Pavan K; Barberan, C J; Coleman, Benjamin; Balaji, Advait; Gupta, Gaurav; Baraniuk, Richard G; Shrivastava, Anshumali; Treangen, Todd J
Format: Journal Article
Language: English
Publication details: England: Oxford University Press, 4 June 2020
ISSN: 0305-1048, 1362-4962
Description
Summary: As computational biologists continue to be inundated by ever-increasing amounts of metagenomic data, the need for data analysis approaches that keep up with the pace of sequence archives has remained a challenge. In recent years, the accelerated pace of genomic data availability has been accompanied by the application of a wide array of highly efficient approaches from other fields to the field of metagenomics. For instance, sketching algorithms such as MinHash have seen a rapid and widespread adoption. These techniques handle increasingly large datasets with minimal sacrifices in quality for tasks such as sequence similarity calculations. Here, we briefly review the fundamentals of the most impactful probabilistic and signal processing algorithms. We also highlight more recent advances to augment previous reviews in these areas that have taken a broader approach. We then explore the application of these techniques to metagenomics, discuss their pros and cons, and speculate on their future directions.
Notes:
These authors share senior authorship.
These authors contributed equally to this work and should be regarded as joint first authors.
DOI: 10.1093/nar/gkaa265