BOA: A partitioned view of genome assembly
| Published in: | iScience Vol. 25; no. 11; p. 105273 |
|---|---|
| Main authors: | , , , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier Inc, 18 November 2022, Elsevier |
| Subject: | |
| ISSN: | 2589-0042, 2589-0042 |
| Summary: | De novo genome assembly is a fundamental problem in computational molecular biology that aims to reconstruct an unknown genome sequence from a set of short DNA sequences (or reads) obtained from the genome. The relative ordering of the reads along the target genome is not known a priori, which is one of the main contributors to the increased complexity of the assembly process. In this article, with the dual objective of improving assembly quality and exposing a high degree of parallelism, we present a partitioning-based approach. Our framework, BOA (bucket-order-assemble), uses bucketing alongside graph- and hypergraph-based partitioning techniques to produce a partial ordering of the reads. This partial ordering enables us to divide the read set into disjoint blocks that can be independently assembled in parallel using any state-of-the-art serial assembler of choice. Experimental results show that BOA improves both the overall assembly quality and performance. Highlights: • A graph/hypergraph partitioning-based method to improve assembly quality and runtime • Bucketing and graph/hypergraph partitioning to partition reads into blocks • Each block is then independently assembled using any standalone assembler • Hypergraph variant produces more precise contigs and is faster than state-of-the-art assemblers. Subject areas: Genomics; Bioinformatics; High-performance computing in bioinformatics; Algorithms. |
|---|---|
| DOI: | 10.1016/j.isci.2022.105273 |
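
The summary describes bucketing reads and then partitioning them into disjoint blocks that can be assembled independently. The following is a minimal Python sketch of that general idea only, not the BOA implementation: it buckets reads by shared k-mers and, as a simplifying assumption, uses connected components (union-find) as a stand-in for the graph and hypergraph partitioners the article refers to. The function names `bucket_reads` and `partition_into_blocks`, the value of `k`, and the toy reads are all hypothetical.

```python
from collections import defaultdict


def kmers(read, k):
    """Return the set of all length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def bucket_reads(reads, k):
    """Bucketing step (illustrative): map each k-mer to the reads containing it."""
    buckets = defaultdict(set)
    for idx, read in enumerate(reads):
        for km in kmers(read, k):
            buckets[km].add(idx)
    return buckets


def partition_into_blocks(reads, k):
    """Group read indices into disjoint blocks.

    Assumption: reads that share at least one k-mer are placed in the same
    block (connected components). A real partitioner would also balance
    block sizes and minimise the connections it cuts.
    """
    parent = list(range(len(reads)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for members in bucket_reads(reads, k).values():
        ordered = sorted(members)
        for other in ordered[1:]:
            root_a, root_b = find(ordered[0]), find(other)
            if root_a != root_b:
                parent[root_b] = root_a

    blocks = defaultdict(list)
    for idx in range(len(reads)):
        blocks[find(idx)].append(idx)
    return sorted(blocks.values())


# Toy reads: the first two overlap each other, as do the last two.
reads = ["ACGTACGA", "TACGATTC", "GGGCCCAA", "CCCAATTT"]
blocks = partition_into_blocks(reads, k=4)
print(blocks)
```

Each resulting block is a list of read indices; because the blocks are disjoint, each one could in principle be handed to a separate run of a standalone assembler.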