Integrating Hi-C links with assembly graphs for chromosome-scale assembly
Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, th...
Saved in:
| Published in: | PLoS computational biology Vol. 15; no. 8; p. e1007273 |
|---|---|
| Main Authors: | , , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: |
United States
Public Library of Science
01.08.2019
Public Library of Science (PLoS) |
| Subjects: | |
| ISSN: | 1553-7358, 1553-734X, 1553-7358 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, there are limited open-source tools available. Errors, particularly inversions and fusions across chromosomes, remain higher than alternate scaffolding technologies. We present a novel open-source Hi-C scaffolder that does not require an a priori estimate of chromosome number and minimizes errors by scaffolding with the assistance of an assembly graph. We demonstrate higher accuracy than the state-of-the-art methods across a variety of Hi-C library preparations and input assembly sizes. The Python and C++ code for our method is openly available at https://github.com/machinegun/SALSA. |
|---|---|
| Bibliography: | new_version ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 Sergey Koren has received travel and accommodation expenses to speak at Oxford Nanopore Technologies conferences. Anthony Schmitt and Siddarth Selvaraj are employees of Arima Genomics, a company commercializing Hi-C DNA sequencing technologies. |
| ISSN: | 1553-7358 1553-734X 1553-7358 |
| DOI: | 10.1371/journal.pcbi.1007273 |