Integrating Hi-C links with assembly graphs for chromosome-scale assembly

Bibliographic Details
Published in: PLoS Computational Biology, Vol. 15, no. 8, p. e1007273
Main Authors: Ghurye, Jay; Rhie, Arang; Walenz, Brian P.; Schmitt, Anthony; Selvaraj, Siddarth; Pop, Mihai; Phillippy, Adam M.; Koren, Sergey
Format: Journal Article
Language: English
Published: United States: Public Library of Science (PLoS), 01.08.2019
ISSN: 1553-7358, 1553-734X
Description
Summary: Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, there are limited open-source tools available. Error rates, particularly for inversions and fusions across chromosomes, remain higher than with alternative scaffolding technologies. We present a novel open-source Hi-C scaffolder that does not require an a priori estimate of chromosome number and minimizes errors by scaffolding with the assistance of an assembly graph. We demonstrate higher accuracy than the state-of-the-art methods across a variety of Hi-C library preparations and input assembly sizes. The Python and C++ code for our method is openly available at https://github.com/machinegun/SALSA.
Sergey Koren has received travel and accommodation expenses to speak at Oxford Nanopore Technologies conferences. Anthony Schmitt and Siddarth Selvaraj are employees of Arima Genomics, a company commercializing Hi-C DNA sequencing technologies.
DOI: 10.1371/journal.pcbi.1007273