Halcyon: an accurate basecaller exploiting an encoder–decoder model with monotonic attention

Abstract Motivation In recent years, nanopore sequencing technology has enabled inexpensive long-read sequencing, which promises reads longer than a few thousand bases. Such long-read sequences contribute to the precise detection of structural variations and accurate haplotype phasing. However, deci...

Full description

Saved in:
Bibliographic Details
Published in:Bioinformatics Vol. 37; no. 9; pp. 1211 - 1217
Main Authors: Konishi, Hiroki, Yamaguchi, Rui, Yamaguchi, Kiyoshi, Furukawa, Yoichi, Imoto, Seiya
Format: Journal Article
Language:English
Published: England Oxford University Press 09.06.2021
Subjects:
ISSN:1367-4803, 1367-4811, 1460-2059, 1367-4811
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Abstract Motivation In recent years, nanopore sequencing technology has enabled inexpensive long-read sequencing, which promises reads longer than a few thousand bases. Such long-read sequences contribute to the precise detection of structural variations and accurate haplotype phasing. However, deciphering precise DNA sequences from noisy and complicated nanopore raw signals remains a crucial demand for downstream analyses based on higher-quality nanopore sequencing, although various basecallers have been introduced to date. Results To address this need, we developed a novel basecaller, Halcyon, that incorporates neural-network techniques frequently used in the field of machine translation. Our model employs monotonic-attention mechanisms to learn semantic correspondences between nucleotides and signal levels without any pre-segmentation against input signals. We evaluated performance with a human whole-genome sequencing dataset and demonstrated that Halcyon outperformed existing third-party basecallers and achieved competitive performance against the latest Oxford Nanopore Technologies’ basecallers. Availabilityand implementation The source code (halcyon) can be found at https://github.com/relastle/halcyon.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:1367-4803
1367-4811
1460-2059
1367-4811
DOI:10.1093/bioinformatics/btaa953