A scalable distributed workflow for accelerating long reads self-correction

Bibliographic Details
Published in: Future Generation Computer Systems, Vol. 177, p. 108244
Main Authors: Ceccaroni, Riccardo; Di Rocco, Lorenzo; Ferraro Petrillo, Umberto; Brutti, Pierpaolo
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.04.2026
ISSN: 0167-739X
Description
Summary: Third-Generation Sequencing (TGS) technologies have transformed genomic research by enabling the extraction of longer nucleotide sequences (referred to as long reads) and providing deeper insights into genome structure. However, long reads are often associated with high sequencing error rates, making their correction a major challenge in many computational genomic pipelines. In this paper, we introduce HyperC, a distributed workflow designed to accelerate the execution of existing long-read self-correction tools through a hybrid parallelization strategy. By combining MPI and OpenMP, our proposal efficiently scatters and executes tasks across a distributed computing system. Optimized input data handling further reduces I/O bottlenecks and maximizes resource utilization. To assess the effectiveness of HyperC, we integrated it with CONSENT, a high-performance correction module, and conducted extensive experiments on real-world sequencing datasets. The results show significant reductions in execution time and improved scalability compared to the standalone execution of CONSENT, establishing HyperC as a robust and practical solution for high-performance genomic analysis and population-scale studies.
DOI: 10.1016/j.future.2025.108244