PVGwfa: a multi-level parallel sequence-to-graph alignment algorithm

Bibliographic Details
Published in: The Journal of Supercomputing, Vol. 81, no. 5, p. 743
Main Authors: Peng, Chenchen, Xia, Zeyu, Tang, Shengbo, Guo, Yifei, Yang, Canqun, Tang, Tao, Cui, Yingbo
Format: Journal Article
Language: English
Published: New York, Springer Nature B.V., 15.04.2025
ISSN: 1573-0484, 0920-8542
Description
Summary: A pangenome graph represents the genomes of multiple individuals, offering a comprehensive reference and overcoming allele bias from linear reference genomes. Sequence-to-graph alignment, crucial for pangenome tasks, aligns sequences to a graph to find the best matches. However, existing algorithms struggle with large-scale sequences. In this paper, we propose PVGwfa, a multi-level parallel sequence-to-graph alignment algorithm. We first employ MPI and Pthread for multi-process and multi-thread parallelization. Next, we introduce a hybrid load balancing strategy for better performance. Additionally, we vectorize the core of PVGwfa using SIMD instructions to accelerate sequence alignment. Experiments on real and simulated datasets show that PVGwfa reduces computation time from nearly an hour to a few minutes. For large datasets, PVGwfa achieved speedups ranging from 1.98× to 100.44× as the number of processes increased from 2 to 128, while maintaining consistent alignment results. The PVGwfa tool and source code are publicly available at https://github.com/nudt-bioinfo/PVGwfa.git.
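Note: The speedup figures in the summary can be read as parallel efficiency (speedup divided by process count). The sketch below is an illustration only and is not taken from the paper. It assumes that 1.98× corresponds to 2 processes and 100.44× to 128 processes, and that speedup is measured against a single-process run. The function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup: float, processes: int) -> float:
    """Return speedup / processes, the fraction of ideal linear scaling achieved."""
    if processes <= 0:
        raise ValueError("processes must be a positive integer")
    return speedup / processes


# Assumption: these (processes -> speedup) pairs are the two endpoints
# quoted in the summary; intermediate process counts are not reported there.
reported = {2: 1.98, 128: 100.44}

efficiencies = {p: parallel_efficiency(s, p) for p, s in reported.items()}

for p, e in efficiencies.items():
    print(f"{p} processes: efficiency {e:.2%}")
```

Under these assumptions the efficiency is about 99% at 2 processes and about 78% at 128 processes, so scaling is near-linear at small process counts and still captures most of the ideal speedup at the largest reported count.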
DOI: 10.1007/s11227-025-07184-z