Maximum-scoring path sets on pangenome graphs of constant treewidth

Detailed bibliography
Published in: Frontiers in Bioinformatics, Volume 4; p. 1391086
Main authors: Brejová, Broňa; Gagie, Travis; Herencsárová, Eva; Vinař, Tomáš
Format: Journal Article
Language: English
Publication details: Switzerland, Frontiers Media S.A, 01.07.2024
ISSN: 2673-7647
Description
Summary: We generalize a problem of finding maximum-scoring segment sets, previously studied by Csűrös (IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2004, 1, 139–150), from sequences to graphs. Namely, given a vertex-weighted graph G and a non-negative startup penalty c, we can find a set of vertex-disjoint paths in G with maximum total score, where each path’s score is the total weight of its vertices minus c. We call this new problem maximum-scoring path sets (MSPS). We present an algorithm that runs in linear time on graphs of constant treewidth. The generalization from sequences to graphs allows the algorithm to be used on pangenome graphs representing several related genomes. It can also be seen as a common abstraction for several biological problems on pangenomes, including searching for CpG islands, ChIP-seq data analysis, analysis of region enrichment for functional elements, and simple chaining problems.
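To make the scoring rule in the summary concrete, the following minimal sketch handles only the simplest special case, in which G is a single path, so that vertex-disjoint paths are just disjoint segments of a sequence. It is not the article's treewidth-based algorithm; the function name `max_scoring_segments` and the two-state dynamic program are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def max_scoring_segments(weights: Sequence[float], c: float) -> float:
    """Best total score of disjoint segments of `weights`.

    Each chosen segment scores (sum of its weights) - c, with c >= 0.
    Illustrative sketch for the sequence (path-graph) case only.
    """
    if c < 0:
        raise ValueError("startup penalty c must be non-negative")

    outside = 0.0        # best score so far, current position not in a segment
    inside = -math.inf   # best score so far, current position ends an open segment

    for w in weights:
        # Either extend the open segment, or pay c to start a new one here.
        new_inside = max(inside + w, outside - c + w)
        # Either stay outside, or close the segment that ended just before.
        new_outside = max(outside, inside)
        inside, outside = new_inside, new_outside

    # Choosing no segment at all is allowed and scores 0.
    return max(outside, inside)


if __name__ == "__main__":
    print(max_scoring_segments([2, -1, 3, -5, 4], c=1))
```

With weights 2, -1, 3, -5, 4 and c = 1, one optimal choice is the segments (2, -1, 3) and (4), each scoring 3, for a total of 6. Raising c makes short or weak segments unprofitable, which is the role the startup penalty plays in the problem definition.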
Bibliography:
Manuel Caceres, Aalto University, Finland
Edited by: Cinzia Pizzi, University of Padua, Italy
Reviewed by: Gianluca Della Vedova, University of Milano-Bicocca, Italy
DOI: 10.3389/fbinf.2024.1391086