Bit-parallel sequence-to-graph alignment
| Published in: | bioRxiv |
|---|---|
| Main authors: | , , |
| Format: | Paper |
| Language: | English |
| Published: | Cold Spring Harbor: Cold Spring Harbor Laboratory Press; Cold Spring Harbor Laboratory, 15 May 2018 |
| Edition: | 1.1 |
| Subjects: | |
| ISSN: | 2692-8205 |
| Online access: | Full text |
| Abstract: | Graphs are commonly used to represent sets of sequences. Either edges or nodes can be labeled by sequences, so that each path in the graph spells a concatenated sequence. Examples include graphs to represent genome assemblies, such as string graphs and de Bruijn graphs, and graphs to represent a pan-genome and hence the genetic variation present in a population. Being able to align sequencing reads to such graphs is a key step for many analyses, and its applications include genome assembly, read error correction, and variant calling with respect to a variation graph. Here, we generalize two linear sequence-to-sequence algorithms to graphs: the Shift-And algorithm for exact matching and Myers' bitvector algorithm for semi-global alignment. These linear algorithms are both based on processing w sequence characters with a constant number of operations, where w is the word size of the machine (commonly 64), and achieve a speedup of w over naive algorithms. Our bitvector-based graph alignment algorithm reaches a worst-case runtime of O(V + (m/w) E log(w)) for acyclic graphs and O(V + mE log(w)) for arbitrary cyclic graphs. We apply it to four different types of graphs and observe a speedup between 3.1-fold and 10.1-fold compared to previous algorithms. |
|---|---|
| Bibliography: | Source type: Working Papers; object type: Working Paper/Pre-Print |
| DOI: | 10.1101/323063 |
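
The abstract names two linear sequence-to-sequence algorithms that the paper generalizes to graphs. As a rough illustration of the bit-parallel idea only, the sketch below implements the classic linear versions in Python: Shift-And for exact matching and Myers' bitvector recurrence for semi-global edit distance. It is not the graph algorithm from the paper. The function names (`shift_and`, `myers_semi_global`, `naive_semi_global`) are hypothetical, and Python's unbounded integers stand in for a machine word, so the pattern is treated as fitting in a single word of m bits rather than being split into blocks of w bits.

```python
def shift_and(pattern, text):
    """Exact matching: return start positions of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    masks = {}  # bit i of masks[c] is set when pattern[i] == c
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)
    accept = 1 << (m - 1)
    state, hits = 0, []
    for j, c in enumerate(text):
        # One shift and one AND advance every active pattern prefix at once.
        state = ((state << 1) | 1) & masks.get(c, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits


def myers_semi_global(pattern, text):
    """Myers' bitvector recurrence (assumes len(pattern) >= 1).

    Returns, for each text position j, the minimum edit distance between
    pattern and any substring of text ending at j.
    """
    m = len(pattern)
    mask = (1 << m) - 1
    high = 1 << (m - 1)
    peq = {}
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)
    pv, mv, score = mask, 0, m  # vertical +1 / -1 deltas, bottom-row score
    scores = []
    for c in text:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask  # horizontal +1 deltas
        mh = pv & xh                   # horizontal -1 deltas
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        ph = (ph << 1) & mask  # no carried-in 1: the top row is all zeros
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        scores.append(score)
    return scores


def naive_semi_global(pattern, text):
    """Cell-by-cell dynamic programming reference for the same scores."""
    m = len(pattern)
    col = list(range(m + 1))
    out = []
    for c in text:
        new = [0]  # free start anywhere in the text
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            new.append(min(col[i - 1] + cost, col[i] + 1, new[i - 1] + 1))
        col = new
        out.append(col[m])
    return out


if __name__ == "__main__":
    print(shift_and("abra", "abracadabra"))
    print(myers_semi_global("GATTACA", "CCGATTTACAGG"))
```

The naive reference updates one cell of the dynamic programming matrix at a time, whereas the bitvector version updates a whole column with a fixed number of word operations per text character; that is the source of the factor-w speedup described in the abstract.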