Non-overlapping indexing in BWT-runs bounded space
| Published in: | Theoretical Computer Science, Vol. 1056; p. 115512 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier B.V., 21.11.2025 |
| Subject: | |
| ISSN: | 0304-3975 |
| Summary: | We revisit the non-overlapping indexing problem for an efficient repetition-aware solution. The problem is to index a text T[1..n] such that, whenever a pattern P[1..p] arrives as a query, we can report the largest set of non-overlapping occurrences of P in T. A previous index by Cohen and Porat [ISAAC 2009] takes linear space and optimal O(p + occno) query time, where occno denotes the output size. We present an index of size O(r), where r denotes the number of runs in the Burrows-Wheeler Transform (BWT) of T. The parameter r is significantly smaller than n for highly repetitive texts. The query time of our index is O(p log log_w σ + sort(occno)), where σ denotes the alphabet size, w denotes the machine word size in bits, and sort(x) denotes the time for sorting x integers within the range [1, n]. We also study the counting version of this problem. |
| DOI: | 10.1016/j.tcs.2025.115512 |
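
The two quantities in the summary, the largest set of non-overlapping occurrences and the BWT run count r, can be illustrated with a brute-force sketch. This is not the index from the article: it scans the text directly and builds the BWT by sorting all rotations, so it only makes sense for small inputs. The function names are hypothetical, positions are 0-based (the summary uses 1-based), and the sentinel `$` is assumed to be absent from the text and smaller than every character in it. The selection step uses the standard left-to-right greedy rule on sorted positions, which yields a maximum non-overlapping set for occurrences of a single pattern.

```python
def find_occurrences(text, pattern):
    """Return all (possibly overlapping) 0-based start positions of pattern."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def max_non_overlapping(positions, p):
    """Greedy scan over sorted positions: keep an occurrence only if it
    starts at or after the end of the previously kept one (length p)."""
    chosen = []
    next_free = 0
    for pos in sorted(positions):
        if pos >= next_free:
            chosen.append(pos)
            next_free = pos + p
    return chosen


def bwt_runs(text, sentinel="$"):
    """Naive BWT via sorted rotations; returns (bwt, number of runs r).
    Assumes the sentinel is absent from text and sorts before all its characters."""
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    bwt = "".join(rot[-1] for rot in rotations)
    runs = sum(1 for i in range(len(bwt)) if i == 0 or bwt[i] != bwt[i - 1])
    return bwt, runs


if __name__ == "__main__":
    text, pattern = "abababa", "aba"
    occ = find_occurrences(text, pattern)
    print(occ, max_non_overlapping(occ, len(pattern)))
    print(bwt_runs("banana"))
```

In this sketch the occurrences of `aba` in `abababa` overlap pairwise at neighbouring positions, so only a subset can be reported; the greedy rule keeps the leftmost occurrence and then the next one that starts after it ends.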