Non-overlapping indexing in BWT-runs bounded space
| Published in: | Theoretical computer science Vol. 1056; p. 115512 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 21.11.2025 |
| Subjects: | |
| ISSN: | 0304-3975 |
| Summary: | We revisit the non-overlapping indexing problem for an efficient repetition-aware solution. The problem is to index a text $T[1..n]$ such that, whenever a pattern $P[1..p]$ comes as a query, we can report the largest set of non-overlapping occurrences of $P$ in $T$. A previous index by Cohen and Porat [ISAAC 2009] takes linear space and optimal $O(p + \mathrm{occ}_{no})$ query time, where $\mathrm{occ}_{no}$ denotes the output size. We present an index of size $O(r)$, where $r$ denotes the number of runs in the Burrows-Wheeler Transform (BWT) of $T$. The parameter $r$ is significantly smaller than $n$ for highly repetitive texts. The query time of our index is $O(p \log\log_w \sigma + \mathrm{sort}(\mathrm{occ}_{no}))$, where $\sigma$ denotes the alphabet size, $w$ denotes the machine word size in bits, and $\mathrm{sort}(x)$ denotes the time for sorting $x$ integers within the range $[1, n]$. We also study the counting version of this problem. |
|---|---|
| DOI: | 10.1016/j.tcs.2025.115512 |
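
To make the two quantities in the summary concrete, the following is a minimal Python sketch, not the index described in the article. It finds occurrences by a naive scan, selects a largest non-overlapping subset with the standard left-to-right greedy rule (which is optimal here because all occurrences have the same length p), and counts the BWT runs r by sorting rotations. The function names, the `$` sentinel, and the 0-based positions are assumptions made for illustration only; the article's index achieves its bounds by entirely different means.

```python
def all_occurrences(text, pattern):
    """Naive scan: sorted 0-based start positions of pattern in text."""
    p = len(pattern)
    if p == 0:
        raise ValueError("pattern must be non-empty")
    return [i for i in range(len(text) - p + 1) if text[i:i + p] == pattern]


def max_non_overlapping(occurrences, p):
    """Greedy selection over sorted start positions.

    Keep an occurrence whenever it starts at or after the end of the
    last kept one. Since every occurrence has length p, this yields a
    largest possible non-overlapping set.
    """
    chosen = []
    next_free = 0  # first position not covered by a chosen occurrence
    for start in occurrences:
        if start >= next_free:
            chosen.append(start)
            next_free = start + p
    return chosen


def bwt_runs(text, sentinel="$"):
    """Return (BWT string, number of runs r) via sorted rotations.

    Quadratic-space illustration only; assumes the sentinel does not
    occur in text and sorts before every text character.
    """
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    bwt = "".join(rot[-1] for rot in rotations)
    # A new run starts wherever two adjacent BWT characters differ.
    runs = 1 + sum(1 for a, b in zip(bwt, bwt[1:]) if a != b)
    return bwt, runs


if __name__ == "__main__":
    occ = all_occurrences("abababa", "aba")
    print(occ)                          # [0, 2, 4]
    print(max_non_overlapping(occ, 3))  # [0, 4]
    print(bwt_runs("banana"))           # ('annb$aa', 5)
```

In the first example, the occurrence at position 2 overlaps the one chosen at position 0, so only two of the three occurrences are reported.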