Constant-delay enumeration for SLP-compressed documents
| Published in: | Logical Methods in Computer Science, Vol. 21, Issue 1 |
|---|---|
| Main Authors: | Muñoz, Martín; Riveros, Cristian |
| Format: | Journal Article |
| Language: | English |
| Published: | Logical Methods in Computer Science e.V, 01.01.2025 |
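| Subjects: | computer science - data structures and algorithms; computer science - formal languages and automata theory; computer science - logic in computer science |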
| ISSN: | 1860-5974 |
| Abstract | We study the problem of enumerating results from a query over a compressed document. The compression model we use is that of straight-line programs (SLPs), which are defined by a context-free grammar that produces a single string. For our queries, we use a model called Annotated Automata, an extension of regular automata that allows annotations on letters. This model extends the notion of Regular Spanners, as it allows arbitrarily long outputs. Our main result is an algorithm that evaluates such a query by enumerating all results with output-linear delay after a preprocessing phase that takes time linear in the size of the SLP and cubic in the size of the automaton. This improves on Schmid and Schweikardt's result, which, with the same preprocessing time, enumerates with a delay that is logarithmic in the size of the uncompressed document. We achieve this through a persistent data structure named Enumerable Compact Sets with Shifts, which guarantees output-linear delay under certain restrictions. These results imply constant-delay enumeration algorithms in the context of regular spanners. Further, we use an extension of annotated automata that uses succinctly encoded annotations to save an exponential factor over previous results on constant-delay enumeration over vset automata. Lastly, we extend our results in the same fashion as Schmid and Schweikardt did, to allow complex document editing while maintaining the constant-delay guarantee. |
|---|---|
| Author | Muñoz, Martín; Riveros, Cristian |
| ContentType | Journal Article |
| DOI | 10.46298/lmcs-21(1:17)2025 |
| Discipline | Computer Science |
| EISSN | 1860-5974 |
| ISICitedReferencesCount | 1 |
| ISSN | 1860-5974 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| OpenAccessLink | https://doaj.org/article/bf4e158b28b74231ac55e48987a250ac |
| PublicationDate | 2025-01-01 |
| PublicationTitle | Logical Methods in Computer Science |
| PublicationYear | 2025 |
| Publisher | Logical Methods in Computer Science e.V |
| SubjectTerms | computer science - data structures and algorithms; computer science - formal languages and automata theory; computer science - logic in computer science |
| Title | Constant-delay enumeration for SLP-compressed documents |
| URI | https://doaj.org/article/bf4e158b28b74231ac55e48987a250ac |
| Volume | 21, Issue 1 |
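
The abstract describes an SLP as a context-free grammar that produces a single string. As a rough illustration of that definition only (not the enumeration algorithm of the paper), the following Python sketch encodes a toy SLP and reads one character of the document without decompressing it. The rule set `RULES` and the helpers `length`, `char_at`, and `expand` are hypothetical names chosen for this example.

```python
from functools import lru_cache

# Hypothetical toy SLP: every nonterminal has exactly one rule, which is
# either a single terminal character or a pair of nonterminals.
RULES = {
    "A": "a",
    "B": "b",
    "X1": ("A", "B"),    # derives "ab"
    "X2": ("X1", "X1"),  # derives "abab"
    "X3": ("X2", "X1"),  # derives "ababab"
    "S": ("X3", "X3"),   # derives "abababababab"
}


@lru_cache(maxsize=None)
def length(symbol):
    """Length of the string derived from symbol, memoized per nonterminal."""
    rule = RULES[symbol]
    if isinstance(rule, str):
        return 1
    left, right = rule
    return length(left) + length(right)


def char_at(symbol, index):
    """Character at position index of the derived string, found by walking
    down the grammar instead of expanding it."""
    if not 0 <= index < length(symbol):
        raise IndexError(index)
    rule = RULES[symbol]
    while not isinstance(rule, str):
        left, right = rule
        left_len = length(left)
        if index < left_len:
            symbol = left
        else:
            symbol = right
            index -= left_len
        rule = RULES[symbol]
    return rule


def expand(symbol):
    """Full decompression, used here only to check the other helpers."""
    rule = RULES[symbol]
    if isinstance(rule, str):
        return rule
    return expand(rule[0]) + expand(rule[1])
```

In this sketch the lengths of all nonterminals are computed once, in time proportional to the number of rules, which is the sense in which work can be done on the compressed form rather than on the uncompressed document.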