Constant-delay enumeration for SLP-compressed documents

Bibliographic details
Published in: Logical Methods in Computer Science, Volume 21, Issue 1
Main authors: Muñoz, Martín; Riveros, Cristian
Format: Journal Article
Language: English
Publication details: Logical Methods in Computer Science e.V 01.01.2025
ISSN: 1860-5974
Abstract We study the problem of enumerating the results of a query over a compressed document. The compression model we use is straight-line programs (SLPs), each of which is defined by a context-free grammar that produces a single string. For our queries, we use a model called Annotated Automata, an extension of regular automata that allows annotations on letters. This model extends the notion of Regular Spanners, as it allows arbitrarily long outputs. Our main result is an algorithm that evaluates such a query by enumerating all results with output-linear delay after a preprocessing phase that takes time linear in the size of the SLP and cubic in the size of the automaton. This improves on Schmid and Schweikardt's result, which, with the same preprocessing time, enumerates with a delay that is logarithmic in the size of the uncompressed document. We achieve this through a persistent data structure named Enumerable Compact Sets with Shifts, which guarantees output-linear delay under certain restrictions. These results imply constant-delay enumeration algorithms in the context of regular spanners. Further, we use an extension of annotated automata that utilizes succinctly encoded annotations to save an exponential factor over previous results on constant-delay enumeration over vset automata. Lastly, we extend our results, in the same fashion as Schmid and Schweikardt, to allow complex document editing while maintaining the constant-delay guarantee.
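As an illustration of the compression model only (not the enumeration algorithm of the article), the following minimal Python sketch shows what an SLP is: a grammar in which every nonterminal has exactly one rule, so the start symbol derives exactly one string. The rule set `RULES`, the symbol names, and the helper functions `expand`, `lengths`, and `char_at` are hypothetical names chosen for this example; the rule shape (a terminal or a pair of nonterminals) is an assumption made for simplicity.

```python
# Hypothetical SLP: every nonterminal has exactly one rule, which is either
# a single terminal letter or a pair of nonterminals (an assumed normal form).
RULES = {
    "A": "a",
    "B": "b",
    "X1": ("A", "B"),    # derives "ab"
    "X2": ("X1", "X1"),  # derives "abab"
    "X3": ("X2", "X1"),  # derives "ababab"
    "S": ("X3", "X3"),   # derives "abababababab"
}


def expand(symbol, rules):
    """Fully decompress the string derived from `symbol` (may be exponentially long)."""
    rule = rules[symbol]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return expand(left, rules) + expand(right, rules)


def lengths(rules):
    """Compute the derived length of every nonterminal, visiting each rule once."""
    memo = {}

    def length(symbol):
        if symbol not in memo:
            rule = rules[symbol]
            if isinstance(rule, str):
                memo[symbol] = 1
            else:
                memo[symbol] = length(rule[0]) + length(rule[1])
        return memo[symbol]

    for symbol in rules:
        length(symbol)
    return memo


def char_at(symbol, i, rules, lens):
    """Return the i-th letter (0-based) of the derived string without decompressing it."""
    rule = rules[symbol]
    while not isinstance(rule, str):
        left, right = rule
        if i < lens[left]:
            symbol = left          # position lies in the left half
        else:
            i -= lens[left]        # skip the left half
            symbol = right
        rule = rules[symbol]
    return rule


lens = lengths(RULES)
print(lens["S"])                        # 12
print(char_at("S", 7, RULES, lens))     # b
```

The point of the sketch is that quantities such as lengths can be computed by one pass over the rules, without ever materializing the document; the preprocessing bound stated in the abstract is likewise expressed in the size of the SLP rather than the size of the uncompressed string.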
Author Muñoz, Martín
Riveros, Cristian
Author_xml – sequence: 1
  givenname: Martín
  surname: Muñoz
  fullname: Muñoz, Martín
– sequence: 2
  givenname: Cristian
  surname: Riveros
  fullname: Riveros, Cristian
ContentType Journal Article
DBID AAYXX
CITATION
DOA
DOI 10.46298/lmcs-21(1:17)2025
DatabaseName CrossRef
DOAJ Directory of Open Access Journals
DatabaseTitle CrossRef
DatabaseTitleList CrossRef

Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1860-5974
ExternalDocumentID oai_doaj_org_article_bf4e158b28b74231ac55e48987a250ac
10_46298_lmcs_21_1_17_2025
IEDL.DBID DOA
ISICitedReferencesCount 1
ISSN 1860-5974
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
LinkModel DirectLink
OpenAccessLink https://doaj.org/article/bf4e158b28b74231ac55e48987a250ac
ParticipantIDs doaj_primary_oai_doaj_org_article_bf4e158b28b74231ac55e48987a250ac
crossref_primary_10_46298_lmcs_21_1_17_2025
PublicationCentury 2000
PublicationDate 2025-01-01
PublicationDateYYYYMMDD 2025-01-01
PublicationDate_xml – month: 01
  year: 2025
  text: 2025-01-01
  day: 01
PublicationDecade 2020
PublicationTitle Logical Methods in Computer Science
PublicationYear 2025
Publisher Logical Methods in Computer Science e.V
Publisher_xml – name: Logical Methods in Computer Science e.V
SourceID doaj
crossref
SourceType Open Website
Index Database
SubjectTerms computer science - data structures and algorithms
computer science - formal languages and automata theory
computer science - logic in computer science
Title Constant-delay enumeration for SLP-compressed documents
URI https://doaj.org/article/bf4e158b28b74231ac55e48987a250ac
Volume 21, Issue 1
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVAON
  databaseName: DOAJ Directory of Open Access Journals
  customDbUrl:
  eissn: 1860-5974
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0000331826
  issn: 1860-5974
  databaseCode: DOA
  dateStart: 20040101
  isFulltext: true
  titleUrlDefault: https://www.doaj.org/
  providerName: Directory of Open Access Journals
– providerCode: PRVHPJ
  databaseName: ROAD: Directory of Open Access Scholarly Resources
  customDbUrl:
  eissn: 1860-5974
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0000331826
  issn: 1860-5974
  databaseCode: M~E
  dateStart: 20040101
  isFulltext: true
  titleUrlDefault: https://road.issn.org
  providerName: ISSN International Centre