On compressing and indexing repetitive sequences

Bibliographic Details
Published in: Theoretical Computer Science, Vol. 483, pp. 115–133
Main Authors: Kreft, Sebastian; Navarro, Gonzalo
Format: Journal Article
Language: English
Published: Elsevier B.V., 29.04.2013
Subjects:
ISSN: 0304-3975, 1879-2294
Abstract We introduce LZ-End, a new member of the Lempel–Ziv family of text compressors, which achieves compression ratios close to those of LZ77 but is much faster at extracting arbitrary text substrings. We then build the first self-index based on LZ77 (or LZ-End) compression, which in addition to text extraction offers fast indexed searches on the compressed text. This self-index is particularly effective for representing highly repetitive sequence collections, which arise for example when storing versioned documents, software repositories, periodic publications, and biological sequence databases.
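As a rough illustration of the LZ-End idea named in the abstract, the sketch below assumes the usual definition from the LZ-End literature, which this record does not spell out: each phrase copies the longest prefix of the remaining text that is also a suffix of the text up to the end of some earlier phrase, and then appends one explicit character. The function names `lz_end_parse` and `lz_end_decode` are hypothetical, and the brute-force search is chosen for readability only; it is not the authors' construction algorithm.

```python
def lz_end_parse(text):
    """Naive LZ-End parse (illustrative sketch, not an efficient construction).

    Returns a list of (source_phrase, copy_length, trailing_char) triples.
    source_phrase is None when nothing is copied.
    """
    phrases = []
    ends = []  # ends[q] = position just past the last character of phrase q
    n = len(text)
    i = 0
    while i < n:
        best_len, best_src = 0, None
        max_len = n - i - 1  # keep one character for the explicit trailing symbol
        for q, end in enumerate(ends):
            # Look for a longer copy that ends exactly where phrase q ends.
            for length in range(min(max_len, end), best_len, -1):
                if text[end - length:end] == text[i:i + length]:
                    best_len, best_src = length, q
                    break
        phrases.append((best_src, best_len, text[i + best_len]))
        i += best_len + 1
        ends.append(i)
    return phrases


def lz_end_decode(phrases):
    """Rebuild the text from the triples produced by lz_end_parse."""
    out = []
    ends = []
    for src, length, ch in phrases:
        if length:
            end = ends[src]
            out.extend(out[end - length:end])  # copy ends at a phrase boundary
        out.append(ch)
        ends.append(len(out))
    return "".join(out)


if __name__ == "__main__":
    sample = "alabar_a_la_alabarda$"
    parsed = lz_end_parse(sample)
    print(parsed)
    print(lz_end_decode(parsed) == sample)
```

Because every copied source ends at a phrase boundary, a decoder can find the last character of any phrase directly and work backwards from there, which is the intuition behind the faster substring extraction that the abstract claims relative to LZ77.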
Author Kreft, Sebastian
Navarro, Gonzalo
Author_xml – sequence: 1
  givenname: Sebastian
  surname: Kreft
  fullname: Kreft, Sebastian
  email: skreft@dcc.uchile.cl
– sequence: 2
  givenname: Gonzalo
  surname: Navarro
  fullname: Navarro, Gonzalo
  email: gnavarro@dcc.uchile.cl
ContentType Journal Article
Copyright 2012 Elsevier B.V.
DOI 10.1016/j.tcs.2012.02.006
Discipline Mathematics
Computer Science
EISSN 1879-2294
EndPage 133
ISICitedReferencesCount 114
ISSN 0304-3975
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords Repetitive texts
Compression
Succinct data structures
Self-indexing
Lempel–Ziv
Language English
License http://www.elsevier.com/open-access/userlicense/1.0
https://www.elsevier.com/tdm/userlicense/1.0
https://www.elsevier.com/open-access/userlicense/1.0
OpenAccessLink https://dx.doi.org/10.1016/j.tcs.2012.02.006
PageCount 19
PublicationCentury 2000
PublicationDate 2013-04-29
PublicationDateYYYYMMDD 2013-04-29
PublicationDate_xml – month: 04
  year: 2013
  text: 2013-04-29
  day: 29
PublicationDecade 2010
PublicationTitle Theoretical computer science
PublicationYear 2013
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
  publication-title: ACM Transactions on Algorithms
  doi: 10.1145/1240233.1240243
– ident: 10.1016/j.tcs.2012.02.006_br000280
  doi: 10.1016/j.tcs.2012.02.015
– ident: 10.1016/j.tcs.2012.02.006_br000185
– ident: 10.1016/j.tcs.2012.02.006_br000265
– volume: 62
  start-page: 54
  issue: 1
  year: 2012
  ident: 10.1016/j.tcs.2012.02.006_br000315
  article-title: Stronger Lempel–Ziv based compressed text indexing
  publication-title: Algorithmica
  doi: 10.1007/s00453-010-9443-8
– ident: 10.1016/j.tcs.2012.02.006_br000220
  doi: 10.1145/1290672.1290680
– ident: 10.1016/j.tcs.2012.02.006_br000140
  doi: 10.1007/978-3-642-16321-0_20
– volume: 33
  start-page: 66
  year: 1986
  ident: 10.1016/j.tcs.2012.02.006_br000310
  article-title: An implicit data structure supporting insertion, deletion, and search in O(log² n) time
  publication-title: Journal of Computer System Sciences
  doi: 10.1016/0022-0000(86)90043-7
– ident: 10.1016/j.tcs.2012.02.006_br000295
  doi: 10.1007/3-540-45061-0_73
– ident: 10.1016/j.tcs.2012.02.006_br000190
– volume: 29
  start-page: 893
  year: 1999
  ident: 10.1016/j.tcs.2012.02.006_br000200
  article-title: Compression of low entropy strings with Lempel–Ziv algorithms
  publication-title: SIAM Journal on Computing
  doi: 10.1137/S0097539797331105
– volume: 48
  start-page: 294
  year: 2003
  ident: 10.1016/j.tcs.2012.02.006_br000170
  article-title: New text indexing functionalities of the compressed suffix arrays
  publication-title: Journal of Algorithms
  doi: 10.1016/S0196-6774(03)00087-7
– ident: 10.1016/j.tcs.2012.02.006_br000225
  doi: 10.1007/3-540-45061-0_29
– ident: 10.1016/j.tcs.2012.02.006_br000145
  doi: 10.1007/978-3-540-89097-3_17
– volume: 99
  start-page: 246
  year: 2006
  ident: 10.1016/j.tcs.2012.02.006_br000060
  article-title: Large alphabets and incompressibility
  publication-title: Information Processing Letters
  doi: 10.1016/j.ipl.2006.04.008
– volume: 48
  start-page: 407
  year: 2001
  ident: 10.1016/j.tcs.2012.02.006_br000045
  article-title: An analysis of the Burrows-Wheeler transform
  publication-title: Journal of the ACM
  doi: 10.1145/382780.382782
– ident: 10.1016/j.tcs.2012.02.006_br000005
  doi: 10.1109/DCC.2010.29
– volume: 88
  start-page: 1722
  year: 2000
  ident: 10.1016/j.tcs.2012.02.006_br000070
  article-title: Off-line dictionary-based compression
  publication-title: Proceedings of the IEEE
  doi: 10.1109/5.892708
– volume: 5
  start-page: 501
  year: 2008
  ident: 10.1016/j.tcs.2012.02.006_br000135
  article-title: A compressed self-index using a Ziv–Lempel dictionary
  publication-title: Information Retrieval
– ident: 10.1016/j.tcs.2012.02.006_br000065
  doi: 10.1109/DCC.1994.305932
– volume: 15
  start-page: 514
  year: 1968
  ident: 10.1016/j.tcs.2012.02.006_br000230
  article-title: PATRICIA—Practical algorithm to retrieve information coded in alphanumeric
  publication-title: Journal of the ACM
  doi: 10.1145/321479.321481
– volume: 23
  start-page: 337
  year: 1977
  ident: 10.1016/j.tcs.2012.02.006_br000080
  article-title: A universal algorithm for sequential data compression
  publication-title: IEEE Transactions on Information Theory
  doi: 10.1109/TIT.1977.1055714
– ident: 10.1016/j.tcs.2012.02.006_br000130
  doi: 10.1007/11780441_29
– volume: 35
  start-page: 378
  year: 2005
  ident: 10.1016/j.tcs.2012.02.006_br000160
  article-title: Compressed suffix arrays and suffix trees with applications to text indexing and string matching
  publication-title: SIAM Journal on Computing
  doi: 10.1137/S0097539702402354
– volume: 22
  start-page: 935
  year: 1993
  ident: 10.1016/j.tcs.2012.02.006_br000040
  article-title: Suffix arrays: a new method for on-line string searches
  publication-title: SIAM Journal on Computing
  doi: 10.1137/0222058
– volume: 1
  start-page: 605
  year: 2008
  ident: 10.1016/j.tcs.2012.02.006_br000290
  article-title: Lempel-Ziv factorization using less time & space
  publication-title: Mathematics in Computer Science
  doi: 10.1007/s11786-007-0024-4
– ident: 10.1016/j.tcs.2012.02.006_br000245
  doi: 10.1109/FOCS.2008.83
– ident: 10.1016/j.tcs.2012.02.006_br000320
  doi: 10.1007/978-3-540-87744-8_58
– ident: 10.1016/j.tcs.2012.02.006_br000155
– ident: 10.1016/j.tcs.2012.02.006_br000025
  doi: 10.1109/SFCS.2000.892127
– volume: 32
  start-page: 490
  year: 1989
  ident: 10.1016/j.tcs.2012.02.006_br000215
  article-title: Data compression with finite windows
  publication-title: Communications of the ACM
  doi: 10.1145/63334.63341
– volume: 387
  start-page: 332
  year: 2007
  ident: 10.1016/j.tcs.2012.02.006_br000240
  article-title: Rank and select revisited and extended
  publication-title: Theoretical Computer Science
  doi: 10.1016/j.tcs.2007.07.013
– ident: 10.1016/j.tcs.2012.02.006_br000115
  doi: 10.1145/1109557.1109693
– ident: 10.1016/j.tcs.2012.02.006_br000090
  doi: 10.1007/978-3-642-03816-7_21
– volume: 13
  year: 2009
  ident: 10.1016/j.tcs.2012.02.006_br000325
  article-title: Implementing the LZ-index: Theory versus practice
  publication-title: ACM Journal of Experimental Algorithmics
  doi: 10.1145/1412228.1412230
– volume: 43
  start-page: 275
  year: 2005
  ident: 10.1016/j.tcs.2012.02.006_br000235
  article-title: Representing trees of higher degree
  publication-title: Algorithmica
  doi: 10.1007/s00453-004-1146-6
– volume: 302
  start-page: 211
  year: 2003
  ident: 10.1016/j.tcs.2012.02.006_br000175
  article-title: Application of Lempel–Ziv factorization to the approximation of grammar-based compression
  publication-title: Theoretical Computer Science
  doi: 10.1016/S0304-3975(02)00777-6
– year: 1986
  ident: 10.1016/j.tcs.2012.02.006_br000210
– ident: 10.1016/j.tcs.2012.02.006_br000260
  doi: 10.1007/978-3-642-12200-2_16
– volume: 17
  start-page: 81
  year: 1983
  ident: 10.1016/j.tcs.2012.02.006_br000300
  article-title: Log–logarithmic worst-case range queries are possible in space Θ(n)
  publication-title: Information Processing Letters
  doi: 10.1016/0020-0190(83)90075-3
– ident: 10.1016/j.tcs.2012.02.006_br000195
  doi: 10.1145/225058.225288
– volume: 52
  start-page: 552
  year: 2005
  ident: 10.1016/j.tcs.2012.02.006_br000125
  article-title: Indexing compressed text
  publication-title: Journal of the ACM
  doi: 10.1145/1082036.1082039
– ident: 10.1016/j.tcs.2012.02.006_br000105
  doi: 10.1109/BIBE.2010.22
– ident: 10.1016/j.tcs.2012.02.006_br000255
  doi: 10.1137/1.9781611972900.9
– start-page: 85
  year: 1985
  ident: 10.1016/j.tcs.2012.02.006_br000035
  article-title: The myriad virtues of subword trees
– ident: 10.1016/j.tcs.2012.02.006_br000330
  doi: 10.1007/978-3-540-74450-4_41
– volume: 17
  start-page: 281
  year: 2010
  ident: 10.1016/j.tcs.2012.02.006_br000150
  article-title: Storage and retrieval of highly repetitive sequence collections
  publication-title: Journal of Computational Biology
  doi: 10.1089/cmb.2009.0169
– volume: 410
  start-page: 5354
  year: 2009
  ident: 10.1016/j.tcs.2012.02.006_br000275
  article-title: Faster entropy-bounded compressed suffix trees
  publication-title: Theoretical Computer Science
  doi: 10.1016/j.tcs.2009.09.012
– volume: 24
  start-page: 530
  year: 1978
  ident: 10.1016/j.tcs.2012.02.006_br000085
  article-title: Compression of individual sequences via variable-rate coding
  publication-title: IEEE Transactions on Information Theory
  doi: 10.1109/TIT.1978.1055934
– volume: 38
  start-page: 66
  year: 1992
  ident: 10.1016/j.tcs.2012.02.006_br000205
  article-title: Upper bounds on the probability of sequences emitted by finite-state sources and on the redundancy of the Lempel–Ziv algorithm
  publication-title: IEEE Transactions on Information Theory
  doi: 10.1109/18.108250
– ident: 10.1016/j.tcs.2012.02.006_br000285
  doi: 10.1137/1.9781611972870.6
– ident: 10.1016/j.tcs.2012.02.006_br000305
– volume: 33
  start-page: 37
  year: 2000
  ident: 10.1016/j.tcs.2012.02.006_br000015
  article-title: Compression: a key for next-generation text retrieval systems
  publication-title: IEEE Computer
  doi: 10.1109/2.881693
– volume: 39
  year: 2007
  ident: 10.1016/j.tcs.2012.02.006_br000020
  article-title: Compressed full-text indexes
  publication-title: ACM Computing Surveys
  doi: 10.1145/1216370.1216372
– volume: 12
  start-page: 40
  year: 2005
  ident: 10.1016/j.tcs.2012.02.006_br000165
  article-title: Succinct suffix arrays based on run-length encoding
  publication-title: Nordic Journal of Computing
– ident: 10.1016/j.tcs.2012.02.006_br000345
  doi: 10.1007/978-3-540-78773-0_32
– ident: 10.1016/j.tcs.2012.02.006_br000180
– ident: 10.1016/j.tcs.2012.02.006_br000050
– ident: 10.1016/j.tcs.2012.02.006_br000250
  doi: 10.1007/978-3-642-03784-9_12
– ident: 10.1016/j.tcs.2012.02.006_br000350
– volume: 38
  start-page: 2162
  year: 2009
  ident: 10.1016/j.tcs.2012.02.006_br000340
  article-title: Breaking a time-and-space barrier in constructing full-text indices
  publication-title: SIAM Journal on Computing
  doi: 10.1137/070685373
– ident: 10.1016/j.tcs.2012.02.006_br000030
  doi: 10.1137/S0097539702402354
– ident: 10.1016/j.tcs.2012.02.006_br000270
  doi: 10.1109/DCC.2010.43
– volume: 22
  start-page: 75
  year: 1976
  ident: 10.1016/j.tcs.2012.02.006_br000075
  article-title: On the complexity of finite sequences
  publication-title: IEEE Transactions on Information Theory
  doi: 10.1109/TIT.1976.1055501
– ident: 10.1016/j.tcs.2012.02.006_br000010
  doi: 10.1007/978-3-642-21458-5_6
– ident: 10.1016/j.tcs.2012.02.006_br000095
  doi: 10.1137/1.9781611973082.30
StartPage 115
SubjectTerms Compression
Lempel–Ziv
Repetitive texts
Self-indexing
Succinct data structures
Title On compressing and indexing repetitive sequences
URI https://dx.doi.org/10.1016/j.tcs.2012.02.006
Volume 483