On the performance of data compression algorithms based upon string matching

Bibliographic details
Published in: IEEE Transactions on Information Theory, vol. 44, no. 1, pp. 47-65
Main authors: En-hui Yang, Kieffer, J.C.
Format: Journal Article
Language: English
Published: New York, IEEE, January 1, 1998
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0018-9448, 1557-9654
Online access: Full text
Abstract Lossless and lossy data compression algorithms based on string matching are considered. In the lossless case, a result of Wyner and Ziv (1989) is extended. In the lossy case, a data compression algorithm based on approximate string matching is analyzed in the following two frameworks: (1) the database and the source together form a Markov chain of finite order; (2) the database and the source are independent with the database coming from a Markov model and the source from a general stationary, ergodic model. In either framework, it is shown that the resulting compression rate converges with probability one to a quantity computable as the infimum of an information theoretic functional over a set of auxiliary random variables; the quantity is strictly greater than the rate distortion function of the source except in some symmetric cases. In particular, this result implies that the lossy algorithm proposed by Steinberg and Gutman (1993) is not optimal, even for memoryless or Markov sources.
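As a rough illustration of what "approximate string matching" against a database means, the sketch below searches a database string for the longest prefix of a source string that matches within a per-symbol Hamming distortion budget. It is not the algorithm analyzed in the paper: the brute-force search, the Hamming distortion measure, the function names, and the pointer-cost formula log2(database length) / match length are all assumptions chosen for illustration.

```python
import math


def hamming_fraction(a, b):
    """Per-symbol Hamming distortion between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def longest_approx_match(source, database, max_distortion):
    """Return (length, position) of the longest prefix of `source` that matches
    a substring of `database` within `max_distortion` per symbol.

    Brute-force search, for illustration only. Returns (0, -1) if no match.
    """
    best_len, best_pos = 0, -1
    for pos in range(len(database)):
        max_len = min(len(source), len(database) - pos)
        # Approximate matching is not monotone in length, so try the longest
        # candidate first and only lengths that would beat the current best.
        for length in range(max_len, best_len, -1):
            candidate = database[pos:pos + length]
            if hamming_fraction(source[:length], candidate) <= max_distortion:
                best_len, best_pos = length, pos
                break
    return best_len, best_pos


def pointer_rate(database_len, match_len):
    """Bits per source symbol spent on a pointer into the database
    (hypothetical cost model: log2 of the database length per match)."""
    return math.log2(database_len) / match_len


if __name__ == "__main__":
    src, db = "abcde", "zzabXdezz"
    print(longest_approx_match(src, db, 0.2))  # lossy: one mismatch in five allowed
    print(longest_approx_match(src, db, 0.0))  # lossless: exact match only
```

Allowing some distortion lengthens the match, which lowers the pointer cost per symbol. The abstract's result is that the rate achieved this way generally stays strictly above the rate distortion function.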
Author En-hui Yang
Kieffer, J.C.
Author_xml – sequence: 1
  surname: En-hui Yang
  fullname: En-hui Yang
  organization: Dept. of Electr. & Comput. Eng., Waterloo Univ., Ont., Canada
– sequence: 2
  givenname: J.C.
  surname: Kieffer
  fullname: Kieffer, J.C.
CODEN IETTAW
ContentType Journal Article
Copyright Copyright Institute of Electrical and Electronics Engineers, Inc. (IEEE) Jan 1998
DOI 10.1109/18.650987
Discipline Engineering
Computer Science
EISSN 1557-9654
EndPage 65
Genre Feature
ISICitedReferencesCount 65
ISSN 0018-9448
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
PageCount 19
PublicationDate 1998-01-01
PublicationPlace New York
PublicationTitle IEEE Transactions on Information Theory
PublicationTitleAbbrev TIT
PublicationYear 1998
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 47
SubjectTerms Algorithm design and analysis
Algorithms
Communication system control
Data compression
Decoding
Encoding
Information processing
Information theory
Random variables
Rate-distortion
Terrorism
Title On the performance of data compression algorithms based upon string matching
URI https://ieeexplore.ieee.org/document/650987
https://www.proquest.com/docview/195921884
https://www.proquest.com/docview/28446221
Volume 44