On the performance of data compression algorithms based upon string matching
Lossless and lossy data compression algorithms based on string matching are considered. In the lossless case, a result of Wyner and Ziv (1989) is extended. In the lossy case, a data compression algorithm based on approximate string matching is analyzed in the following two frameworks: (1) the databa...
| Published in: | IEEE Transactions on Information Theory, Vol. 44, No. 1, pp. 47-65 |
|---|---|
| Main authors: | En-hui Yang, J.C. Kieffer |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 1998-01-01 |
| ISSN: | 0018-9448, 1557-9654 |
| Online access: | Full text |
| Abstract | Lossless and lossy data compression algorithms based on string matching are considered. In the lossless case, a result of Wyner and Ziv (1989) is extended. In the lossy case, a data compression algorithm based on approximate string matching is analyzed in the following two frameworks: (1) the database and the source together form a Markov chain of finite order; (2) the database and the source are independent with the database coming from a Markov model and the source from a general stationary, ergodic model. In either framework, it is shown that the resulting compression rate converges with probability one to a quantity computable as the infimum of an information theoretic functional over a set of auxiliary random variables; the quantity is strictly greater than the rate distortion function of the source except in some symmetric cases. In particular, this result implies that the lossy algorithm proposed by Steinberg and Gutman (1993) is not optimal, even for memoryless or Markov sources. |
| Author | En-hui Yang; Kieffer, J.C. |
| Author_xml | – sequence: 1 surname: En-hui Yang fullname: En-hui Yang organization: Dept. of Electr. & Comput. Eng., Waterloo Univ., Ont., Canada – sequence: 2 givenname: J.C. surname: Kieffer fullname: Kieffer, J.C. |
| CODEN | IETTAW |
| CitedBy_id | crossref_primary_10_1109_18_841161 crossref_primary_10_1016_S0165_1684_02_00302_X crossref_primary_10_1109_TIT_2024_3356477 crossref_primary_10_1016_j_ic_2007_03_001 crossref_primary_10_1109_18_979328 crossref_primary_10_1109_18_720530 crossref_primary_10_1109_TIT_2003_820019 crossref_primary_10_1086_426363 crossref_primary_10_1109_TIT_2014_2373382 crossref_primary_10_1109_18_796370 crossref_primary_10_1109_34_777372 crossref_primary_10_1109_18_817514 crossref_primary_10_1109_TIT_2010_2040867 crossref_primary_10_1109_TIT_2002_800493 crossref_primary_10_1137_S0097539797331105 crossref_primary_10_1109_TIT_2008_2006449 crossref_primary_10_1109_TIT_2011_2158899 crossref_primary_10_1109_18_761253 crossref_primary_10_1109_TIT_2015_2428238 crossref_primary_10_1109_TIT_2004_830793 crossref_primary_10_1134_S0032946012040072 crossref_primary_10_1109_TIP_2008_2002308 crossref_primary_10_1109_TIT_2008_926387 crossref_primary_10_1017_S0027763000008242 crossref_primary_10_1109_18_749005 crossref_primary_10_1214_aop_1015345604 crossref_primary_10_1023_A_1010759616411 crossref_primary_10_1109_TCOMM_2012_061412_110194 crossref_primary_10_1109_TIT_2002_1003841 crossref_primary_10_1109_18_705559 crossref_primary_10_1109_TIT_2008_924668 crossref_primary_10_1109_18_782108 crossref_primary_10_1109_TIT_2011_2178059 crossref_primary_10_1109_TIT_2006_872845 crossref_primary_10_1109_TIT_2003_810637 crossref_primary_10_1109_83_988964 crossref_primary_10_1109_TIT_2011_2136910 crossref_primary_10_1109_18_904515 crossref_primary_10_1109_TIT_2003_809491 crossref_primary_10_1109_18_904532 crossref_primary_10_1109_TIT_2023_3247601 crossref_primary_10_1214_105051604000000576 crossref_primary_10_1214_aoap_1029962749 |
| Cites_doi | 10.1109/18.623143 10.1109/TIT.1987.1057355 10.1007/BF01066715 10.1109/18.179344 10.1109/5.286191 10.1109/18.75241 10.1002/0471200611 10.1137/0512027 10.1109/18.259648 10.1109/18.256495 10.1109/TIT.1976.1055501 10.1109/18.45281 10.1109/TIT.1977.1055714 10.1109/18.490547 10.1109/TIT.1980.1056170 10.1109/18.605570 10.1109/TIT.1975.1055349 10.1109/ISIT.1994.395004 10.1109/18.382018 10.1109/TIT.1978.1055934 10.1109/18.149506 |
| ContentType | Journal Article |
| Copyright | Copyright Institute of Electrical and Electronics Engineers, Inc. (IEEE) Jan 1998 |
| Copyright_xml | – notice: Copyright Institute of Electrical and Electronics Engineers, Inc. (IEEE) Jan 1998 |
| DOI | 10.1109/18.650987 |
| DatabaseName | IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE/IET Electronic Library (IEL) (UW System Shared) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Computer and Information Systems Abstracts Technology Research Database |
| Discipline | Engineering Computer Science |
| EISSN | 1557-9654 |
| EndPage | 65 |
| ExternalDocumentID | 26658794 10_1109_18_650987 650987 |
| Genre | Feature |
| ISICitedReferencesCount | 65 |
| ISSN | 0018-9448 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html |
| PageCount | 19 |
| PublicationCentury | 1900 |
| PublicationDate | 1998-Jan. 1998-01-00 19980101 |
| PublicationDateYYYYMMDD | 1998-01-01 |
| PublicationDate_xml | – month: 01 year: 1998 text: 1998-Jan. |
| PublicationDecade | 1990 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | IEEE transactions on information theory |
| PublicationTitleAbbrev | TIT |
| PublicationYear | 1998 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| References | ref13 ref24 ref12 Cover (ref3) 1991 ref23 ref15 ref14 ref20 ref11 ref10 Csiszár (ref4) 1981 ref21 Billingsley (ref2) 1968 ref17 ref16 ref19 ref18 ref8 Halverson (ref7) 1980; IT-26 ref9 Zhang (ref22) 1996; 42 Berger (ref1) 1971 ref6 ref5 |
| References_xml | – ident: ref11 doi: 10.1109/18.623143 – ident: ref6 doi: 10.1109/TIT.1987.1057355 – ident: ref15 doi: 10.1007/BF01066715 – ident: ref14 doi: 10.1109/18.179344 – ident: ref19 doi: 10.1109/5.286191 – ident: ref9 doi: 10.1109/18.75241 – year: 1991 ident: ref3 publication-title: Elements of Information Theory doi: 10.1002/0471200611 – year: 1981 ident: ref4 publication-title: Information Theory Coding Theorems for Discrete Memoryless Systems – ident: ref8 doi: 10.1137/0512027 – ident: ref17 doi: 10.1109/18.259648 – year: 1968 ident: ref2 publication-title: Convergence of Probability Measures – ident: ref16 doi: 10.1109/18.256495 – ident: ref12 doi: 10.1109/TIT.1976.1055501 – ident: ref18 doi: 10.1109/18.45281 – ident: ref23 doi: 10.1109/TIT.1977.1055714 – volume: 42 start-page: 822 year: 1996 ident: ref22 article-title: an on-line universal lossy data compression algorithm by continuous codebook refinement-part ii: optimality for $\phi$-mixing source models publication-title: IEEE Trans Information Theory doi: 10.1109/18.490547 – volume: it 26 start-page: 189 year: 1980 ident: ref7 article-title: discrete-time detection in $\phi$-mixing noise publication-title: IEEE Trans Information Theory doi: 10.1109/TIT.1980.1056170 – ident: ref21 doi: 10.1109/18.605570 – ident: ref5 doi: 10.1109/TIT.1975.1055349 – ident: ref10 doi: 10.1109/ISIT.1994.395004 – ident: ref20 doi: 10.1109/18.382018 – year: 1971 ident: ref1 publication-title: Rate Distortion Theory – ident: ref24 doi: 10.1109/TIT.1978.1055934 – ident: ref13 doi: 10.1109/18.149506 |
| StartPage | 47 |
| SubjectTerms | Algorithm design and analysis; Algorithms; Communication system control; Data compression; Decoding; Encoding; Information processing; Information theory; Random variables; Rate-distortion; Terrorism |
| Title | On the performance of data compression algorithms based upon string matching |
| URI | https://ieeexplore.ieee.org/document/650987 https://www.proquest.com/docview/195921884 https://www.proquest.com/docview/28446221 |
| Volume | 44 |