On the performance of data compression algorithms based upon string matching

Lossless and lossy data compression algorithms based on string matching are considered. In the lossless case, a result of Wyner and Ziv (1989) is extended. In the lossy case, a data compression algorithm based on approximate string matching is analyzed in the following two frameworks: (1) the databa...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on information theory Vol. 44; no. 1; pp. 47 - 65
Main Authors: En-hui Yang, Kieffer, J.C.
Format: Journal Article
Language:English
Published: New York IEEE 01.01.1998
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN:0018-9448, 1557-9654
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Lossless and lossy data compression algorithms based on string matching are considered. In the lossless case, a result of Wyner and Ziv (1989) is extended. In the lossy case, a data compression algorithm based on approximate string matching is analyzed in the following two frameworks: (1) the database and the source together form a Markov chain of finite order; (2) the database and the source are independent with the database coming from a Markov model and the source from a general stationary, ergodic model. In either framework, it is shown that the resulting compression rate converges with probability one to a quantity computable as the infimum of an information theoretic functional over a set of auxiliary random variables; the quantity is strictly greater than the rate distortion function of the source except in some symmetric cases. In particular, this result implies that the lossy algorithm proposed by Steinberg and Gutman (1993) is not optimal, even for memoryless or Markov sources.
Bibliography:SourceType-Scholarly Journals-1
ObjectType-Feature-1
content type line 14
ObjectType-Article-2
content type line 23
ISSN:0018-9448
1557-9654
DOI:10.1109/18.650987