ItCompress: an iterative semantic compression algorithm
| Published in: | 20th International Conference on Data Engineering (ICDE 2004) pp. 646 - 657 |
|---|---|
| Main Authors: | , , , |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | Los Alamitos, CA: IEEE; IEEE Computer Society, 2004 |
| Subjects: | |
| ISBN: | 9780769520650, 0769520650 |
| ISSN: | 1063-6382 |
| Summary: | Real datasets are often large enough to necessitate data compression. Traditional 'syntactic' data compression methods treat the table as a large byte string and operate at the byte level. The tradeoff in such cases is usually between the ease of retrieval (the ease with which one can retrieve a single tuple or attribute value without decompressing a much larger unit) and the effectiveness of the compression. In this regard, the use of semantic compression has generated considerable interest and motivated several recent works. We propose a semantic compression algorithm called ItCompress (ITerative Compression), which achieves good compression while permitting access even at the attribute level without requiring the decompression of a larger unit. ItCompress iteratively improves the compression ratio of the compressed output during each scan of the table. The amount of compression can be tuned based on the number of iterations. Moreover, the initial iterations provide significant compression, thereby making it a cost-effective compression technique. Extensive experiments were conducted, and the results indicate the superiority of ItCompress with respect to previously known techniques, such as 'SPARTAN' and 'fascicles'. |
| DOI: | 10.1109/ICDE.2004.1320034 |

