MASC: A bitmap index encoding algorithm for fast data retrieval

Bibliographic details
Published in: IEEE International Conference on Communications (2003), pp. 1-6
Main authors: Wen, Yuhao; Wang, Han; Chen, Zhen; Cao, Junwei; Peng, Guodong; Huang, Wen-Liang; Hu, Ziwei; Zhou, Jing; Guo, Jinghong
Format: Conference proceeding, journal article
Language: English
Published: IEEE, 1 May 2016
ISSN:1938-1883
Description
Abstract: Fast retrieval from archived traffic data is essential for network security and forensic analysis. A bitmap index is a data structure that enables fast search over large data collections, but its space consumption is a persistent problem. WAH, PLWAH and COMPAX have been proposed to compress bitmap indexes and reduce storage. This paper proposes a new bitmap index encoding scheme, named MASC, that further improves the compression ratio without impairing query performance. Instead of being limited to a fixed length (31 bits), as in PLWAH and COMPAX, the stride in MASC can be as long as possible, so that runs of consecutive zero bits and nonzero bits are encoded more compactly. Because the piggyback in PLWAH carries only a single nonzero bit, MASC replaces it with a new structure called the carrier. MASC also generalizes the traditional literal word concept of PLWAH and COMPAX. The validity of the MASC encoding scheme is demonstrated by applying it in an Internet traffic archival system. In experiments with a real Internet traffic data set from CAIDA, MASC achieves a better compression ratio than PLWAH and COMPAX2 with no penalty in query performance.
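To make the baseline in the abstract concrete, the following is a minimal Python sketch of a bitmap index with a simplified WAH-style encoding that uses the fixed 31-bit groups mentioned above. It does not implement MASC: the carrier structure and variable stride are not specified in this record. The function names, the word layout (one flag bit, one fill bit, a 30-bit run counter) and the sample port numbers are all illustrative assumptions.

```python
GROUP_BITS = 31  # payload bits per 32-bit word (the fixed length named in the abstract)
MAX_RUN = (1 << 30) - 1  # largest run count that fits in a fill word


def build_bitmap_index(column):
    """Map each distinct value to a bitmap with a 1 at every row holding that value."""
    index = {}
    for row, value in enumerate(column):
        index.setdefault(value, [0] * len(column))[row] = 1
    return index


def wah_encode(bits):
    """Encode a 0/1 list as 32-bit words (simplified, assumed WAH-style layout).

    Literal word: top bit 0, then 31 payload bits.
    Fill word:    top bit 1, next bit is the fill value, low 30 bits count groups.
    """
    padded = list(bits) + [0] * (-len(bits) % GROUP_BITS)  # pad to whole groups
    words = []
    for start in range(0, len(padded), GROUP_BITS):
        group = padded[start:start + GROUP_BITS]
        value = int("".join(map(str, group)), 2)
        if value in (0, (1 << GROUP_BITS) - 1):  # all zeros or all ones
            fill_bit = 1 if value else 0
            same_fill = bool(words) and (words[-1] >> 30) == (0b10 | fill_bit)
            if same_fill and (words[-1] & MAX_RUN) < MAX_RUN:
                words[-1] += 1  # extend the previous run by one group
            else:
                words.append((1 << 31) | (fill_bit << 30) | 1)
        else:
            words.append(value)  # mixed group is stored verbatim as a literal
    return words


def wah_decode(words, n_bits):
    """Expand the words back into the first n_bits of the original bitmap."""
    bits = []
    for word in words:
        if word >> 31:  # fill word
            fill_bit = (word >> 30) & 1
            bits.extend([fill_bit] * ((word & MAX_RUN) * GROUP_BITS))
        else:  # literal word
            bits.extend(int(c) for c in format(word, "031b"))
    return bits[:n_bits]


# Hypothetical column: destination ports of 201 archived flows.
ports = [80] * 100 + [443] + [80] * 100
index = build_bitmap_index(ports)
encoded = wah_encode(index[443])
```

In this sketch the bitmap for port 443 has a single set bit among 201 rows and shrinks to three words: a zero fill, one literal and another zero fill. Spending a whole literal word on one nonzero bit is the case that the piggyback in PLWAH targets and that, according to the abstract, MASC addresses with its carrier.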
DOI:10.1109/ICC.2016.7510827