Data-Aware Adaptive Compression for Stream Processing

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 36, No. 9, pp. 4531-4549
Main Authors: Zhang, Yu; Zhang, Feng; Li, Hourun; Zhang, Shuhao; Guo, Xiaoguang; Chen, Yuxing; Pan, Anqun; Du, Xiaoyong
Format: Journal Article
Language: English
Published: IEEE, 1 September 2024
ISSN: 1041-4347, 1558-2191
Description
Summary: Stream processing has been in widespread use, and one of the most common application scenarios is SQL query on streams. By 2021, the global deployment of IoT endpoints reached 12.3 billion, indicating a surge in data generation. However, the escalating demands for high throughput and low latency in stream processing systems have posed significant challenges due to the increasing data volume and evolving user requirements. We present a compression-based stream processing engine, called CompressStreamDB, which enables adaptive fine-grained stream processing directly on compressed streams, to significantly enhance the performance of existing stream processing solutions. CompressStreamDB utilizes nine diverse compression methods tailored for different stream data types and integrates a cost model to automatically select the most efficient compression schemes. CompressStreamDB provides high throughput with low latency in stream SQL processing by identifying and eliminating redundant data among streams. Our evaluation demonstrates that CompressStreamDB improves average performance by 3.84× and reduces average delay by 68.0% compared to the state-of-the-art stream processing solution for uncompressed streams, along with 68.7% space savings. Besides, our edge trials show an average throughput/price ratio of 9.95× and a throughput/power ratio of 7.32× compared to the cloud design.
DOI: 10.1109/TKDE.2024.3377710