A-DSP: An Adaptive Join Algorithm for Dynamic Data Stream on Cloud System

The join operations, including both equi and non-equi joins, are essential to the complex data analytics in the big data era. However, they are not inherently supported by existing DSPEs ( D istributed S tream P rocessing E ngines). The state-of-the-art join solutions on DSPEs rely on either complic...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:IEEE transactions on knowledge and data engineering Ročník 33; číslo 5; s. 1861 - 1876
Hlavní autoři: Fang, Junhua, Zhang, Rong, Zhao, Yan, Zheng, Kai, Zhou, Xiaofang, Zhou, Aoying
Médium: Journal Article
Jazyk:angličtina
Vydáno: New York IEEE 01.05.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Témata:
ISSN:1041-4347, 1558-2191
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:The join operations, including both equi and non-equi joins, are essential to the complex data analytics in the big data era. However, they are not inherently supported by existing DSPEs ( D istributed S tream P rocessing E ngines). The state-of-the-art join solutions on DSPEs rely on either complicated routing strategies or resource-inefficient processing structures, which are susceptible to dynamic workload, especially when the DSPEs face various join predicate operations and skewed data distribution. In this paper, we propose a new cost-effective stream join framework, named A-DSP ( A daptive D imensional S pace P rocessing), which enhances the adaptability of real-time join model and minimizes the resource used over the dynamic workloads. Our proposal includes: 1) a join model generation algorithm devised to adaptively switch between different join schemes so as to minimize the number of processing task required; 2) a load-balancing mechanism which maximizes the processing throughput; and 3) a lightweight algorithm designed for cutting down unnecessary migration cost. Extensive experiments are conducted to compare our proposal against state-of-the-art solutions on both benchmark and real-world workloads. The experimental results verify the effectiveness of our method, especially on reducing the operational cost under pay-as-you-go pricing scheme.
Bibliografie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1041-4347
1558-2191
DOI:10.1109/TKDE.2019.2947055