A-DSP: An Adaptive Join Algorithm for Dynamic Data Stream on Cloud System

The join operations, including both equi and non-equi joins, are essential to the complex data analytics in the big data era. However, they are not inherently supported by existing DSPEs ( D istributed S tream P rocessing E ngines). The state-of-the-art join solutions on DSPEs rely on either complic...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on knowledge and data engineering Jg. 33; H. 5; S. 1861 - 1876
Hauptverfasser: Fang, Junhua, Zhang, Rong, Zhao, Yan, Zheng, Kai, Zhou, Xiaofang, Zhou, Aoying
Format: Journal Article
Sprache:Englisch
Veröffentlicht: New York IEEE 01.05.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Schlagworte:
ISSN:1041-4347, 1558-2191
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The join operations, including both equi and non-equi joins, are essential to the complex data analytics in the big data era. However, they are not inherently supported by existing DSPEs ( D istributed S tream P rocessing E ngines). The state-of-the-art join solutions on DSPEs rely on either complicated routing strategies or resource-inefficient processing structures, which are susceptible to dynamic workload, especially when the DSPEs face various join predicate operations and skewed data distribution. In this paper, we propose a new cost-effective stream join framework, named A-DSP ( A daptive D imensional S pace P rocessing), which enhances the adaptability of real-time join model and minimizes the resource used over the dynamic workloads. Our proposal includes: 1) a join model generation algorithm devised to adaptively switch between different join schemes so as to minimize the number of processing task required; 2) a load-balancing mechanism which maximizes the processing throughput; and 3) a lightweight algorithm designed for cutting down unnecessary migration cost. Extensive experiments are conducted to compare our proposal against state-of-the-art solutions on both benchmark and real-world workloads. The experimental results verify the effectiveness of our method, especially on reducing the operational cost under pay-as-you-go pricing scheme.
Bibliographie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1041-4347
1558-2191
DOI:10.1109/TKDE.2019.2947055