Pa-Stream: pattern-aware scheduling for distributed stream computing systems.
| Title: | Pa-Stream: pattern-aware scheduling for distributed stream computing systems. |
|---|---|
| Authors: | Sun, Dawei1 (AUTHOR) sundaweicn@cugb.edu.cn, Fan, Yinuo1 (AUTHOR) fanyinuocn@email.cugb.edu.cn, Zhang, Ning1 (AUTHOR) zhangning@email.cugb.edu.cn, Gao, Shang2 (AUTHOR) shang.gao@deakin.edu.au, Yu, Jianguo3 (AUTHOR) yjg@zua.edu.cn, Buyya, Rajkumar4 (AUTHOR) rbuyya@unimelb.edu.au |
| Source: | Cluster Computing. Feb2026, Vol. 29 Issue 1, p1-22. 22p. |
| Abstract: | Adjusting operator allocations based on changes in system metrics is one of the key characteristics of distributed stream computing systems. However, existing work often fails to detect potential pattern features within data streams, relying instead on outdated information for scheduling decisions. This results in delayed responses to data stream fluctuations, causing significant performance volatility. To address these challenges, this paper proposes a pattern-aware scheduling strategy called Pa-Stream. The main contributions of this work include: (1) Validation of performance issues: Through experiments conducted on Alibaba Cloud, we evaluate the performance of Storm’s Resource Aware Scheduler under fluctuating data streams. The results demonstrate that variations in data streams degrade system performance and lead to resource waste. (2) Data stream prediction strategy: We introduce a data stream prediction algorithm based on the Long Short-Term Memory (LSTM) network to identify data stream patterns and predict system performance. (3) Initial scheduling strategy: A novel scheduling strategy based on bin-packing algorithms and multi-objective non-dominated sorting is proposed for the initial scheduling of operators. This approach addresses the limitations of traditional bin-packing algorithms in handling scheduling challenges in heterogeneous clusters. (4) Runtime scheduling strategy: For runtime scheduling, we design a strategy based on the Deep Q-network (DQN). This strategy incorporates DQN training, a scheduling scheme generation algorithm, and an online scheduling algorithm to optimize runtime decision-making. (5) Implementation and evaluation of Pa-Stream: We deploy Pa-Stream and validate its performance through extensive experiments. The results show that, compared to SP-Ant and R-storm, Pa-Stream reduces latency by up to 57.24%, increases throughput by up to 76.18%, and decreases system load by up to 52.91%. [ABSTRACT FROM AUTHOR] |
| Database: | Academic Search Index |
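
The abstract mentions bin-packing as one ingredient of the initial operator scheduling. As a rough illustration of what a bin-packing placement over nodes of unequal capacity can look like, here is a minimal first-fit-decreasing sketch. It is not the Pa-Stream algorithm: the abstract does not give the method's details, the multi-objective non-dominated sorting step is left out, and the function name, the single scalar "demand" per operator, and the operator and node names are all assumptions made for this example.

```python
from typing import Dict, Optional


def first_fit_decreasing(
    operators: Dict[str, float],
    node_capacity: Dict[str, float],
) -> Dict[str, Optional[str]]:
    """Place operators on nodes with a first-fit-decreasing heuristic.

    `operators` maps a hypothetical operator name to its resource demand and
    `node_capacity` maps a hypothetical node name to its total capacity.
    Operators that fit on no node are mapped to None.
    """
    remaining = dict(node_capacity)
    # Heterogeneous cluster: scan nodes from largest to smallest capacity.
    node_order = sorted(node_capacity, key=lambda n: node_capacity[n], reverse=True)
    placement: Dict[str, Optional[str]] = {}

    # Handle the most demanding operators first.
    by_demand = sorted(operators.items(), key=lambda kv: kv[1], reverse=True)
    for name, demand in by_demand:
        placement[name] = None
        for node in node_order:
            if remaining[node] >= demand:
                remaining[node] -= demand
                placement[name] = node
                break
    return placement


if __name__ == "__main__":
    demo_operators = {"spout": 30, "parse": 50, "count": 40, "sink": 20}
    demo_nodes = {"big": 100, "small": 50}
    print(first_fit_decreasing(demo_operators, demo_nodes))
```

A single scalar demand keeps the sketch short; a scheduler for a real cluster would usually track several resources (CPU, memory, network) per node and weigh competing objectives, which is where a non-dominated sorting step would come in.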