A hard real-time scheduler for Spark on YARN

Bibliographic Details
Published in: 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pp. 645-652
Main Authors: Wang, Guolu; Xu, Jungang; Liu, Renfeng; Huang, Shanshan
Format: Conference Paper
Language: English
Published: Piscataway, NJ, USA: IEEE Press, 1 May 2018
IEEE
Series: ACM Conferences
ISBN: 1538658151, 9781538658154
Description
Summary: Apache Spark is a fast and general engine for large-scale data processing using distributed memory. It provides several deploy modes to meet the needs of different users, and Spark on YARN is the most popular of them. Different deploy modes have different scheduling mechanisms. Spark on YARN has three schedulers: the FIFO Scheduler, the Fair Scheduler, and the Capacity Scheduler. However, none of these three schedulers fits hard real-time application scenarios. As Apache Spark is applied more widely, the need for hard real-time scheduling will grow quickly. In this paper, we propose a novel hard real-time scheduling algorithm called DVDA (Deadline and Value Density-Aware) to meet the requirements of hard real-time scheduling. Compared with the traditional EDF (Earliest Deadline First) algorithm, which considers only the deadline, the DVDA algorithm considers both the deadline and the value density of the application. Furthermore, we implement a DVDA Scheduler for Spark on YARN based on the DVDA algorithm. Finally, experiments are conducted to verify the effectiveness of the algorithm. Experimental results show that the proposed algorithm increases the application completion rate by 18% and 6%, and the Value Income by 78% and 32%, compared with the default Capacity Scheduler and the EDF-Capacity scheduler, respectively.
DOI: 10.1109/CCGRID.2018.00096