Decentralized scheduling with data locality for data-parallel computation on peer-to-peer networks

Despite distributed in computation and data storage, current data-parallel computing systems are centralized in task scheduling, which results in hierarchies that create single point of failure, limit scalability, and increase administration costs. In this paper, we propose a fully decentralized sch...

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton) s. 337 - 344
Hlavní autori: Weina Wang, Barnard, Matthew, Lei Ying
Médium: Konferenčný príspevok..
Jazyk:English
Vydavateľské údaje: IEEE 01.09.2015
Predmet:
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:Despite distributed in computation and data storage, current data-parallel computing systems are centralized in task scheduling, which results in hierarchies that create single point of failure, limit scalability, and increase administration costs. In this paper, we propose a fully decentralized scheduling algorithm for data-parallel computing systems on peer-to-peer (P2P) networks. Our scheduling algorithm eliminates the centralized scheduler by letting each node in the network make scheduling decisions. To achieve good performance, data locality, which stresses the efficiency of colocating tasks with their input data, and load-balancing, should be considered jointly, and in a decentralized fashion. By exploring a backpressure-based approach, the proposed task scheduling algorithm strikes the right balance between data locality and load-balancing with each node only knowing the status information of part of the nodes in the network, and proves to maximize the throughput.
DOI:10.1109/ALLERTON.2015.7447024