Decentralized scheduling with data locality for data-parallel computation on peer-to-peer networks
| Published in: | 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 337-344 |
|---|---|
| Main authors: | , , |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 01.09.2015 |

| Summary: | Although computation and data storage are distributed, current data-parallel computing systems centralize task scheduling. The resulting hierarchies create a single point of failure, limit scalability, and increase administration costs. In this paper, we propose a fully decentralized scheduling algorithm for data-parallel computing systems on peer-to-peer (P2P) networks. Our scheduling algorithm eliminates the centralized scheduler by letting each node in the network make scheduling decisions. To achieve good performance, data locality (the efficiency gained by colocating tasks with their input data) and load balancing should be considered jointly and in a decentralized fashion. Using a backpressure-based approach, the proposed task scheduling algorithm balances data locality against load balancing while each node knows the status of only a subset of the nodes in the network, and it is shown to maximize throughput. |
|---|---|
| DOI: | 10.1109/ALLERTON.2015.7447024 |
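
The summary describes each node choosing where a task should run from partial status information, trading data locality against load balancing. The record does not give the algorithm itself, so the following is only a minimal illustrative sketch of a generic backlog-differential rule, not the authors' method. The function name `choose_target`, the `locality_bonus` weight, and the example topology are all hypothetical.

```python
def choose_target(node, block, queue_len, neighbors, stored_blocks, locality_bonus=2):
    """Pick where a task that reads `block` should go, seen from `node`.

    Hypothetical rule (an assumption, not taken from the paper): a candidate
    scores higher when its queue is shorter, plus a fixed bonus when it
    stores the task's input block. Only `node` and its neighbors are examined.
    """
    def score(candidate):
        bonus = locality_bonus if block in stored_blocks[candidate] else 0
        return bonus - queue_len[candidate]

    best = node  # keep the task locally unless a neighbor is strictly better
    for nb in neighbors[node]:
        if score(nb) > score(best):
            best = nb
    return best


# Hypothetical three-node example: "a" sees "b" and "c"; only "c" stores blk1.
queue_len = {"a": 4, "b": 1, "c": 3}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
stored_blocks = {"a": set(), "b": set(), "c": {"blk1"}}

light_locality = choose_target("a", "blk1", queue_len, neighbors, stored_blocks)
heavy_locality = choose_target("a", "blk1", queue_len, neighbors, stored_blocks,
                               locality_bonus=3)
stay_local = choose_target("c", "blk1", queue_len, neighbors, stored_blocks)
```

In this sketch a small `locality_bonus` lets the short queue at "b" win, while a larger one sends the task to "c", where its input is stored. How the actual algorithm weighs the two goals is not stated in this record.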