Scheduling Distributed Clusters of Parallel Machines: Primal-Dual and LP-based Approximation Algorithms

Bibliographic Details
Published in:	Algorithmica, Vol. 80, No. 10, pp. 2777-2798
Main Authors:	Murray, Riley; Khuller, Samir; Chao, Megan
Format:	Journal Article
Language:	English
Published:	New York: Springer US, 1 October 2018 (Springer Nature B.V.)
Subjects:	Algorithm Analysis and Problem Complexity; Algorithms; Approximation; Clusters; Combinatorial analysis; Completion time; Computer Science; Computer Systems Organization and Communication Networks; Data Structures and Information Theory; Datasets; Distributed processing; Employment; Mapping; Mathematics of Computing; Production scheduling; Scheduling; Theory of Computation; Approximation algorithms; Primal-dual algorithms; Machine scheduling; F.2.2 Nonnumerical Algorithms and Problems; LP relaxations; Distributed computing
ISSN:	0178-4617, 1432-0541

Description
Summary:	The Map-Reduce computing framework rose to prominence with datasets of such size that dozens of machines on a single cluster were needed for individual jobs. As datasets approach the exabyte scale, a single job may need distributed processing not only on multiple machines, but on multiple clusters. We consider a scheduling problem to minimize weighted average completion time of n jobs on m distributed clusters of parallel machines. In keeping with the scale of the problems motivating this work, we assume that (1) each job is divided into m “subjobs” and (2) distinct subjobs of a given job may be processed concurrently. When each cluster is a single machine, this is the NP-hard concurrent open shop problem. A clear limitation of such a model is that a serial processing assumption sidesteps the issue of how different tasks of a given subjob might be processed in parallel. Our algorithms explicitly model clusters as pools of resources and effectively overcome this issue. Under a variety of parameter settings, we develop two constant-factor approximation algorithms for this problem. The first algorithm uses an LP relaxation tailored to this problem from prior work. This LP-based algorithm provides strong performance guarantees. Our second algorithm exploits a surprisingly simple mapping to the special case of one machine per cluster. This mapping-based algorithm is combinatorial and extremely fast. These are the first constant-factor approximations for this problem.
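Note:	As a rough illustration of the special case named in the summary (one machine per cluster, i.e. concurrent open shop), the sketch below evaluates the weighted completion time objective for a fixed job order. The function name, the data layout `p[j][i]` (processing time of job j's subjob on machine i), the rule that every machine processes jobs in the same order, and the sample numbers are all assumptions made for illustration. None of them is taken from the article, and the code only scores a given schedule; it does not implement either of the article's approximation algorithms.

```python
def weighted_completion_time(order, p, w):
    """Return sum of w[j] * C[j] when every machine processes jobs in `order`.

    p[j][i] -- processing time of job j's subjob on machine i (assumed layout)
    w[j]    -- weight of job j
    A job completes when the last of its subjobs completes.
    """
    m = len(p[0])
    load = [0] * m  # time elapsed so far on each machine
    total = 0
    for j in order:
        for i in range(m):
            load[i] += p[j][i]
        # Only machines on which job j actually has work determine C[j].
        busy = [load[i] for i in range(m) if p[j][i] > 0]
        completion = max(busy) if busy else 0
        total += w[j] * completion
    return total


# Hypothetical instance: 2 jobs, 2 single-machine clusters.
p = [[3, 1], [1, 2]]
w = [1, 2]
print(weighted_completion_time([0, 1], p, w))  # 11
print(weighted_completion_time([1, 0], p, w))  # 8
```

In this toy instance, scheduling the heavier-weighted job first lowers the objective from 11 to 8, which is the kind of ordering decision the scheduling problem is about.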
DOI:	10.1007/s00453-017-0345-x