Steal Locally, Share Globally: A Strategy for Multiprogramming in the Manycore Era

Bibliographic details
Published in:	International Journal of Parallel Programming, Volume 43, Issue 5, pp. 894-917
Main authors:	Tousimojarad, Ashkan; Vanderbauwhede, Wim
Format:	Journal Article
Language:	English
Published:	New York: Springer US, 1 October 2015; Springer Nature B.V.
Subjects:	Algorithms; Analysis; C plus plus; Central processing units; Communication; Computer programming; Computer Science; CPUs; Dynamics; Equivalence; Keywords; Libraries; Multiprogramming; Parallel processing; Parallel programming; Performance enhancement; Platforms; Processor Architectures; Scheduling; Software Engineering/Programming and Operating Systems; Strategy; Studies; Theory of Computation; Workload; Workloads; Task stealing; Manycore processors; GPRM; Intel Xeon Phi
ISSN:	0885-7458, 1573-7640

Description
Summary:	In a general-purpose computing system, several parallel applications run simultaneously on the same platform. Even if each application is highly tuned for that specific platform, additional performance issues arise in such a dynamic environment, in which multiple applications compete for resources. Different scheduling and resource management techniques have been proposed, at either the operating system or the user level, to improve the performance of concurrent workloads. In this paper, we propose a task-based strategy called “Steal Locally, Share Globally”, implemented in the runtime of our parallel programming model GPRM (Glasgow Parallel Reduction Machine). We have chosen a state-of-the-art manycore parallel machine, the Intel Xeon Phi, to compare GPRM with some well-known parallel programming models (OpenMP, Intel Cilk Plus and Intel TBB) in both single-programming and multiprogramming scenarios. We show that GPRM not only performs well for single workloads, but also outperforms the other models for multiprogramming workloads. There are three considerations regarding our task-based scheme: (i) it is implemented inside the parallel framework, not as a separate layer; (ii) it improves performance without the need to change the number of threads for each application; and (iii) it can be further tuned and improved, not only for GPRM applications, but also for other equivalent parallel programming models.
DOI:	10.1007/s10766-015-0350-0