Locality-Aware Automatic Parallelization for GPGPU with OpenHMPP Directives

Bibliographic details
Published in:	International Journal of Parallel Programming, Vol. 44, No. 3, pp. 620-643
Main authors:	Andión, José M., Arenaz, Manuel, Bodin, François, Rodríguez, Gabriel, Touriño, Juan
Format:	Journal Article
Language:	English
Published:	New York: Springer US, 1 June 2016; Springer Nature B.V.
Subjects:	Analysis; Automation; Case studies; Computation; Computer programming; Computer Science; Computing costs; Demand; Heterogeneity; Optimization techniques; Parallel processing; Performance evaluation; Processor Architectures; Programming languages; Software; Software Engineering/Programming and Operating Systems; Source code; Studies; Theory of Computation; Three dimensional; Transformations; Heterogeneous systems; Automatic parallelization; Domain-independent kernel; Locality; GPGPU; OpenHMPP
ISSN:	0885-7458, 1573-7640

Description
Summary:	The use of GPUs for general-purpose computation has increased dramatically in recent years, driven by rising demand for computing power and by the high computing capacity that GPUs offer at low cost. Hence, new programming models have been developed to integrate these accelerators with high-level programming languages, giving rise to heterogeneous computing systems. Unfortunately, this heterogeneity is also exposed to the programmer, which complicates its exploitation. This paper presents a new technique that automatically rewrites sequential programs into parallel counterparts targeting GPU-based heterogeneous systems. The original source code is analyzed through domain-independent computational kernels, which hide the complexity of the implementation details by presenting a non-statement-based, high-level, hierarchical representation of the application. Next, a locality-aware technique based on standard compiler transformations is applied to the original code through OpenHMPP directives. Two representative case studies from scientific applications have been selected: the three-dimensional discrete convolution and the single-precision general matrix multiplication. The effectiveness of our technique is corroborated by a performance evaluation on NVIDIA GPUs.
DOI:	10.1007/s10766-015-0362-9
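
Example:	The summary names single-precision general matrix multiplication as a case study and says that locality-aware compiler transformations are applied through OpenHMPP directives. The following C code is a minimal sketch of that idea, not the code from the paper. The function names, the square row-major layout, and the choice of unroll-and-jam as the transformation are assumptions made for illustration. The directive lines appear only as comments, and their spelling should be checked against the OpenHMPP specification before use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codelet: C = alpha * A * B + beta * C for row-major n x n
 * matrices. A directive-based compiler would be pointed at this function
 * with something like the commented pragma below (assumed spelling). */
/* #pragma hmpp sgemm codelet, target=CUDA, args[c].io=inout */
void sgemm_naive(size_t n, float alpha, const float *a, const float *b,
                 float beta, float *c)
{
    assert(a && b && c);
    /* #pragma hmppcg gridify(i, j)   (assumed: map i, j onto the GPU grid) */
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < n; k++)
                acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}

/* Same result after a manual unroll-and-jam of the i loop by 2: each
 * element of B that is loaded is reused for two rows of A, which is the
 * kind of locality gain a standard compiler transformation can provide. */
void sgemm_unroll_jam(size_t n, float alpha, const float *a, const float *b,
                      float beta, float *c)
{
    assert(a && b && c);
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        for (size_t j = 0; j < n; j++) {
            float acc0 = 0.0f, acc1 = 0.0f;
            for (size_t k = 0; k < n; k++) {
                float bkj = b[k * n + j];          /* loaded once, used twice */
                acc0 += a[i * n + k] * bkj;
                acc1 += a[(i + 1) * n + k] * bkj;
            }
            c[i * n + j] = alpha * acc0 + beta * c[i * n + j];
            c[(i + 1) * n + j] = alpha * acc1 + beta * c[(i + 1) * n + j];
        }
    }
    for (; i < n; i++) {                           /* leftover row if n is odd */
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < n; k++)
                acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

Both functions compile with any C compiler and produce the same matrix; only the order in which memory is touched differs. The technique described in the summary applies such transformations automatically rather than by hand.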