hiCUDA: High-Level GPGPU Programming

Graphics Processing Units (GPUs) have become a competitive accelerator for applications outside the graphics domain, mainly driven by the improvements in GPU programmability. Although the Compute Unified Device Architecture (CUDA) is a simple C-like interface for programming NVIDIA GPUs, porting app...

Full description

Saved in:

Bibliographic Details
Published in:	IEEE transactions on parallel and distributed systems Vol. 22; no. 1; pp. 78 - 90
Main Authors:	Han, Tianyi David, Abdelrahman, Tarek S
Format:	Journal Article
Language:	English
Published:	New York IEEE 01.01.2011 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Application software Architecture Arrays Compilers Computer architecture Computer graphics Computer interfaces Computer networks CUDA data-parallel programming Devices directive-based language Expenses GPGPU Memory management Packaging Pipelines Program processors Programmers Programming Programming profession Prototypes source-to-source compiler Telephone number portability
ISSN:	1045-9219, 1558-2183
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Graphics Processing Units (GPUs) have become a competitive accelerator for applications outside the graphics domain, mainly driven by the improvements in GPU programmability. Although the Compute Unified Device Architecture (CUDA) is a simple C-like interface for programming NVIDIA GPUs, porting applications to CUDA remains a challenge to average programmers. In particular, CUDA places on the programmer the burden of packaging GPU code in separate functions, of explicitly managing data transfer between the host and GPU memories, and of manually optimizing the utilization of the GPU memory. Practical experience shows that the programmer needs to make significant code changes, often tedious and error-prone, before getting an optimized program. We have designed hiCUDA}, a high-level directive-based language for CUDA programming. It allows programmers to perform these tedious tasks in a simpler manner and directly to the sequential code, thus speeding up the porting process. In this paper, we describe the hiCUDA} directives as well as the design and implementation of a prototype compiler that translates a hiCUDA} program to a CUDA program. Our compiler is able to support real-world applications that span multiple procedures and use dynamically allocated arrays. Experiments using nine CUDA benchmarks show that the simplicity hiCUDA} provides comes at no expense to performance.
Bibliography:	ObjectType-Article-2 SourceType-Scholarly Journals-1 ObjectType-Feature-1 content type line 14 content type line 23
ISSN:	1045-9219 1558-2183
DOI:	10.1109/TPDS.2010.62