hiCUDA: High-Level GPGPU Programming

Graphics Processing Units (GPUs) have become a competitive accelerator for applications outside the graphics domain, mainly driven by the improvements in GPU programmability. Although the Compute Unified Device Architecture (CUDA) is a simple C-like interface for programming NVIDIA GPUs, porting app...

Full description

Bibliographic details
Published in: IEEE transactions on parallel and distributed systems, Vol. 22, No. 1, pp. 78-90
Main authors: Han, Tianyi David; Abdelrahman, Tarek S
Format: Journal Article
Language: English
Published: New York, IEEE, 01.01.2011
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1045-9219, 1558-2183
Online access: Full text
Abstract Graphics Processing Units (GPUs) have become a competitive accelerator for applications outside the graphics domain, mainly driven by the improvements in GPU programmability. Although the Compute Unified Device Architecture (CUDA) is a simple C-like interface for programming NVIDIA GPUs, porting applications to CUDA remains a challenge to average programmers. In particular, CUDA places on the programmer the burden of packaging GPU code in separate functions, of explicitly managing data transfer between the host and GPU memories, and of manually optimizing the utilization of the GPU memory. Practical experience shows that the programmer needs to make significant code changes, often tedious and error-prone, before getting an optimized program. We have designed hiCUDA, a high-level directive-based language for CUDA programming. It allows programmers to perform these tedious tasks in a simpler manner and directly to the sequential code, thus speeding up the porting process. In this paper, we describe the hiCUDA directives as well as the design and implementation of a prototype compiler that translates a hiCUDA program to a CUDA program. Our compiler is able to support real-world applications that span multiple procedures and use dynamically allocated arrays. Experiments using nine CUDA benchmarks show that the simplicity hiCUDA provides comes at no expense to performance.
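To make the burdens named in the abstract concrete, the following is a minimal plain-CUDA sketch of a vector addition. It is not taken from the paper and it does not show hiCUDA directive syntax; the names `vector_add_kernel` and `vector_add_on_gpu` are hypothetical and chosen only for illustration. It shows the first two manual tasks (GPU code packaged in a separate function, and explicit host/GPU data transfer). The third task, hand-optimizing GPU memory use, is not covered here.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Manual task 1: the GPU code has to be moved out of the sequential
// loop and into its own __global__ function (a kernel).
__global__ void vector_add_kernel(const float *a, const float *b,
                                  float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one element per thread
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper (illustrative name, not from the paper).
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t vector_add_on_gpu(const float *a, const float *b,
                              float *c, int n)
{
    float *d_a = NULL, *d_b = NULL, *d_c = NULL;
    const size_t bytes = (size_t)n * sizeof(float);
    const int threads = 256;                          // threads per block
    const int blocks = (n + threads - 1) / threads;   // round up
    cudaError_t err = cudaSuccess;

    if (n <= 0) {
        return cudaSuccess;  // nothing to compute
    }

    // Manual task 2a: allocate GPU memory for every array the kernel uses.
    err = cudaMalloc((void **)&d_a, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_b, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_c, bytes);

    // Manual task 2b: copy the inputs from host memory to GPU memory.
    if (err == cudaSuccess)
        err = cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice);

    // Launch the kernel with an explicit grid/block configuration.
    if (err == cudaSuccess) {
        vector_add_kernel<<<blocks, threads>>>(d_a, d_b, d_c, n);
        err = cudaGetLastError();  // reports launch failures
    }

    // Manual task 2c: copy the result back to host memory.
    if (err == cudaSuccess)
        err = cudaMemcpy(c, d_c, bytes, cudaMemcpyDeviceToHost);

    // Release GPU memory; cudaFree on a null pointer does nothing.
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return err;
}
```

A caller would pass three host arrays of length `n` and check the returned `cudaError_t`. Even for this one-line computation, most of the code is allocation, transfer, and launch bookkeeping rather than the addition itself, which is the kind of overhead the abstract says hiCUDA directives are meant to reduce.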
Author Abdelrahman, Tarek S
Han, Tianyi David
Author_xml – sequence: 1
  givenname: Tianyi David
  surname: Han
  fullname: Han, Tianyi David
  email: han@eecg.toronto.edu
  organization: Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada
– sequence: 2
  givenname: Tarek S
  surname: Abdelrahman
  fullname: Abdelrahman, Tarek S
  email: tsa@eecg.toronto.edu
  organization: Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada
CODEN ITDSEO
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Jan 2011
DOI 10.1109/TPDS.2010.62
Discipline Engineering
Computer Science
Architecture
EISSN 1558-2183
EndPage 90
ExternalDocumentID 2724262851
10_1109_TPDS_2010_62
5445082
Genre orig-research
ISICitedReferencesCount 118
ISSN 1045-9219
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
PageCount 13
PublicationCentury 2000
PublicationDate 2011-Jan.
2011-01-00
20110101
PublicationDateYYYYMMDD 2011-01-01
PublicationDate_xml – month: 01
  year: 2011
  text: 2011-Jan.
PublicationDecade 2010
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on parallel and distributed systems
PublicationTitleAbbrev TPDS
PublicationYear 2011
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 78
SubjectTerms Application software
Architecture
Arrays
Compilers
Computer architecture
Computer graphics
Computer interfaces
Computer networks
CUDA
data-parallel programming
Devices
directive-based language
Expenses
GPGPU
Memory management
Packaging
Pipelines
Program processors
Programmers
Programming
Programming profession
Prototypes
source-to-source compiler
Telephone number portability
Title hiCUDA: High-Level GPGPU Programming
URI https://ieeexplore.ieee.org/document/5445082
https://www.proquest.com/docview/1030171766
https://www.proquest.com/docview/914627182
Volume 22