hiCUDA: High-Level GPGPU Programming
Graphics Processing Units (GPUs) have become a competitive accelerator for applications outside the graphics domain, mainly driven by the improvements in GPU programmability. Although the Compute Unified Device Architecture (CUDA) is a simple C-like interface for programming NVIDIA GPUs, porting app...
| Published in: | IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 1, pp. 78-90 |
|---|---|
| Main authors: | Han, Tianyi David; Abdelrahman, Tarek S. |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 1 January 2011 |
| ISSN: | 1045-9219, 1558-2183 |
| Online access: | Full text |
| Abstract | Graphics Processing Units (GPUs) have become a competitive accelerator for applications outside the graphics domain, mainly driven by the improvements in GPU programmability. Although the Compute Unified Device Architecture (CUDA) is a simple C-like interface for programming NVIDIA GPUs, porting applications to CUDA remains a challenge to average programmers. In particular, CUDA places on the programmer the burden of packaging GPU code in separate functions, of explicitly managing data transfer between the host and GPU memories, and of manually optimizing the utilization of the GPU memory. Practical experience shows that the programmer needs to make significant code changes, often tedious and error-prone, before getting an optimized program. We have designed hiCUDA, a high-level directive-based language for CUDA programming. It allows programmers to perform these tedious tasks in a simpler manner and directly to the sequential code, thus speeding up the porting process. In this paper, we describe the hiCUDA directives as well as the design and implementation of a prototype compiler that translates a hiCUDA program to a CUDA program. Our compiler is able to support real-world applications that span multiple procedures and use dynamically allocated arrays. Experiments using nine CUDA benchmarks show that the simplicity hiCUDA provides comes at no expense to performance. |
|---|---|
| Author | Han, Tianyi David; Abdelrahman, Tarek S |
| Author_xml | – sequence: 1 givenname: Tianyi David surname: Han fullname: Han, Tianyi David email: han@eecg.toronto.edu organization: Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada – sequence: 2 givenname: Tarek S surname: Abdelrahman fullname: Abdelrahman, Tarek S email: tsa@eecg.toronto.edu organization: Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada |
| CODEN | ITDSEO |
| CitedBy_id | crossref_primary_10_1007_s11227_023_05375_0 crossref_primary_10_1088_1755_1315_199_3_032102 crossref_primary_10_1002_cpe_4152 crossref_primary_10_1109_TC_2021_3125792 crossref_primary_10_1002_cpe_3981 crossref_primary_10_1007_s10559_015_9706_0 crossref_primary_10_1002_cpe_3345 crossref_primary_10_1002_cpe_3862 crossref_primary_10_1007_s11227_018_2484_5 crossref_primary_10_1109_TPDS_2016_2521373 crossref_primary_10_1051_matecconf_201820805002 crossref_primary_10_3390_app8091604 crossref_primary_10_1177_1094342017703771 crossref_primary_10_1109_LGRS_2013_2274328 crossref_primary_10_1007_s11227_019_03023_0 crossref_primary_10_1007_s11227_014_1186_x crossref_primary_10_1145_2858788_2688505 crossref_primary_10_1109_ACCESS_2019_2928033 crossref_primary_10_1145_2579617 crossref_primary_10_1002_cpe_4756 crossref_primary_10_1016_j_swevo_2022_101153 crossref_primary_10_1109_TCE_2013_6531118 crossref_primary_10_1002_cpe_3821 crossref_primary_10_1177_1687814017707413 crossref_primary_10_1007_s42514_023_00159_7 crossref_primary_10_1007_s00607_019_00744_1 crossref_primary_10_1109_TPDS_2013_288 crossref_primary_10_1007_s11390_018_1827_2 crossref_primary_10_3390_axioms13020132 crossref_primary_10_1007_s11227_019_02749_1 crossref_primary_10_1109_ACCESS_2021_3130118 crossref_primary_10_1007_s10766_015_0362_9 crossref_primary_10_1109_TSG_2014_2387169 crossref_primary_10_1016_j_optlaseng_2016_01_011 crossref_primary_10_1080_00207160_2015_1011628 crossref_primary_10_1002_cpe_3416 crossref_primary_10_1016_j_cola_2025_101323 crossref_primary_10_15803_ijnc_5_2_253 crossref_primary_10_1145_2665079 crossref_primary_10_1016_j_imavis_2013_06_003 crossref_primary_10_1007_s12145_022_00900_w crossref_primary_10_1109_ACCESS_2021_3124856 crossref_primary_10_1016_j_infsof_2015_10_003 crossref_primary_10_1109_TII_2017_2731362 crossref_primary_10_1016_j_procs_2015_05_212 crossref_primary_10_1007_s42514_020_00039_4 crossref_primary_10_1145_2858949_2784754 
crossref_primary_10_1145_2499369_2465569 crossref_primary_10_1007_s11277_019_06575_9 crossref_primary_10_1109_TPDS_2011_291 crossref_primary_10_1016_j_procs_2015_05_208 crossref_primary_10_1016_j_image_2016_05_003 crossref_primary_10_1145_1961295_1950409 crossref_primary_10_1002_cpe_3648 crossref_primary_10_1016_j_jpdc_2013_07_013 crossref_primary_10_1007_s10766_014_0319_4 crossref_primary_10_1088_1755_1315_300_3_032106 crossref_primary_10_1145_2345156_2254067 crossref_primary_10_1145_2345156_2254066 crossref_primary_10_1007_s11227_013_0912_0 crossref_primary_10_1016_j_jpdc_2016_11_001 crossref_primary_10_1002_rnc_5909 crossref_primary_10_1145_1961296_1950409 crossref_primary_10_1109_TCSVT_2012_2202191 crossref_primary_10_1016_j_parco_2018_07_003 crossref_primary_10_7780_kjrs_2012_28_6_8 crossref_primary_10_1137_120883153 crossref_primary_10_17706_IJCEE_2015_V7_877 crossref_primary_10_1016_j_robot_2020_103536 crossref_primary_10_1007_s10586_020_03105_2 crossref_primary_10_1145_2680544 crossref_primary_10_3233_JIFS_169447 crossref_primary_10_1109_TEVC_2021_3110506 crossref_primary_10_1109_TCE_2012_6170063 |
| Cites_doi | 10.1007/978-3-540-89740-8_1 10.1007/978-3-642-13374-9_21 10.1145/800229.806957 10.1145/1375527.1375562 10.1145/1186562.1015800 10.1145/1504176.1504194 10.1145/1356058.1356084 10.1016/0169-2607(95)01640-F 10.1057/jors.1983.117 10.1145/1345206.1345220 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Jan 2011 |
| Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Jan 2011 |
| DBID | 97E RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D F28 FR3 |
| DOI | 10.1109/TPDS.2010.62 |
| DatabaseName | IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998-Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional ANTE: Abstracts in New Technology & Engineering Engineering Research Database |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional Engineering Research Database ANTE: Abstracts in New Technology & Engineering |
| DatabaseTitleList | Technology Research Database Technology Research Database |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering Computer Science Architecture |
| EISSN | 1558-2183 |
| EndPage | 90 |
| ExternalDocumentID | 2724262851 10_1109_TPDS_2010_62 5445082 |
| Genre | orig-research |
| GroupedDBID | --Z -~X .DC 0R~ 29I 4.4 5GY 5VS 6IK 97E AAJGR AARMG AASAJ AAWTH ABAZT ABFSI ABQJQ ABVLG ACGFO ACIWK AENEX AETIX AGQYO AGSQL AHBIQ AI. AIBXA AKJIK AKQYR ALLEH ALMA_UNASSIGNED_HOLDINGS ASUFR ATWAV BEFXN BFFAM BGNUA BKEBE BPEOZ CS3 DU5 E.L EBS EJD HZ~ H~9 ICLAB IEDLZ IFIPE IFJZH IPLJI JAVBF LAI M43 MS~ O9- OCL P2P PQQKQ RIA RIE RNI RNS RZB TN5 TWZ UHB VH1 AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D F28 FR3 |
| ID | FETCH-LOGICAL-c315t-d2a4a7a6b5f409eaad9abb8775a5f1863c65be84852e34df09de160eb2c7538f3 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 118 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000284423900009&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 1045-9219 |
| IngestDate | Sun Nov 09 09:30:03 EST 2025 Sun Nov 09 08:18:19 EST 2025 Sat Nov 29 08:07:39 EST 2025 Tue Nov 18 21:39:56 EST 2025 Wed Aug 27 02:52:31 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c315t-d2a4a7a6b5f409eaad9abb8775a5f1863c65be84852e34df09de160eb2c7538f3 |
| Notes | ObjectType-Article-2 SourceType-Scholarly Journals-1 ObjectType-Feature-1 content type line 14 content type line 23 |
| PQID | 1030171766 |
| PQPubID | 85437 |
| PageCount | 13 |
| ParticipantIDs | proquest_journals_1030171766 crossref_citationtrail_10_1109_TPDS_2010_62 proquest_miscellaneous_914627182 ieee_primary_5445082 crossref_primary_10_1109_TPDS_2010_62 |
| PublicationCentury | 2000 |
| PublicationDate | 2011-Jan. 2011-01-00 20110101 |
| PublicationDateYYYYMMDD | 2011-01-01 |
| PublicationDate_xml | – month: 01 year: 2011 text: 2011-Jan. |
| PublicationDecade | 2010 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | IEEE transactions on parallel and distributed systems |
| PublicationTitleAbbrev | TPDS |
| PublicationYear | 2011 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| SSID | ssj0014504 |
| Score | 2.3960876 |
| Snippet | Graphics Processing Units (GPUs) have become a competitive accelerator for applications outside the graphics domain, mainly driven by the improvements in GPU... |
| SourceID | proquest crossref ieee |
| SourceType | Aggregation Database; Enrichment Source; Index Database; Publisher |
| StartPage | 78 |
| SubjectTerms | Application software; Architecture; Arrays; Compilers; Computer architecture; Computer graphics; Computer interfaces; Computer networks; CUDA; data-parallel programming; Devices; directive-based language; Expenses; GPGPU; Memory management; Packaging; Pipelines; Program processors; Programmers; Programming; Programming profession; Prototypes; source-to-source compiler; Telephone number portability |
| Title | hiCUDA: High-Level GPGPU Programming |
| URI | https://ieeexplore.ieee.org/document/5445082 https://www.proquest.com/docview/1030171766 https://www.proquest.com/docview/914627182 |
| Volume | 22 |
| WOSCitedRecordID | wos000284423900009&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVIEE databaseName: IEEE Electronic Library (IEL) customDbUrl: eissn: 1558-2183 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0014504 issn: 1045-9219 databaseCode: RIE dateStart: 19900101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE |
| linkProvider | IEEE |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=hiCUDA%3A+High-Level+GPGPU+Programming&rft.jtitle=IEEE+transactions+on+parallel+and+distributed+systems&rft.au=Han%2C+Tianyi+David&rft.au=Abdelrahman%2C+Tarek+S&rft.date=2011-01-01&rft.pub=The+Institute+of+Electrical+and+Electronics+Engineers%2C+Inc.+%28IEEE%29&rft.issn=1045-9219&rft.eissn=1558-2183&rft.volume=22&rft.issue=1&rft.spage=78&rft_id=info:doi/10.1109%2FTPDS.2010.62&rft.externalDBID=NO_FULL_TEXT&rft.externalDocID=2724262851 |