Integer Sum Reduction with OpenMP on an AMD MI100 GPU


Bibliographic details
Published in: 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 496-499
Main authors: Jin, Zheming; Vetter, Jeffrey S.
Format: Conference paper
Language: English
Published: IEEE, 1 May 2022
ISBN:9781665497480
Abstract Sum reduction is a primitive operation in parallel computing. Device offload support allows a user to use OpenMP directives to take advantage of a highly capable GPU. In this paper, we present the integer sum reduction annotated with the OpenMP directives and evaluate the performance impacts of tunable parameters with the AOMP and GCC compilers on an AMD MI100 GPU. In addition, we explain the implementations of the OpenMP reduction by the compilers. Sweeping over the pruned parameter space, we find that the speedup is approximately 20 with AOMP, and the reduction performance using AOMP is approximately 11% higher than that using GCC. However, the OpenMP offload performance is approximately 30% lower compared to the performance of the reductions written with rocThrust or hipCUB.
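The kind of annotated loop the abstract describes can be illustrated with a minimal sketch. This is not the authors' code: the function name `sum_reduce` and the choice of `num_teams` and `thread_limit` as the tunable parameters are assumptions made for illustration, since the abstract does not name the parameters that were swept. The accumulator is widened to `long long` here only to avoid overflow in the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: sum n ints with an OpenMP target reduction.
 * `teams` and `threads` are assumed tuning knobs, not taken from the paper.
 * Without an offload-capable compiler the loop runs on the host. */
long long sum_reduce(const int *a, size_t n, int teams, int threads)
{
    assert(teams > 0 && threads > 0);   /* both clauses require positive values */
    long long sum = 0;

    /* Copy the input to the device, split iterations across teams and
     * threads, and combine the per-thread partial sums into `sum`. */
    #pragma omp target teams distribute parallel for \
        num_teams(teams) thread_limit(threads) \
        map(to: a[0:n]) map(tofrom: sum) reduction(+: sum)
    for (size_t i = 0; i < n; ++i)
        sum += a[i];

    return sum;
}
```

Calling `sum_reduce(data, n, 120, 256)` with different team and thread counts would be one way to explore such a parameter space; the values 120 and 256 are arbitrary placeholders. Building for a GPU needs `-fopenmp` plus compiler-specific offload flags, which differ between AOMP and GCC.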
Author Jin, Zheming (Oak Ridge National Laboratory, jinz@ornl.gov)
Vetter, Jeffrey S. (Oak Ridge National Laboratory, vetter@computer.org)
CODEN IEEPAD
DOI 10.1109/IPDPSW55747.2022.00088
EISBN 1665497475
9781665497473
OpenAccessLink https://www.osti.gov/biblio/1883905
Subject terms: AMD GPU; Conferences; Distributed processing; Graphics processing units; Manuals; OpenMP target offload; Optimization; Parallel processing; Performance evaluation; Reduction
URI https://ieeexplore.ieee.org/document/9835640