Integer Sum Reduction with OpenMP on an AMD MI100 GPU

Bibliographic details
Published in: 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 496-499
Main authors: Jin, Zheming; Vetter, Jeffrey S.
Format: Conference paper
Language: English
Published: IEEE, 01.05.2022
ISBN:9781665497480
Description
Summary: Sum reduction is a primitive operation in parallel computing. Device offload support allows a user to use OpenMP directives to take advantage of a highly capable GPU. In this paper, we present the integer sum reduction annotated with the OpenMP directives and evaluate the performance impacts of tunable parameters with the AOMP and GCC compilers on an AMD MI100 GPU. In addition, we explain the implementations of the OpenMP reduction by the compilers. Sweeping over the pruned parameter space, we find that the speedup is approximately 20 with AOMP, and the reduction performance using AOMP is approximately 11% higher than that using GCC. However, the OpenMP offload performance is approximately 30% lower compared to the performance of the reductions written with rocThrust or hipCUB.
DOI:10.1109/IPDPSW55747.2022.00088
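
For readers unfamiliar with the technique named in the summary, the following is a minimal sketch of an integer sum reduction annotated with OpenMP offload directives. It is not the authors' code. The function name `sum_reduce_offload`, the macros `NUM_TEAMS` and `THREADS_PER_TEAM`, and their values are assumptions chosen for illustration; the summary does not say which parameters the paper tunes, so `num_teams` and `thread_limit` are shown only as typical candidates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical launch parameters; illustrative values, not taken from the paper. */
#define NUM_TEAMS        120
#define THREADS_PER_TEAM 256

/* Sums n 32-bit integers, accumulating in 64 bits to reduce overflow risk.
 * With an offload-capable OpenMP compiler the loop runs on the device;
 * without OpenMP the pragma is ignored and the loop runs serially on the host. */
int64_t sum_reduce_offload(const int32_t *a, size_t n)
{
    int64_t sum = 0;

    /* Copy the input to the device, split iterations across teams and
     * threads, and combine the per-thread partial sums into sum. */
    #pragma omp target teams distribute parallel for \
        num_teams(NUM_TEAMS) thread_limit(THREADS_PER_TEAM) \
        map(to: a[0:n]) map(tofrom: sum) reduction(+: sum)
    for (size_t i = 0; i < n; ++i) {
        sum += a[i];
    }

    return sum;
}
```

Calling `sum_reduce_offload(data, 1000)` on an array holding 1 through 1000 returns 500500. Whether the loop actually executes on a GPU depends on the compiler and its offload flags; the result is the same either way.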