Evaluating performance portability of five shared-memory programming models using a high-order unstructured CFD solver

Bibliographic details
Published in: Journal of Parallel and Distributed Computing, Volume 187, p. 104831
Main authors: Dai, Zhe; Deng, Liang; Che, YongGang; Li, Ming; Zhang, Jian; Wang, Yueqing
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 May 2024
ISSN: 0743-7315, 1096-0848
Description
Summary: This paper presents the implementation and optimization of a high-order unstructured computational fluid dynamics (CFD) solver using five shared-memory programming models: CUDA, OpenACC, OpenMP, Kokkos, and OP2. The study evaluates the performance of these models on different hardware architectures, including NVIDIA GPUs, x86-based Intel/AMD systems, and Arm-based systems. The goal is to determine whether these models can give developers performance-portable solvers that run efficiently on various architectures. By visualizing performance portability (PP) and measuring productivity, the paper forms a more holistic view of a high-order solver across multiple platforms. It also gives guidelines for translating existing codebases and their data structures to these models.
• We port and optimize a high-order unstructured CFD application using five shared-memory programming models.
• We evaluate the performance portability of the five programming models on diverse hardware.
• We analyze the workload from the perspective of code volume and learning cost.
DOI: 10.1016/j.jpdc.2023.104831