Evaluating performance portability of five shared-memory programming models using a high-order unstructured CFD solver


Bibliographic details
Published in: Journal of parallel and distributed computing, vol. 187, p. 104831
Main authors: Dai, Zhe; Deng, Liang; Che, YongGang; Li, Ming; Zhang, Jian; Wang, Yueqing
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 May 2024
ISSN:0743-7315, 1096-0848
Description
Summary: This paper presents the implementation and optimization of a high-order unstructured computational fluid dynamics (CFD) solver using five shared-memory programming models: CUDA, OpenACC, OpenMP, Kokkos, and OP2. The study evaluates the performance of these models on different hardware architectures, including NVIDIA GPUs, x86-based Intel/AMD systems, and Arm-based systems. The goal is to determine whether these models can provide developers with performance-portable solvers that run efficiently on various architectures. By visualizing performance portability (PP) and measuring productivity, the paper forms a more holistic view of a high-order solver across multiple platforms. It also gives guidelines for translating existing codebases and their data structures to these models.
• We port and optimize a high-order unstructured CFD application by using five shared-memory programming models.
• We evaluate the performance portability of five programming models on diverse hardware.
• We analyze the workload from the perspective of code volume and learning cost.
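The summary refers to performance portability (PP) without saying how it is computed. One widely used definition, due to Pennycook, Sewall, and Lee, is the harmonic mean of an application's efficiency over a set of platforms, taken as zero if any platform in the set is unsupported. Whether the paper uses exactly this definition is an assumption here. The sketch below illustrates that metric only; the function name, the platform labels, and the efficiency values are hypothetical and are not results from the paper.

```python
from statistics import harmonic_mean


def performance_portability(efficiencies):
    """Harmonic-mean PP over a set of platforms (Pennycook-style metric).

    `efficiencies` maps a platform label to an efficiency in (0, 1],
    or to None if the application does not run on that platform.
    Returns 0.0 when the set is empty or any platform is unsupported.
    """
    values = list(efficiencies.values())
    if not values or any(e is None or e <= 0 for e in values):
        return 0.0
    return harmonic_mean(values)


# Hypothetical efficiencies for one programming model (illustrative only).
example = {"nvidia_gpu": 0.80, "x86_cpu": 0.50, "arm_cpu": 0.40}
pp_example = performance_portability(example)

# A model that cannot target one of the platforms scores zero on the whole set.
pp_unsupported = performance_portability({"nvidia_gpu": 0.90, "arm_cpu": None})
```

The harmonic mean penalizes a single poorly performing platform more strongly than an arithmetic mean would, which is why it is a common choice for summarizing portability across heterogeneous hardware.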
DOI:10.1016/j.jpdc.2023.104831