Evaluating performance portability of five shared-memory programming models using a high-order unstructured CFD solver

Bibliographic Details
Published in: Journal of Parallel and Distributed Computing Vol. 187; p. 104831
Main Authors: Dai, Zhe; Deng, Liang; Che, YongGang; Li, Ming; Zhang, Jian; Wang, Yueqing
Format: Journal Article
Language: English
Published: Elsevier Inc 01.05.2024
ISSN: 0743-7315, 1096-0848
Description
Summary: This paper presents the implementation and optimization of a high-order unstructured computational fluid dynamics (CFD) solver using five shared-memory programming models: CUDA, OpenACC, OpenMP, Kokkos, and OP2. The study evaluates the performance of these models on different hardware architectures, including NVIDIA GPUs, x86-based Intel/AMD processors, and Arm-based systems. The goal is to determine whether these models can provide developers with performance-portable solvers that run efficiently on a variety of architectures. By visualizing performance portability (PP) and measuring productivity, the paper forms a more holistic view of a high-order solver across multiple platforms. It also gives guidelines for translating existing codebases and their data structures to these models.

Highlights:
• We port and optimize a high-order unstructured CFD application using five shared-memory programming models.
• We evaluate the performance portability of the five programming models on diverse hardware.
• We analyze the workload from the perspective of code volume and learning cost.
DOI: 10.1016/j.jpdc.2023.104831