Portable data‐parallel surface reconstruction on a uniform rectilinear grid

Bibliographic Details
Published in: International Journal for Numerical Methods in Fluids, Vol. 86, no. 2, pp. 185‐199
Main Authors: Francois, Marianne M., Lo, Li‐Ta, Sewell, Christopher, Velechovsky, Jan
Format: Journal Article
Language: English
Published: Bognor Regis, Wiley Subscription Services, Inc; Wiley, 20.01.2018
ISSN: 0271-2091, 1097-0363
Description
Summary: With the increasing heterogeneity and on‐node parallelism of high‐performance computing hardware, a major challenge is to develop portable and efficient algorithms and software. In this work, we present our implementation of a portable code, PINION, that performs surface reconstruction using NVIDIA's Thrust library. Surface reconstruction is a technique commonly used in volume tracking methods for simulations of multimaterial flow with interfaces. We have designed a 3D mesh data structure that maps easily to the 1D vectors used by Thrust, is simple to use, and keeps familiar data structure terminology (such as cells, faces, vertices, and edges). With this new data structure in place, we have implemented a piecewise linear interface reconstruction algorithm in 3 dimensions that effectively exploits the symmetry present in a uniform rectilinear computational cell. Finally, we report performance results, which show that a single implementation of these algorithms can be compiled to multiple backends (specifically, multi‐core CPUs, NVIDIA GPUs, and Intel Xeon Phi processors), making efficient use of the available parallelism on each. We also compare the performance of our implementation with that of a legacy FORTRAN implementation that uses the Message Passing Interface (MPI); the results show performance parity on single‐core and multi‐core CPUs and good parallel speed‐ups on GPUs. Our research demonstrates the advantage of performance portability of the underlying data‐parallel programming model. A figure in the article compares the performance of the RAGE and PINION codes for a sphere of radius 0.25 (A) and 0.45 (B), a cylinder (C), and the Stanford bunny (D).
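The idea of mapping a 3D mesh onto 1D vectors can be illustrated with a minimal C++ sketch. It is not taken from the article: the type UniformGrid, its members, and the helper cell_center_x are hypothetical names, the layout (i varying fastest) is an assumption, and the sketch uses only the C++ standard library so that it compiles without CUDA. In a Thrust-based code, the per-cell loop would plausibly be expressed with thrust::transform over a range of cell ids instead of std::transform.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical descriptor of a uniform rectilinear grid (illustrative names).
struct UniformGrid {
    std::size_t nx, ny, nz;  // number of cells along x, y, z

    std::size_t num_cells() const { return nx * ny * nz; }

    // Flatten (i, j, k) into a 1D cell id; i varies fastest (assumed layout).
    std::size_t cell_id(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < nx && j < ny && k < nz);
        return i + nx * (j + ny * k);
    }

    // Recover (i, j, k) from a 1D cell id (inverse of cell_id).
    std::array<std::size_t, 3> cell_ijk(std::size_t id) const {
        assert(id < num_cells());
        return {id % nx, (id / nx) % ny, id / (nx * ny)};
    }
};

// Data-parallel style map over all cells: each output element depends only
// on its own cell id, so the same functor could run on any backend.
std::vector<double> cell_center_x(const UniformGrid& g, double dx) {
    std::vector<std::size_t> ids(g.num_cells());
    std::iota(ids.begin(), ids.end(), std::size_t{0});  // 0, 1, 2, ...

    std::vector<double> out(ids.size());
    std::transform(ids.begin(), ids.end(), out.begin(),
                   [&g, dx](std::size_t id) {
                       // x coordinate of the cell center, origin at 0
                       return (static_cast<double>(g.cell_ijk(id)[0]) + 0.5) * dx;
                   });
    return out;
}
```

Because every per-cell quantity lives in a flat vector indexed by cell id, mesh terms such as "cell" and "vertex" reduce to index arithmetic, which is what makes a single source portable across backends in this style of programming.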
Bibliography:
LA-UR-16-21452
AC52-06NA25396
USDOE National Nuclear Security Administration (NNSA)
DOI: 10.1002/fld.4410