Portable data‐parallel surface reconstruction on a uniform rectilinear grid
| Published in: | International journal for numerical methods in fluids Vol. 86; no. 2; pp. 185 - 199 |
|---|---|
| Main Authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Bognor Regis; Wiley Subscription Services, Inc; 20.01.2018; Wiley |
| Subjects: | |
| ISSN: | 0271-2091, 1097-0363 |
| Summary: | With the increasing heterogeneity and on‐node parallelism of high‐performance computing hardware, a major challenge is to develop portable and efficient algorithms and software. In this work, we present our implementation of a portable code to perform surface reconstruction using NVIDIA's Thrust library. Surface reconstruction is a technique commonly used in volume tracking methods for simulations of multimaterial flow with interfaces. We have designed a 3D mesh data structure that maps easily onto the 1D vectors used by Thrust while remaining simple to use and retaining familiar data structure terminology (such as cells, faces, vertices, and edges). With this data structure in place, we have implemented a piecewise linear interface reconstruction algorithm in 3 dimensions that effectively exploits the symmetry present in a uniform rectilinear computational cell. Finally, we report performance results, which show that a single implementation of these algorithms can be compiled to multiple backends (specifically, multi‐core CPUs, NVIDIA GPUs, and Intel Xeon Phi processors), making efficient use of the available parallelism on each. We also compare the performance of our implementation to a legacy FORTRAN implementation that uses the Message Passing Interface (MPI), and show performance parity on single‐core and multi‐core CPUs and good parallel speed‐ups on GPUs. Our research demonstrates the advantage of the performance portability offered by the underlying data‐parallel programming model. In this work, we present the implementation of a portable code, PINION, to perform surface reconstruction using NVIDIA's Thrust library. The accompanying figure compares the performance of the RAGE and PINION codes for spheres of radius 0.25 (A) and 0.45 (B), a cylinder (C), and the Stanford bunny (D). |
| Bibliography: | LA-UR-16-21452; AC52-06NA25396; USDOE National Nuclear Security Administration (NNSA) |
| DOI: | 10.1002/fld.4410 |