A parallel nonlinear multigrid solver for unsteady incompressible flow simulation on multi-GPU cluster

Bibliographic Details
Published in: Journal of Computational Physics Vol. 414; p. 109447
Main Authors: Shi, Xiaolei; Agrawal, Tanmay; Lin, Chao-An; Hwang, Feng-Nan; Chiu, Tzu-Hsuan
Format: Journal Article
Language: English
Published: Cambridge Elsevier Inc 01.08.2020
Elsevier Science Ltd
ISSN: 0021-9991, 1090-2716
Description
Summary: A nonlinear multigrid solver for unsteady three-dimensional incompressible viscous flow, running on a multi-GPU cluster, is developed. The solver uses a full approximation scheme (FAS) V-cycle to accelerate the computation, with a Navier-Stokes solver based on the artificial compressibility method serving as the smoother. Multi-stream overlapping strategies are designed to assist multi-GPU computations. The numerical procedure is validated by computing 3D laminar and turbulent flows within a lid-driven cubic cavity. The predicted results compare favorably with previous benchmark solutions and measurements, both in mean and turbulent quantities. For the FAS V-cycle scheme, speedups of up to two orders of magnitude are reported, and the work units (WU) scale with the total grid number N as O(N^0.3) under the deepest FAS V-cycle. A detailed evaluation of the GPU implementation is carried out using the Roofline model and a scalability analysis.
• A parallel nonlinear multigrid solver for unsteady incompressible flow simulation is implemented on a multi-GPU cluster.
• A Navier-Stokes solver based on the artificial compressibility method is used as the smoother for multigrid.
• For FAS Lev. 7, a 250-fold speedup over the single-grid counterpart is reported.
• The work units scale with the total grid number N as O(N^0.3) under the deepest FAS V-cycle.
• A detailed evaluation of the GPU implementation is carried out using the Roofline model and a scalability analysis.
DOI: 10.1016/j.jcp.2020.109447
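
Illustration: The summary describes a FAS V-cycle in which a nonlinear solver acts as the smoother. The sketch below shows the general shape of such a cycle on a much simpler problem and is not the authors' implementation. Everything in it is an assumption made for illustration: the 1D model equation -u'' + u^3 = f with zero boundary values, the nonlinear Gauss-Seidel smoother (swapped in for the artificial compressibility Navier-Stokes smoother), the transfer operators, and all function names. It runs on the CPU with NumPy and has no GPU or multi-stream component.

```python
import numpy as np


def apply_operator(u, h):
    """Discrete nonlinear operator A(u) = -u'' + u^3 at interior nodes."""
    Au = np.zeros_like(u)
    Au[1:-1] = (2.0 * u[1:-1] - u[:-2] - u[2:]) / h**2 + u[1:-1] ** 3
    return Au


def smooth(u, f, h, sweeps):
    """Nonlinear Gauss-Seidel: one pointwise Newton update per node, in place.

    Stand-in smoother (assumption); the paper uses an artificial
    compressibility Navier-Stokes solver in this role.
    """
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            res = (2.0 * u[i] - u[i - 1] - u[i + 1]) / h**2 + u[i] ** 3 - f[i]
            u[i] -= res / (2.0 / h**2 + 3.0 * u[i] ** 2)


def restrict(r):
    """Full-weighting restriction of a fine-grid residual."""
    rc = np.zeros((len(r) - 1) // 2 + 1)
    rc[1:-1] = 0.25 * r[1:-3:2] + 0.5 * r[2:-2:2] + 0.25 * r[3:-1:2]
    return rc


def prolong(ec):
    """Linear interpolation of a coarse-grid correction to the fine grid."""
    ef = np.zeros(2 * (len(ec) - 1) + 1)
    ef[::2] = ec
    ef[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return ef


def fas_vcycle(u, f, h, nu=2):
    """One FAS V-cycle, recursing until a single interior node remains."""
    n = len(u) - 1
    if n <= 2:
        smooth(u, f, h, 50)  # coarsest grid: iterate to convergence
        return u
    smooth(u, f, h, nu)  # pre-smoothing
    r = f - apply_operator(u, h)  # fine-grid residual
    uc0 = u[::2].copy()  # restrict the solution by injection
    # FAS coarse right-hand side: restricted residual + A_H(restricted solution)
    fc = restrict(r) + apply_operator(uc0, 2.0 * h)
    uc = fas_vcycle(uc0.copy(), fc, 2.0 * h, nu)
    u += prolong(uc - uc0)  # coarse-grid correction
    smooth(u, f, h, nu)  # post-smoothing
    return u


def residual_norm(u, f, h):
    """Max-norm of the residual over interior nodes."""
    return np.max(np.abs((f - apply_operator(u, h))[1:-1]))


def solve(n=128, cycles=10):
    """Solve the model problem with exact solution u = sin(pi x)."""
    x = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n
    f = np.pi**2 * np.sin(np.pi * x) + np.sin(np.pi * x) ** 3
    u = np.zeros(n + 1)
    history = [residual_norm(u, f, h)]
    for _ in range(cycles):
        fas_vcycle(u, f, h)
        history.append(residual_norm(u, f, h))
    return u, history


if __name__ == "__main__":
    _, hist = solve()
    for k, res in enumerate(hist):
        print(f"cycle {k}: residual {res:.3e}")
```

The coarse grid solves for the full solution rather than for an error correction, which is what lets the scheme handle a nonlinear operator directly. n must be a power of two so that each level coarsens cleanly.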