Hybrid parallelization of Euler–Lagrange simulations based on MPI-3 shared memory

Bibliographic Details
Published in: Advances in Engineering Software (1992) Vol. 174; p. 103291
Main Authors: Kopper, Patrick; Copplestone, Stephen M.; Pfeiffer, Marcel; Koch, Christian; Fasoulas, Stefanos; Beck, Andrea
Format: Journal Article
Language: English
Published: Elsevier Ltd 01.12.2022
ISSN: 0965-9978
Description
Summary: The use of Euler–Lagrange methods on unstructured grids extends their application area to more versatile setups. However, the lack of a regular topology limits the scalability of distributed parallel methods, especially for routines that perform a physical search in space. One of the most prominent slowdowns is the search for halo elements in physical space for the purpose of runtime communication avoidance. In this work, we present a new communication-free halo element search algorithm utilizing the MPI-3 shared memory model. This novel method eliminates the severe performance bottleneck of many-to-many communication during initialization compared to the distributed parallelization approach and extends the possible applications beyond those achievable with the previous approach. Building on these data structures, we then present methods for efficient particle emission, scalable deposition schemes for particle–field coupling, and latency hiding approaches. The scaling performance of the proposed algorithms is validated through plasma dynamics simulations of an open-source framework on a massively parallel system, demonstrating an efficiency of up to 80% on 131 072 cores.
Highlights:
• A novel method to identify halo elements for unstructured Euler–Lagrange solvers.
• Avoidance of many-to-many communication through use of MPI-3 shared memory.
• Extension to emission, latency hiding, and runtime deposition mechanisms.
• Implementation evaluated through scaling tests on a state-of-the-art supercomputer.
• Good initialization times and efficiency for both weak and strong scaling.
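Illustrative note: the summary does not spell out the search algorithm itself, so the following is only a minimal serial sketch of the general idea of a halo element search, not the authors' method. It assumes that every rank can read the centroids of all mesh elements directly (as it could from a node-level window created with MPI-3 calls such as `MPI_Comm_split_type` and `MPI_Win_allocate_shared`), so that no messages are needed to decide which foreign elements lie within a cutoff distance. The function name `find_halo_elements`, the centroid-distance criterion, and the tiny mesh are all hypothetical.

```python
from math import dist


def find_halo_elements(centroids, owner, my_rank, halo_distance):
    """Return indices (ascending) of elements owned by other ranks whose
    centroid lies within halo_distance of any element owned by my_rank.

    Assumption: `centroids` and `owner` stand in for arrays that all ranks
    on a node could read from shared memory; here they are plain lists.
    """
    # Elements that belong to the calling rank.
    local = [i for i, r in enumerate(owner) if r == my_rank]
    halo = []
    for j, r in enumerate(owner):
        if r == my_rank:
            continue  # own elements are never halo elements
        # Brute-force distance check against every local element.
        if any(dist(centroids[i], centroids[j]) <= halo_distance for i in local):
            halo.append(j)
    return halo


# Hypothetical mesh: six elements in a row along x, two per rank.
centroids = [(0.5, 0.0), (1.5, 0.0), (2.5, 0.0), (3.5, 0.0), (4.5, 0.0), (5.5, 0.0)]
owner = [0, 0, 1, 1, 2, 2]

halo_of_rank_1 = find_halo_elements(centroids, owner, my_rank=1, halo_distance=1.0)
print(halo_of_rank_1)  # [1, 4]
```

The brute-force loop is quadratic in the number of elements and is chosen only for readability; a production code would presumably use a spatial acceleration structure and element bounding boxes rather than centroids.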
DOI: 10.1016/j.advengsoft.2022.103291