A performance portable implementation of the semi-Lagrangian algorithm in six dimensions

This paper describes our approach to developing a simulation software application for the fully kinetic 6D-Vlasov equation, which will be used to explore physics beyond the reduced gyrokinetic model. Simulating the fully kinetic Vlasov equation requires efficient utilization of compute and storage c...

Bibliographic details
Published in: Computer Physics Communications, Vol. 295, p. 108973
Main authors: Schild, Nils; Räth, Mario; Eibl, Sebastian; Hallatschek, Klaus; Kormann, Katharina
Format: Journal Article
Language: English
Published: Elsevier B.V., 1 February 2024
ISSN: 0010-4655, 1879-2944
Online access: Full text
Abstract This paper describes our approach to developing a simulation software application for the fully kinetic 6D Vlasov equation, which will be used to explore physics beyond the reduced gyrokinetic model. Simulating the fully kinetic Vlasov equation requires efficient utilization of compute and storage capabilities due to the high dimensionality of the problem. In addition, the implementation needs to be extensible regarding the physical model and flexible regarding the hardware for production runs. We start with the algorithmic background for simulating the 6D Vlasov equation using a semi-Lagrangian algorithm. We then present the performance portable software stack, which enables production runs on pure CPU nodes as well as on nodes accelerated by AMD or Nvidia GPUs. The extensibility of our implementation is guaranteed through the described software architecture of the main kernel, which achieves a memory bandwidth of almost 500 GB/s on an Nvidia V100 GPU and around 100 GB/s on an Intel Xeon Gold CPU using a single code base. We provide performance data on multiple node-level architectures and discuss the hardware capabilities that are utilized and those that remain available. Finally, the network communication bottleneck of 6D grid-based algorithms is quantified. A verification of physics beyond gyrokinetic theory, for the example of ion Bernstein waves, concludes the work.
Highlights
• Performance portable implementation of a semi-Lagrangian algorithm for full kinetics.
• Software architecture for Lagrange interpolation stencils using design patterns.
• Node level performance analysis for OpenMP, HIP and CUDA using a single code base.
• Quantification of the network communication bottleneck for 6D distributed grids.
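Illustrative note (not part of the catalog record or the paper's code): the abstract names a semi-Lagrangian algorithm built on Lagrange interpolation stencils. A minimal sketch of that idea in one dimension, assuming constant-velocity advection on a periodic grid, might look as follows. The function names `lagrange_weights` and `advect_periodic`, the five-point stencil, and the use of NumPy are assumptions chosen for illustration; the paper's actual implementation is six-dimensional and, according to its keywords, based on Kokkos.

```python
import numpy as np


def lagrange_weights(xi, nodes):
    """Return Lagrange basis weights for evaluating at xi from the given nodes."""
    w = np.ones(len(nodes))
    for j, xj in enumerate(nodes):
        for m, xm in enumerate(nodes):
            if m != j:
                w[j] *= (xi - xm) / (xj - xm)
    return w


def advect_periodic(f, shift, stencil_half_width=2):
    """One semi-Lagrangian step for f_t + v f_x = 0 on a periodic grid.

    shift = v * dt / dx is the displacement in grid cells, so the new value
    at cell i is the old solution evaluated at position i - shift.
    """
    k = int(np.rint(shift))   # whole-cell part of the displacement
    alpha = shift - k         # remaining sub-cell part, roughly in [-0.5, 0.5]
    nodes = np.arange(-stencil_half_width, stencil_half_width + 1)
    # Interpolate at offset -alpha relative to the grid point i - k.
    weights = lagrange_weights(-alpha, nodes)
    f_new = np.zeros_like(f, dtype=float)
    for wj, j in zip(weights, nodes):
        # np.roll(f, s)[i] == f[i - s], so this adds wj * f[i - k + j].
        f_new += wj * np.roll(f, int(k - j))
    return f_new
```

Because the stencil weights depend only on the sub-cell offset, the same small weight vector is reused at every grid point, which is one reason such kernels tend to be limited by memory bandwidth rather than arithmetic.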
ArticleNumber 108973
Author Kormann, Katharina
Hallatschek, Klaus
Schild, Nils
Räth, Mario
Eibl, Sebastian
Author_xml – sequence: 1
  givenname: Nils
  orcidid: 0009-0000-5048-4814
  surname: Schild
  fullname: Schild, Nils
  email: nils.schild@ipp.mpg.de
  organization: Max Planck Institut for Plasma Physics, Germany
– sequence: 2
  givenname: Mario
  surname: Räth
  fullname: Räth, Mario
  organization: Max Planck Institut for Plasma Physics, Germany
– sequence: 3
  givenname: Sebastian
  surname: Eibl
  fullname: Eibl, Sebastian
  organization: Max Planck Computing and Data Facility, Germany
– sequence: 4
  givenname: Klaus
  surname: Hallatschek
  fullname: Hallatschek, Klaus
  organization: Max Planck Institut for Plasma Physics, Germany
– sequence: 5
  givenname: Katharina
  surname: Kormann
  fullname: Kormann, Katharina
  organization: Ruhr University Bochum, Germany
CitedBy_id crossref_primary_10_1016_j_jcp_2025_114294
crossref_primary_10_1063_5_0265541
Cites_doi 10.1051/proc/201343008
10.1051/proc/2011022
10.1016/0021-9991(76)90053-X
10.1016/j.cpc.2020.107208
10.1145/1498765.1498785
10.1016/j.jpdc.2014.07.003
10.1016/j.cpc.2023.109064
10.1063/1.4999945
10.1109/TPDS.2021.3097283
10.21105/joss.01370
10.1016/j.matcom.2009.08.038
10.1063/5.0046327
10.1109/MCSE.2021.3098509
10.1137/070693199
10.1016/j.cpc.2020.107351
10.1017/S0962492902000053
10.1103/PhysRev.109.10
ContentType Journal Article
Copyright 2023 The Author(s)
Copyright_xml – notice: 2023 The Author(s)
DBID 6I.
AAFTH
AAYXX
CITATION
DOI 10.1016/j.cpc.2023.108973
DatabaseName ScienceDirect Open Access Titles
Elsevier:ScienceDirect:Open Access
CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Physics
EISSN 1879-2944
ExternalDocumentID 10_1016_j_cpc_2023_108973
S0010465523003181
ISICitedReferencesCount 6
ISSN 0010-4655
IngestDate Sat Nov 29 07:29:55 EST 2025
Tue Nov 18 22:33:34 EST 2025
Fri Feb 23 02:35:45 EST 2024
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords General purpose computing on graphic processing units (GPGPU)
Kokkos
Software design patterns
Fully kinetic simulation
Semi-Lagrangian
Performance portability
Language English
License This is an open access article under the CC BY license.
LinkModel OpenURL
ORCID 0009-0000-5048-4814
OpenAccessLink https://dx.doi.org/10.1016/j.cpc.2023.108973
ParticipantIDs crossref_citationtrail_10_1016_j_cpc_2023_108973
crossref_primary_10_1016_j_cpc_2023_108973
elsevier_sciencedirect_doi_10_1016_j_cpc_2023_108973
PublicationCentury 2000
PublicationDate February 2024
2024-02-00
PublicationDateYYYYMMDD 2024-02-01
PublicationDate_xml – month: 02
  year: 2024
  text: February 2024
PublicationDecade 2020
PublicationTitle Computer physics communications
PublicationYear 2024
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V