A performance portable implementation of the semi-Lagrangian algorithm in six dimensions
This paper describes our approach to developing a simulation software application for the fully kinetic 6D-Vlasov equation, which will be used to explore physics beyond the reduced gyrokinetic model. Simulating the fully kinetic Vlasov equation requires efficient utilization of compute and storage c...
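The semi-Lagrangian algorithm named in the title can be illustrated in one dimension. The following is a minimal sketch, not the authors' implementation: it assumes a constant advection velocity, a periodic uniform grid, and a centered Lagrange interpolation stencil. The function names `lagrange_weights` and `advect_periodic`, and the parameter `half_width`, are hypothetical and chosen for illustration only.

```python
import math


def lagrange_weights(alpha, half_width):
    """Lagrange basis weights for nodes -half_width..half_width,
    evaluated at the offset alpha (in units of the grid spacing)."""
    nodes = range(-half_width, half_width + 1)
    weights = []
    for j in nodes:
        w = 1.0
        for m in nodes:
            if m != j:
                w *= (alpha - m) / (j - m)
        weights.append(w)
    return weights


def advect_periodic(f, velocity, dt, dx, half_width=2):
    """One backward semi-Lagrangian step for df/dt + v * df/dx = 0.

    Assumes constant velocity and a periodic grid (illustrative only).
    The new value at x_i is the old function interpolated at the foot
    of the characteristic, x_i - v * dt.
    """
    n = len(f)
    shift = -velocity * dt / dx      # foot position relative to x_i, in cells
    cell = math.floor(shift + 0.5)   # nearest grid point to the foot
    alpha = shift - cell             # remaining offset in [-0.5, 0.5)
    weights = lagrange_weights(alpha, half_width)
    offsets = range(-half_width, half_width + 1)
    return [
        sum(w * f[(i + cell + j) % n] for j, w in zip(offsets, weights))
        for i in range(n)
    ]
```

With `half_width=2` the stencil has five points. A six-dimensional solver would typically apply such one-dimensional interpolations direction by direction through operator splitting; that extension is not shown here.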
| Published in: | Computer physics communications Vol. 295; p. 108973 |
|---|---|
| Main authors: | Schild, Nils; Räth, Mario; Eibl, Sebastian; Hallatschek, Klaus; Kormann, Katharina |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 1 February 2024 |
| Keywords: | |
| ISSN: | 0010-4655, 1879-2944 |
| Online access: | Full text |
| Abstract | This paper describes our approach to developing a simulation software application for the fully kinetic 6D Vlasov equation, which will be used to explore physics beyond the reduced gyrokinetic model. Simulating the fully kinetic Vlasov equation requires efficient utilization of compute and storage capabilities due to the high dimensionality of the problem. In addition, the implementation needs to be extensible regarding the physical model and flexible regarding the hardware for production runs. We start with the algorithmic background for simulating the 6D Vlasov equation using a semi-Lagrangian algorithm. We then present the performance portable software stack, which enables production runs on pure CPU nodes as well as on nodes accelerated by AMD or Nvidia GPUs. The extensibility of our implementation is guaranteed through the described software architecture of the main kernel, which achieves a memory bandwidth of almost 500 GB/s on an Nvidia V100 GPU and around 100 GB/s on an Intel Xeon Gold CPU using a single code base. We provide performance data on multiple node-level architectures and discuss the hardware capabilities that are utilized and those that remain available. Finally, the network communication bottleneck of 6D grid-based algorithms is quantified. A verification of physics beyond gyrokinetic theory, for the example of ion Bernstein waves, concludes the work.
• Performance portable implementation of a semi-Lagrangian algorithm for full kinetics.
• Software architecture for Lagrange interpolation stencils using design patterns.
• Node level performance analysis for OpenMP, HIP and CUDA using a single code base.
• Quantification of the network communication bottleneck for 6D distributed grids. |
|---|---|
| ArticleNumber | 108973 |
| Author | Kormann, Katharina; Hallatschek, Klaus; Schild, Nils; Räth, Mario; Eibl, Sebastian |
| Author_xml | 1. Schild, Nils (ORCID: 0009-0000-5048-4814; email: nils.schild@ipp.mpg.de), Max Planck Institut for Plasma Physics, Germany; 2. Räth, Mario, Max Planck Institut for Plasma Physics, Germany; 3. Eibl, Sebastian, Max Planck Computing and Data Facility, Germany; 4. Hallatschek, Klaus, Max Planck Institut for Plasma Physics, Germany; 5. Kormann, Katharina, Ruhr University Bochum, Germany |
| ContentType | Journal Article |
| Copyright | 2023 The Author(s) |
| DOI | 10.1016/j.cpc.2023.108973 |
| Discipline | Physics |
| EISSN | 1879-2944 |
| ISICitedReferencesCount | 6 |
| ISSN | 0010-4655 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | General purpose computing on graphic processing units (GPGPU); Kokkos; Software design patterns; Fully kinetic simulation; Semi-Lagrangian; Performance portability |
| Language | English |
| License | This is an open access article under the CC BY license. |
| ORCID | 0009-0000-5048-4814 |
| OpenAccessLink | https://dx.doi.org/10.1016/j.cpc.2023.108973 |
| PublicationCentury | 2000 |
| PublicationDate | February 2024 2024-02-00 |
| PublicationDateYYYYMMDD | 2024-02-01 |
| PublicationDecade | 2020 |
| PublicationTitle | Computer physics communications |
| PublicationYear | 2024 |
| Publisher | Elsevier B.V |
| StartPage | 108973 |
| SubjectTerms | Fully kinetic simulation; General purpose computing on graphic processing units (GPGPU); Kokkos; Performance portability; Semi-Lagrangian; Software design patterns |
| Title | A performance portable implementation of the semi-Lagrangian algorithm in six dimensions |
| URI | https://dx.doi.org/10.1016/j.cpc.2023.108973 |
| Volume | 295 |