GenASiSBasics: Object-oriented utilitarian functionality for large-scale physics simulations (Version 3)

GenASiSBasics provides Fortran 2003 classes furnishing extensible object-oriented utilitarian functionality for large-scale physics simulations on distributed memory supercomputers. This functionality includes physical units and constants; display to the screen or standard output device; message pas...

Bibliographic details
Published in: Computer Physics Communications, Vol. 244, pp. 483-486
Main authors: Budiardja, Reuben D.; Cardall, Christian Y.
Format: Journal Article
Language: English
Published: Elsevier B.V., 1 November 2019
ISSN: 0010-4655, 1879-2944
Abstract GenASiSBasics provides Fortran 2003 classes furnishing extensible object-oriented utilitarian functionality for large-scale physics simulations on distributed memory supercomputers. This functionality includes physical units and constants; display to the screen or standard output device; message passing; I/O to disk; and runtime parameter management and usage statistics. This revision – Version 3 of Basics – includes a significant name change, some minor additions to functionality, and a major addition to functionality: infrastructure facilitating the offloading of computational kernels to devices such as GPUs.
Program Title: SineWaveAdvection, SawtoothWaveAdvection, and RiemannProblem (fluid dynamics example problems illustrating GenASiSBasics); ArgonEquilibrium and ClusterFormation (molecular dynamics example problems illustrating GenASiSBasics)
Program Files doi: http://dx.doi.org/10.17632/6w9ygpygmc.2
Licensing provisions: GPLv3
Programming language: Fortran 2003 (tested with GNU Compiler Collection 8.1.0, Intel Fortran Compiler 18.0.3, Cray Compiler Environment 8.6.5, IBM XL Fortran 16.1.0)
Journal reference of previous version: Computer Physics Communications 214 (2017) 247
Does the new version supersede the previous version?: Yes
Reasons for the new version: This version includes a significant name change, some minor additions to functionality, and a major addition to functionality: infrastructure facilitating the offloading of computational kernels to devices such as GPUs.
Summary of revisions: The class VariableGroupForm – a major workhorse for handling sets of related fields – has been renamed StorageForm. The ability to use Unicode characters in standard output has been added, but is currently supported only by the GNU Compiler Collection (GCC). This capability is used to display exponents as numerical superscripts, as well as symbols such as ħ, ⊙, and Å in the display of relevant units.
It is made operational by the line [Display omitted] which is now included in the machine-specific makefile fragments with a GCC suffix in the Build/Machines directory.
There are some changes to units and constants. The geometrized units of past releases (G=c=k=1, with a fundamental unit of meter) have been replaced by natural units (ħ=c=k=1, with MeV as the fundamental unit). Lorentz–Heaviside electromagnetic units are employed (permeability μ=1; no factors of 4π in the Maxwell equations). This refers to numbers as processed internally by the code; as described in the initial release, users can employ the members of the UNIT singleton for input/output purposes, that is, to specify or display numbers with any available units they wish. A number of units have been added, and the specification of all units has been put on a more rational basis in keeping with six of the seven standard SI base units (meter, kilogram, second, ampere, kelvin, mole; we have not needed the candela; see [3]). Some physical and astrophysical constants have also been added. All constants have been updated to 2018 values [4].
For notifications to standard output, a few tweaks to ignorability levels have been made in various classes. The default output to screen is now less verbose (ignorability INFO_1, our designation for messages of significance just below WARNING).
A couple of additions have been made to MessagePassing: null subcommunicators are accommodated, and an AllToAll_V method has been added to CollectiveOperation_R_Form.
Enhancements to timer functionality have been made. The class TimerForm now has a member Level, which is specified in order to control indentation in screen output. Some functionality has been added to PROGRAM_HEADER_Singleton to work with timers. A method TimerPointer returns a pointer to a timer with a specified Handle (typically a meaningfully named integer).
The new members TimerLevel and TimerDisplayFraction of PROGRAM_HEADER_Singleton, which can be set from the command line, can be used to suppress output from timings deemed insignificant, based on timer level or a measured time interval falling below a specified fraction of the total execution time.
The most significant addition to functionality in this release is infrastructure to offload computational kernels to hardware accelerators such as GPUs using OpenMP device-related directives and runtime library routines in OpenMP 4.5 and later (https://www.openmp.org/specifications/). This infrastructure, implemented in a new subdivision Devices (see Fig. 1), provides lower-level routines to perform memory management between the host (CPU) and device (GPU), including data allocation, data movement between host and device, and device-to-host memory address association. The routines are implemented as Fortran wrappers to the OpenMP runtime library and CUDA (https://developer.nvidia.com/about-cuda) routines written in C. Additional methods and an option utilizing the lower-level Devices routines have been added to our StorageForm class. They are: UpdateHost() and UpdateDevice() to copy data from device to host and host to device, respectively; AllocateDevice() to allocate memory on the device mirroring the allocation on the host; and PinnedOption as an optional flag to the Initialize() method to allocate the host memory in a page-locked region to facilitate faster data transfer between host and device. A detailed description of the implementation of this functionality can be found in [5].
To deal with different levels of compiler support for device-related OpenMP directives, we use the preprocessor in some source files in Devices to guard against attempted compilation of unsupported features. Preprocessor macro substitution is also utilized in OpenMP directives to switch between multi-threading parallelism on CPUs and offload parallelism to GPUs.
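The macro-based switching between host multi-threading and device offload can be illustrated with a minimal C sketch. This is not the GenASiSBasics implementation, which is Fortran: the macro names `DO_PRAGMA` and `OMP_KERNEL_LOOP` and the kernel `axpy` are hypothetical, and `ENABLE_OMP_OFFLOAD` is treated here as a preprocessor symbol although the record names it only as a makefile variable. The schedule choices follow the record's description (static with chunk size 1 for offload, run-time selection through OMP_SCHEDULE for multi-threading); the sketch does not set the guided default itself.

```c
#include <assert.h>
#include <stddef.h>

/* Turn a token sequence into a pragma so that macros can build directives. */
#define DO_PRAGMA(x) _Pragma(#x)

/* Assumption: ENABLE_OMP_OFFLOAD is defined by the build when offload is on. */
#ifdef ENABLE_OMP_OFFLOAD
/* Offload: run the loop on a device, static schedule with chunk size 1. */
#define OMP_KERNEL_LOOP(maps) \
  DO_PRAGMA(omp target teams distribute parallel for schedule(static, 1) maps)
#else
/* Multi-threading: run on host threads; OMP_SCHEDULE selects the schedule. */
#define OMP_KERNEL_LOOP(maps) \
  DO_PRAGMA(omp parallel for schedule(runtime))
#endif

/* Hypothetical kernel: y <- a * x + y for n elements. */
static void axpy(int n, double a, const double *x, double *y)
{
  assert(x != NULL && y != NULL);
  /* The map clauses are used only in the offload variant of the macro. */
  OMP_KERNEL_LOOP(map(to: x[0:n]) map(tofrom: y[0:n]))
  for (int i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

The same loop body serves both builds; only the directive changes. Compiled without OpenMP support, the pragma is ignored and the loop runs serially, which keeps the kernel testable on any machine.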
Setting the makefile variable ENABLE_OMP_OFFLOAD to 1 – which is the default in the machine-specific makefile Makefile_POWER_XL for the XL compiler on POWER-based supercomputers – sets the appropriate flags and preprocessing to enable compilation for OpenMP offload parallelism. Alternatively, the command [Display omitted] sets this variable when make is invoked from the command line. Information regarding the number of devices available to the program, the kind of OpenMP parallelism enabled (i.e., multi-threading or offload), and the selected OpenMP loop scheduling is displayed at runtime by PROGRAM_HEADER_Singleton. When offload parallelism is enabled, the loop scheduling is automatically set to static with a chunk size of 1. With multi-threading parallelism, the schedule defaults to guided but can be overridden at runtime by setting the environment variable OMP_SCHEDULE appropriately.
The example problem RiemannProblem in the Examples directory under the Basics division has been modified to exploit the GPUs using this new functionality. The computational kernels for the problem have been annotated with new OpenMP directives (via the appropriate preprocessor macros) such that they are offloaded to the GPUs when offload parallelism is enabled during compilation. In [5] we demonstrate the weak scaling of this example problem up to 8000 GPUs on the Summit supercomputer at the Oak Ridge Leadership Computing Facility (https://www.olcf.ornl.gov/olcf-resources/compute-systems/summit/). Figs. 2 and 3 show a visualization of the three-dimensional version of RiemannProblem at 1280³ resolution executed with 1000 GPUs.
Nature of problem: By way of illustrating GenASiSBasics functionality, solve example fluid dynamics and molecular dynamics problems.
Solution method: For fluid dynamics examples, finite-volume. For molecular dynamics examples, leapfrog and velocity-Verlet integration.
External routines/libraries: MPI [1] and Silo [2]
Additional comments including restrictions and unusual features: The example problems named above are not ends in themselves, but serve to illustrate our object-oriented approach and the functionality available through GenASiSBasics. In addition to these more substantial examples, we provide individual unit test programs for each of the classes that make up GenASiSBasics. GenASiSBasics is available in the CPC Program Library and also at https://github.com/GenASiS.
[1] http://www.mcs.anl.gov/mpi/
[2] https://wci.llnl.gov/simulation/computer-codes/silo
[3] https://en.wikipedia.org/wiki/SI_base_unit
[4] M. Tanabashi et al. (Particle Data Group), Phys. Rev. D 98 (2018) 030001
[5] Budiardja, R.D. and Cardall, C.Y., “Targeting GPUs with OpenMP Directives on Summit: A Simple and Effective Fortran Experience,” submitted for publication in Parallel Computing: Systems and Applications, arXiv:1812.07977 [physics.comp-ph].
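The velocity-Verlet integration named under "Solution method" can be sketched in a few lines of C. This is a generic illustration under stated assumptions, not code from ArgonEquilibrium or ClusterFormation: the unit-mass harmonic oscillator, the `State` type, and all function names are hypothetical stand-ins chosen so that the result is easy to check against the exact solution.

```c
#include <assert.h>

/* Position and velocity of a single unit-mass particle in one dimension. */
typedef struct { double x, v; } State;

/* Assumed force law: harmonic oscillator, a(x) = -omega^2 * x. */
static double acceleration(double x, double omega)
{
  return -omega * omega * x;
}

/* One velocity-Verlet step of size dt. */
static State velocity_verlet_step(State s, double dt, double omega)
{
  assert(dt > 0.0);
  double a_old = acceleration(s.x, omega);
  s.x += s.v * dt + 0.5 * a_old * dt * dt;   /* position update        */
  double a_new = acceleration(s.x, omega);   /* force at new position  */
  s.v += 0.5 * (a_old + a_new) * dt;         /* velocity update        */
  return s;
}

/* Total energy (kinetic plus potential) for the oscillator above. */
static double energy(State s, double omega)
{
  return 0.5 * s.v * s.v + 0.5 * omega * omega * s.x * s.x;
}
```

Starting from x = 1, v = 0 with omega = 1 and dt = 0.01, the total energy should stay close to its initial value of 0.5 over many steps, which is the property that makes this family of integrators attractive for molecular dynamics.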
Author Budiardja, Reuben D.
Cardall, Christian Y.
Author_xml – sequence: 1
  givenname: Reuben D.
  surname: Budiardja
  fullname: Budiardja, Reuben D.
  email: reubendb@ornl.gov
  organization: National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, TN 37831-6354, USA
– sequence: 2
  givenname: Christian Y.
  surname: Cardall
  fullname: Cardall, Christian Y.
  email: cardallcy@ornl.gov
  organization: Physics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831-6354, USA
ContentType Journal Article
Copyright 2019 Elsevier B.V.
Copyright_xml – notice: 2019 Elsevier B.V.
DOI 10.1016/j.cpc.2019.05.014
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Physics
EISSN 1879-2944
EndPage 486
ExternalDocumentID S0010465519301729
ID FETCH-LOGICAL-e122t-46251ea24d59eb8f6b111c1768a65bc8f3a60603a2017c80dde8bddba63628393
ISSN 0010-4655
IngestDate Fri Feb 23 02:28:54 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Fortran 2003
Object-oriented programming
Simulation framework
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-e122t-46251ea24d59eb8f6b111c1768a65bc8f3a60603a2017c80dde8bddba63628393
PageCount 4
ParticipantIDs elsevier_sciencedirect_doi_10_1016_j_cpc_2019_05_014
PublicationCentury 2000
PublicationDate November 2019
PublicationDateYYYYMMDD 2019-11-01
PublicationDate_xml – month: 11
  year: 2019
  text: November 2019
PublicationDecade 2010
PublicationTitle Computer physics communications
PublicationYear 2019
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
SSID ssj0007793
Score 2.320323
Snippet GenASiSBasics provides Fortran 2003 classes furnishing extensible object-oriented utilitarian functionality for large-scale physics simulations on distributed...
SourceID elsevier
SourceType Publisher
StartPage 483
SubjectTerms Fortran 2003
Object-oriented programming
Simulation framework
Title GenASiSBasics: Object-oriented utilitarian functionality for large-scale physics simulations (Version 3)
URI https://dx.doi.org/10.1016/j.cpc.2019.05.014
Volume 244
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVESC
  databaseName: ScienceDirect database
  customDbUrl:
  eissn: 1879-2944
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0007793
  issn: 0010-4655
  databaseCode: AIEXJ
  dateStart: 19950101
  isFulltext: true
  titleUrlDefault: https://www.sciencedirect.com
  providerName: Elsevier
linkProvider Elsevier