SoAx: A generic C++ Structure of Arrays for handling particles in HPC codes
The numerical study of physical problems often requires integrating the dynamics of a large number of particles evolving according to a given set of equations. Particles are characterized by the information they carry, such as an identity, a position, or other properties. There are generally speaking two diff...
| Published in: | Computer Physics Communications, Vol. 224, pp. 325–332 |
|---|---|
| Main authors: | Homann, Holger; Laenen, Francois |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 1 March 2018 |
| Subjects: | C++; Heterogeneous data; Template metaprogramming; Generic programming |
| ISSN: | 0010-4655, 1879-2944 |
| Online access: | Full text |
| Abstract | The numerical study of physical problems often requires integrating the dynamics of a large number of particles evolving according to a given set of equations. Particles are characterized by the information they carry, such as an identity, a position, or other properties. Generally speaking, there are two ways to handle particles in high performance computing (HPC) codes. The Array of Structures (AoS) concept follows the spirit of the object-oriented programming (OOP) paradigm in that the particle information is implemented as a structure. Here, an object (an instance of the structure) represents one particle, and a set of many particles is stored in an array. In contrast, with a Structure of Arrays (SoA), a single structure holds several arrays, each representing one property (such as the identity) of the whole set of particles.
The AoS approach is often implemented in HPC codes because it is convenient and flexible. For a class of problems, however, SoA is known to perform much better than AoS. We confirm this observation for our particle problem. Using a benchmark, we show that on modern Intel Xeon processors the SoA implementation is typically several times faster than the AoS one. On Intel’s MIC co-processors the performance gap even reaches a factor of ten. The same is true for GPU computing, using both computational and multi-purpose GPUs.
Combining performance and convenience, we present the library SoAx, which has optimal performance (on CPUs, MICs, and GPUs) while providing the same convenience as AoS. To this end, SoAx uses modern C++ design techniques such as template metaprogramming, which allows code to be generated automatically for user-defined heterogeneous data structures.
Program Title: SoAx
Program Files doi:http://dx.doi.org/10.17632/m463pc4mv8.1
Licensing provisions: GPLv3
Programming language: C++
Nature of problem: Structures of arrays (SoA) are generally faster than arrays of structures (AoS), while AoS are more convenient. This library (SoAx) combines the advantages of both. By means of C++11 template metaprogramming, SoAx achieves maximal performance (efficient use of the vector units and cache of modern CPUs) while providing a very convenient user interface (including object-oriented element handling) and flexibility. It has been designed to handle list-like sets of particles (similar to struct { int id; double pos[3]; float vel[3]; };) in the context of high-performance numerical simulations. It can be applied to many other problems.
Solution method: Template Metaprogramming, Expression Templates |
|---|---|
| Author | Laenen, Francois; Homann, Holger |
| Author_xml | – sequence: 1 givenname: Holger surname: Homann fullname: Homann, Holger email: holger.homann@oca.eu – sequence: 2 givenname: Francois surname: Laenen fullname: Laenen, Francois |
| BackLink | https://hal.science/hal-02308014$$DView record in HAL |
| CitedBy_id | crossref_primary_10_1016_j_jpdc_2020_09_008 crossref_primary_10_1016_j_scico_2020_102481 crossref_primary_10_1016_j_jocs_2025_102590 crossref_primary_10_1002_spe_3077 crossref_primary_10_3390_fi16090341 crossref_primary_10_1007_s00366_021_01304_y crossref_primary_10_1016_j_jcp_2022_111234 crossref_primary_10_1016_j_cpc_2022_108406 crossref_primary_10_1002_cpe_70199 |
| Cites_doi | 10.1103/PhysRevLett.93.064502 10.1051/0004-6361:20011817 10.1146/annurev.fluid.010908.165210 10.1109/PDP.2013.24 10.4208/aamm.2014.m468 10.1109/TC.2014.2366754 10.1016/j.cpc.2014.01.005 |
| ContentType | Journal Article |
| Copyright | 2017 Elsevier B.V. Distributed under a Creative Commons Attribution 4.0 International License |
| Copyright_xml | – notice: 2017 Elsevier B.V. – notice: Distributed under a Creative Commons Attribution 4.0 International License |
| DBID | AAYXX CITATION 1XC |
| DOI | 10.1016/j.cpc.2017.11.015 |
| DatabaseName | CrossRef Hyper Article en Ligne (HAL) |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Physics |
| EISSN | 1879-2944 |
| EndPage | 332 |
| ExternalDocumentID | oai:HAL:hal-02308014v1 10_1016_j_cpc_2017_11_015 S0010465517303983 |
| GroupedDBID | --K --M -~X .DC .~1 0R~ 1B1 1RT 1~. 1~5 29F 4.4 457 4G. 5GY 5VS 7-5 71M 8P~~ IHE IMUCA J1W KOM LG9 LZ4 M38 M41 MO0 N9A NDZJH O-L O9- OAUVE OGIMB OZT P-8 P-9 P2P PC. Q38 R2- RIG ROL RPZ SBC SCB SDF SDG SES SEW SHN SPC SPCBC SPD SPG SSE SSK SSQ SSV SSZ T5K TN5 UPT VH1 WUQ ZMT ~02 ~G- 9DU AATTM AAXKI AAYWO AAYXX ABJNI ABWVN ACLOT ACRPL ACVFH ADCNI ADNMO AEIPS AEUPX AFJKZ AFPUW AGQPQ AIGII AIIUN AKBMS AKRWK AKYEP ANKPU APXCP CITATION EFKBS ~HD 1XC |
| ID | FETCH-LOGICAL-c312t-d7a30b23b4bfe0f8a044375e87021cee7abae13c19f819f9da04ac6ef6b17c833 |
| ISSN | 0010-4655 |
| IngestDate | Tue Oct 14 20:45:57 EDT 2025 Tue Nov 18 20:49:28 EST 2025 Sat Nov 29 03:58:13 EST 2025 Fri Feb 23 02:30:57 EST 2024 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | C++ Heterogeneous data Template metaprogramming Generic programming |
| Language | English |
| License | Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0 |
| LinkModel | OpenURL |
| MergedId | FETCHMERGED-LOGICAL-c312t-d7a30b23b4bfe0f8a044375e87021cee7abae13c19f819f9da04ac6ef6b17c833 |
| PageCount | 8 |
| ParticipantIDs | hal_primary_oai_HAL_hal_02308014v1 crossref_citationtrail_10_1016_j_cpc_2017_11_015 crossref_primary_10_1016_j_cpc_2017_11_015 elsevier_sciencedirect_doi_10_1016_j_cpc_2017_11_015 |
| PublicationCentury | 2000 |
| PublicationDate | March 2018 2018-03-00 2018-03 |
| PublicationDateYYYYMMDD | 2018-03-01 |
| PublicationDate_xml | – month: 03 year: 2018 text: March 2018 |
| PublicationDecade | 2010 |
| PublicationTitle | Computer physics communications |
| PublicationYear | 2018 |
| Publisher | Elsevier B.V Elsevier |
| Publisher_xml | – name: Elsevier B.V – name: Elsevier |
| SSID | ssj0007793 |
| Score | 2.3817575 |
| Snippet | The numerical study of physical problems often requires integrating the dynamics of a large number of particles evolving according to a given set of equations.... |
| SourceID | hal crossref elsevier |
| SourceType | Open Access Repository Enrichment Source Index Database Publisher |
| StartPage | 325 |
| SubjectTerms | Astrophysics C++ Generic programming Heterogeneous data Physics Template metaprogramming |
| Title | SoAx: A generic C++ Structure of Arrays for handling particles in HPC codes |
| URI | https://dx.doi.org/10.1016/j.cpc.2017.11.015 https://hal.science/hal-02308014 |
| Volume | 224 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVESC databaseName: Elsevier SD Freedom Collection Journals 2021 customDbUrl: eissn: 1879-2944 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0007793 issn: 0010-4655 databaseCode: AIEXJ dateStart: 19950101 isFulltext: true titleUrlDefault: https://www.sciencedirect.com providerName: Elsevier |
| linkProvider | Elsevier |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=SoAx%3A+A+generic+C%2B%2B+Structure+of+Arrays+for+handling+particles+in+HPC+codes&rft.jtitle=Computer+physics+communications&rft.au=Homann%2C+Holger&rft.au=Laenen%2C+Francois&rft.date=2018-03-01&rft.issn=0010-4655&rft.volume=224&rft.spage=325&rft.epage=332&rft_id=info:doi/10.1016%2Fj.cpc.2017.11.015&rft.externalDBID=n%2Fa&rft.externalDocID=10_1016_j_cpc_2017_11_015 |
| thumbnail_l | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=0010-4655&client=summon |
| thumbnail_m | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=0010-4655&client=summon |
| thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=0010-4655&client=summon |
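
The abstract contrasts an Array of Structures with a Structure of Arrays and notes that template metaprogramming can generate the SoA code for a user-defined set of properties. The following is a minimal sketch of that idea, assuming a C++14 compiler. It is not the SoAx API: the names `ParticleAoS`, `SimpleSoA`, `ParticlesSoA`, `advanceAoS`, `advanceSoA` and the property indices `Id`, `X`, `Vx` are hypothetical and chosen only for illustration, and the particle is reduced to one position and one velocity component.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Array of Structures (AoS): one object per particle, fields interleaved in memory.
// Hypothetical type for illustration only.
struct ParticleAoS {
    int id;
    double x;   // one position component
    float vx;   // one velocity component
};

// Generic Structure of Arrays (SoA): one contiguous std::vector per property.
// Hypothetical container, not the SoAx interface.
template <typename... Ts>
class SimpleSoA {
public:
    // Append one particle by pushing each value onto its own column.
    void push_back(const Ts&... values) {
        pushImpl(std::index_sequence_for<Ts...>{}, values...);
    }

    // Number of particles (all columns have the same length).
    std::size_t size() const { return std::get<0>(columns_).size(); }

    // Access the contiguous array holding property number I.
    template <std::size_t I>
    auto& column() { return std::get<I>(columns_); }

private:
    template <std::size_t... Is>
    void pushImpl(std::index_sequence<Is...>, const Ts&... values) {
        // Expand both packs in lockstep: column Is receives the matching value.
        int expand[] = {0, (std::get<Is>(columns_).push_back(values), 0)...};
        (void)expand;
    }

    std::tuple<std::vector<Ts>...> columns_;
};

// Illustrative property indices and particle layout (assumptions of this sketch).
constexpr std::size_t Id = 0;
constexpr std::size_t X = 1;
constexpr std::size_t Vx = 2;
using ParticlesSoA = SimpleSoA<int, double, float>;

// One explicit Euler step on the AoS layout: each iteration touches a whole struct.
void advanceAoS(std::vector<ParticleAoS>& particles, double dt) {
    for (auto& p : particles) {
        p.x += dt * p.vx;
    }
}

// The same step on the SoA layout: the loop streams over two contiguous arrays,
// which is the access pattern that compilers can usually vectorize.
void advanceSoA(ParticlesSoA& particles, double dt) {
    auto& x = particles.column<X>();
    const auto& vx = particles.column<Vx>();
    assert(x.size() == vx.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        x[i] += dt * vx[i];
    }
}
```

Both update functions perform the same arithmetic; only the memory layout differs. In the SoA version the identity column is never loaded during the position update, whereas the AoS loop pulls every field of every particle through the cache. The variadic template is what lets one container definition serve any list of property types, which is the convenience aspect the abstract attributes to template metaprogramming.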