High performance implementations of the 2D Ising model on GPUs
We present and make available novel implementations of the two-dimensional Ising model that is used as a benchmark to show the computational capabilities of modern Graphic Processing Units (GPUs). The rich programming environment now available on GPUs and flexible hardware capabilities allowed us to...
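
As a rough illustration of what simulating the two-dimensional Ising model involves, the following minimal sketch performs checkerboard Metropolis sweeps (the solution method named in this record) in plain Python. It is not the authors' cuIsing code: the function names, the coupling J = 1, zero external field, periodic boundaries, and the requirement of an even lattice size are all assumptions made for illustration.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep over an n x n periodic lattice (n even, assumed).

    Sites are split into two colors like a checkerboard. Every neighbor of a
    site has the other color, so all sites of one color could be updated
    independently (in parallel on a GPU); here they are simply looped over.
    """
    n = len(spins)
    for color in (0, 1):
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != color:
                    continue
                # Sum of the four nearest neighbors with periodic wrap-around.
                nn = (spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
                      + spins[i][(j - 1) % n] + spins[i][(j + 1) % n])
                # Energy change if spin (i, j) is flipped (J = 1, no field).
                delta_e = 2 * spins[i][j] * nn
                # Metropolis rule: always accept downhill moves, otherwise
                # accept with probability exp(-beta * delta_e).
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]


def magnetization(spins):
    """Average spin per site, a value in [-1, 1]."""
    n = len(spins)
    return sum(map(sum, spins)) / (n * n)


def run(n=16, beta=0.6, sweeps=200, seed=1234):
    """Start from all spins up and apply the given number of sweeps."""
    rng = random.Random(seed)
    spins = [[1] * n for _ in range(n)]
    for _ in range(sweeps):
        checkerboard_sweep(spins, beta, rng)
    return spins
```

Calling `magnetization(run())` gives the average spin after 200 sweeps of a 16 x 16 lattice. The two-color split is the design point: because no two sites of the same color interact, a GPU version can assign one thread per site of the active color without races.
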
| Published in: | Computer physics communications, Volume 256, p. 107473 |
|---|---|
| Main authors: | Romero, Joshua; Bisson, Mauro; Fatica, Massimiliano; Bernaschi, Massimo |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 2020-11-01 |
| Subjects: | |
| ISSN: | 0010-4655, 1879-2944 |
| Online access: | Get full text |
| Abstract | We present and make available novel implementations of the two-dimensional Ising model that is used as a benchmark to show the computational capabilities of modern Graphic Processing Units (GPUs). The rich programming environment now available on GPUs and flexible hardware capabilities allowed us to quickly experiment with several implementation ideas: a simple stencil-based algorithm, recasting the stencil operations into matrix multiplies to take advantage of Tensor Cores available on NVIDIA GPUs, and a highly optimized multi-spin coding approach. Using the managed memory API available in CUDA allows for simple and efficient distribution of these implementations across a multi-GPU NVIDIA DGX-2 server. We show that even a basic GPU implementation can outperform current results published on TPUs (Yang et al., 2019) and that the optimized multi-GPU implementation can simulate very large lattices faster than custom FPGA solutions (Ortega-Zamorano et al., 2016).
Program title: cuIsing (optimized).
CPC Library link to program files: http://dx.doi.org/10.17632/xrb9xtkbcp.1
Licensing provisions: MIT license.
Programming languages: CUDA C, Python.
Nature of problem: Two dimensional Ising model for spin systems.
Solution method: Checkerboard Metropolis algorithm. |
|---|---|
| ArticleNumber | 107473 |
| Author | Romero, Joshua; Bisson, Mauro; Fatica, Massimiliano; Bernaschi, Massimo |
| Author_xml | 1. Romero, Joshua (ORCID: 0000-0003-1358-5565; email: joshr@nvidia.com); 2. Bisson, Mauro; 3. Fatica, Massimiliano; 4. Bernaschi, Massimo |
| CitedBy_id | crossref_primary_10_1038_s41467_024_46645_6 crossref_primary_10_1103_wjvx_5nk7 crossref_primary_10_1016_j_asr_2024_04_021 crossref_primary_10_1103_ngkf_7816 crossref_primary_10_1016_j_cpc_2025_109734 crossref_primary_10_1088_1742_5468_ace0b7 crossref_primary_10_1103_PhysRevApplied_21_054002 crossref_primary_10_1002_aic_17651 crossref_primary_10_1016_j_compchemeng_2024_108627 crossref_primary_10_1016_j_cpc_2024_109234 crossref_primary_10_1038_s41598_022_22043_0 crossref_primary_10_1063_5_0184774 crossref_primary_10_1088_2399_1984_ad299a crossref_primary_10_1038_s41524_025_01762_8 crossref_primary_10_1016_j_cpc_2025_109690 |
| Cites_doi | 10.1103/PhysRevLett.47.693 10.1063/1.1699114 10.1016/j.cpc.2012.02.015 10.1016/j.jcp.2011.12.008 10.1103/PhysRevLett.62.361 10.1109/TPDS.2015.2505725 10.1016/0021-9991(81)90089-9 10.1016/j.cpc.2013.10.019 10.1145/2833157.2833162 10.1016/j.cpc.2010.05.005 10.1016/j.jcp.2009.03.018 |
| ContentType | Journal Article |
| Copyright | 2020 Elsevier B.V. |
| DOI | 10.1016/j.cpc.2020.107473 |
| Discipline | Physics |
| EISSN | 1879-2944 |
| ExternalDocumentID | 10_1016_j_cpc_2020_107473 S0010465520302228 |
| ISICitedReferencesCount | 22 |
| ISSN | 0010-4655 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | 6.5 software including parallel algorithms; Ising model; GPU programming; 23 statistical physics and thermodynamics |
| Language | English |
| ORCID | 0000-0003-1358-5565 |
| PublicationCentury | 2000 |
| PublicationDate | November 2020 2020-11-00 |
| PublicationDateYYYYMMDD | 2020-11-01 |
| PublicationDate_xml | – month: 11 year: 2020 text: November 2020 |
| PublicationDecade | 2020 |
| PublicationTitle | Computer physics communications |
| PublicationYear | 2020 |
| Publisher | Elsevier B.V |
| References | Bernaschi, Fatica, Parisi, Parisi (b8), Comput. Phys. Comm. 183 (2012), doi:10.1016/j.cpc.2012.02.015; Onsager (b9), Phys. Rev. Ser. II 65 (3–4) (1944) 117–149; cuRAND Library; OpenMP; Block, Virnau, Preis (b13), Comput. Phys. Comm. 181 (9) (2010) 1549–1556, doi:10.1016/j.cpc.2010.05.005; Baity-Jesi (b15), Comput. Phys. Comm. 185 (2014) 550–559, doi:10.1016/j.cpc.2013.10.019; NVIDIA DGX-2; Ising (b5), Z. Phys. XXXI (1925); Metropolis, Rosenbluth, Rosenbluth, Teller, Teller (b6), J. Chem. Phys. 21 (6) (1953) 1087–1092, doi:10.1063/1.1699114; Binder (b22), Phys. Rev. Lett. 47 (1981) 693, doi:10.1103/PhysRevLett.47.693; Nvidia CUDA; Preis, Virnau, Paul, Schneider (b12), J. Comput. Phys. 228 (12) (2009) 4468–4477, doi:10.1016/j.jcp.2009.03.018; S. K. Lam, A. Pitrou, S. Seibert, Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, November 2015, Austin, Texas, pp. 1–6; NVLink and NVSwitch; Kun Yang, Yi-Fan Chen, Georgios Roumpos, Chris Colby, John Anderson, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’19, 2019; Ortega-Zamorano, Montemurro, Cannas, Jerez, Franco (b11), IEEE Trans. Parallel Distrib. Syst. 27 (9) (2016) 2618–2627, doi:10.1109/TPDS.2015.2505725; Weigel (b14), J. Comput. Phys. 231 (2012) 3064–3082, doi:10.1016/j.jcp.2011.12.008; R. Okuta, Y. Unno, D. Nishino, S. Hido, C. Loomis, Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems, 2017; OpenACC; Wolff (b7), Phys. Rev. Lett. 62 (1989) 361, doi:10.1103/PhysRevLett.62.361; cuBLAS Library; Jacobs, Rebbi (b20), J. Comput. Phys. 41 (1) (1981) 203–210, doi:10.1016/0021-9991(81)90089-9 |
| StartPage | 107473 |
| SubjectTerms | 23 statistical physics and thermodynamics; 6.5 software including parallel algorithms; GPU programming; Ising model |
| Title | High performance implementations of the 2D Ising model on GPUs |
| URI | https://dx.doi.org/10.1016/j.cpc.2020.107473 |
| Volume | 256 |