High performance implementations of the 2D Ising model on GPUs

Bibliographic details
Published in: Computer Physics Communications, Volume 256; p. 107473
Main authors: Romero, Joshua; Bisson, Mauro; Fatica, Massimiliano; Bernaschi, Massimo
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.11.2020
ISSN: 0010-4655, 1879-2944
Abstract: We present and make available novel implementations of the two-dimensional Ising model, which we use as a benchmark to demonstrate the computational capabilities of modern Graphics Processing Units (GPUs). The rich programming environment and flexible hardware capabilities now available on GPUs allowed us to quickly experiment with several implementation ideas: a simple stencil-based algorithm, a recasting of the stencil operations into matrix multiplies to take advantage of the Tensor Cores available on NVIDIA GPUs, and a highly optimized multi-spin coding approach. The managed memory API available in CUDA allows for simple and efficient distribution of these implementations across a multi-GPU NVIDIA DGX-2 server. We show that even a basic GPU implementation can outperform current results published on TPUs (Yang et al., 2019) and that the optimized multi-GPU implementation can simulate very large lattices faster than custom FPGA solutions (Ortega-Zamorano et al., 2016).
Program title: cuIsing (optimized)
CPC Library link to program files: http://dx.doi.org/10.17632/xrb9xtkbcp.1
Licensing provisions: MIT license
Programming languages: CUDA C, Python
Nature of problem: Two-dimensional Ising model for spin systems
Solution method: Checkerboard Metropolis algorithm
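The checkerboard Metropolis method named in the program summary can be illustrated with a minimal pure-Python sketch. This is not the authors' CUDA code: the function names `checkerboard_sweep` and `magnetization`, the coupling `J`, and the list-of-lists lattice are all assumptions made for illustration. The idea shown is that sites of one checkerboard colour have neighbours only of the other colour, so all sites of one colour can be updated independently (on a GPU, in parallel) before switching to the other colour.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng, J=1.0):
    """One Metropolis sweep: update every site of colour 0, then colour 1.

    spins is a list of lists holding +1/-1 (hypothetical layout). Both
    dimensions should be even so that, with periodic boundaries, each
    site's four neighbours all have the opposite colour.
    """
    n_rows, n_cols = len(spins), len(spins[0])
    for color in (0, 1):
        for i in range(n_rows):
            for j in range(n_cols):
                if (i + j) % 2 != color:
                    continue
                # Sum of the four nearest neighbours, periodic boundaries.
                nn = (spins[(i - 1) % n_rows][j] + spins[(i + 1) % n_rows][j]
                      + spins[i][(j - 1) % n_cols] + spins[i][(j + 1) % n_cols])
                # Energy change if this spin is flipped.
                delta_e = 2.0 * J * spins[i][j] * nn
                # Metropolis rule: always accept downhill moves, accept
                # uphill moves with probability exp(-beta * delta_e).
                if delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]


def magnetization(spins):
    """Average spin value over the whole lattice."""
    total = sum(sum(row) for row in spins)
    return total / (len(spins) * len(spins[0]))


if __name__ == "__main__":
    rng = random.Random(0)
    lattice = [[1] * 16 for _ in range(16)]
    for _ in range(50):
        checkerboard_sweep(lattice, beta=1.0, rng=rng)
    print("magnetization per spin:", magnetization(lattice))
```

Within one colour pass no updated site reads another site of the same colour, which is why the two inner loops could be replaced by one thread per site in a GPU kernel without changing the update rule.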
Article number: 107473
Author details (in sequence):
1. Romero, Joshua (ORCID: 0000-0003-1358-5565; email: joshr@nvidia.com)
2. Bisson, Mauro
3. Fatica, Massimiliano
4. Bernaschi, Massimo
Copyright 2020 Elsevier B.V.
DOI: 10.1016/j.cpc.2020.107473
Peer reviewed: yes
Keywords: Ising model; GPU programming; 23 Statistical physics and thermodynamics; 6.5 Software including parallel algorithms
References
Ising (b5), Z. Phys. XXXI (1925).
Metropolis, Rosenbluth, Rosenbluth, Teller, Teller (b6), J. Chem. Phys. 21 (6) (1953) 1087–1092. doi:10.1063/1.1699114
Wolff (b7), Phys. Rev. Lett. 62 (1989) 361. doi:10.1103/PhysRevLett.62.361
Bernaschi, Fatica, Parisi, Parisi (b8), Comput. Phys. Comm. 183 (2012). doi:10.1016/j.cpc.2012.02.015
Onsager (b9), Phys. Rev. Ser. II 65 (3–4) (1944) 117–149.
Ortega-Zamorano, Montemurro, Cannas, Jerez, Franco (b11), IEEE Trans. Parallel Distrib. Syst. 27 (9) (2016) 2618–2627. doi:10.1109/TPDS.2015.2505725
Preis, Virnau, Paul, Schneider (b12), J. Comput. Phys. 228 (12) (2009) 4468–4477. doi:10.1016/j.jcp.2009.03.018
Block, Virnau, Preis (b13), Comput. Phys. Comm. 181 (9) (2010) 1549–1556. doi:10.1016/j.cpc.2010.05.005
Weigel (b14), J. Comput. Phys. 231 (2012) 3064–3082. doi:10.1016/j.jcp.2011.12.008
Baity-Jesi (b15), Comput. Phys. Comm. 185 (2014) 550–559. doi:10.1016/j.cpc.2013.10.019
Jacobs, Rebbi (b20), J. Comput. Phys. 41 (1) (1981) 203–210. doi:10.1016/0021-9991(81)90089-9
Binder (b22), Phys. Rev. Lett. 47 (1981) 693. doi:10.1103/PhysRevLett.47.693
K. Yang, Y.-F. Chen, G. Roumpos, C. Colby, J. Anderson, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '19, 2019.
S. K. Lam, A. Pitrou, S. Seibert, Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, Austin, Texas, November 15, 2015, pp. 1–6. doi:10.1145/2833157.2833162
R. Okuta, Y. Unno, D. Nishino, S. Hido, C. Loomis, Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems, 2017.
Nvidia CUDA.
NVIDIA DGX-2.
NVLink and NVSwitch.
cuBLAS Library.
cuRAND Library.
OpenACC.
OpenMP.