An Efficient RI-MP2 Algorithm for Distributed Many-GPU Architectures

Published in: Journal of chemical theory and computation, Vol. 20, No. 21, p. 9394
Main authors: Snowdon, Calum; Barca, Giuseppe M J
Format: Journal Article
Language: English
Published: United States, 12 November 2024
ISSN: 1549-9626
Abstract Second-order Møller-Plesset perturbation theory (MP2) using the Resolution of the Identity approximation (RI-MP2) is a widely used method for computing molecular energies beyond the Hartree-Fock mean-field approximation. However, its high computational cost and lack of efficient algorithms for modern supercomputing architectures limit its applicability to large molecules. In this paper, we present the first distributed-memory many-GPU RI-MP2 algorithm explicitly designed to utilize hundreds of GPU accelerators for every step of the computation. Our novel algorithm achieves near-peak performance on GPU-based supercomputers through the development of a distributed-memory algorithm for forming RI-MP2 intermediate tensors with zero internode communication, except for a single O(N^2) asynchronous broadcast, and a distributed-memory algorithm for the O(N^5) energy reduction step, capable of sustaining near-peak performance on clusters with several hundred GPUs. Comparative analysis shows our implementation outperforms state-of-the-art quantum chemistry software by over 3.5 times in speed while achieving an 8-fold reduction in computational power consumption. Benchmarking on the Perlmutter supercomputer, our algorithm achieves 11.8 PFLOP/s (83% of peak performance) when performing the RI-MP2 energy calculation on a 314-water cluster with 7850 primary and 30,144 auxiliary basis functions in 4 min on 180 nodes and 720 A100 GPUs. This performance represents a substantial improvement over traditional CPU-based methods, demonstrating significant time-to-solution and power consumption benefits of leveraging modern GPU-accelerated computing environments for quantum chemistry calculations.
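For readers unfamiliar with the energy reduction step named in the abstract, the following is a minimal single-process NumPy sketch of the textbook closed-shell RI-MP2 correlation energy. It is not the authors' distributed GPU algorithm. It assumes a precomputed three-index intermediate B with (ia|jb) ≈ sum_P B[P,i,a] B[P,j,b], and evaluates E = sum_ijab (ia|jb) [2 (ia|jb) - (ib|ja)] / (e_i + e_j - e_a - e_b). The function name `rimp2_energy` and its argument names are hypothetical, chosen only for illustration.

```python
import numpy as np


def rimp2_energy(B, eps_occ, eps_virt):
    """Textbook closed-shell RI-MP2 correlation energy (illustrative sketch).

    B        : array (n_aux, n_occ, n_virt), assumed RI intermediate such that
               (ia|jb) ~= sum_P B[P, i, a] * B[P, j, b]
    eps_occ  : array (n_occ,), occupied orbital energies
    eps_virt : array (n_virt,), virtual orbital energies
    """
    n_aux, n_occ, n_virt = B.shape
    energy = 0.0
    for i in range(n_occ):
        for j in range(n_occ):
            # One matrix product per (i, j) pair: V[a, b] = (ia|jb).
            # n_occ^2 products of cost n_virt^2 * n_aux give the O(N^5) scaling.
            V = B[:, i, :].T @ B[:, j, :]
            # Energy denominator e_i + e_j - e_a - e_b for all (a, b).
            D = eps_occ[i] + eps_occ[j] - eps_virt[:, None] - eps_virt[None, :]
            # V.T[a, b] = (ib|ja), the exchange-like term.
            energy += np.sum(V * (2.0 * V - V.T) / D)
    return energy
```

Each (i, j) pair reduces to one dense matrix product followed by an elementwise reduction, which is why this step maps naturally onto GPU matrix-multiply hardware. How the pairs are distributed across nodes and GPUs is the subject of the paper and is not reproduced here.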
Author Snowdon, Calum
Barca, Giuseppe M J
Author_xml – sequence: 1
  givenname: Calum
  surname: Snowdon
  fullname: Snowdon, Calum
  organization: School of Computing, Australian National University, Canberra 2600, Australia
– sequence: 2
  givenname: Giuseppe M J
  orcidid: 0000-0001-5109-4279
  surname: Barca
  fullname: Barca, Giuseppe M J
  organization: School of Computing and Information Systems, University of Melbourne, Melbourne 3010, Australia
BackLink https://www.ncbi.nlm.nih.gov/pubmed/39422609$$D View this record in MEDLINE/PubMed
ContentType Journal Article
DBID NPM
7X8
DOI 10.1021/acs.jctc.4c00814
DatabaseName PubMed
MEDLINE - Academic
DatabaseTitle PubMed
MEDLINE - Academic
DatabaseTitleList PubMed
MEDLINE - Academic
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: 7X8
  name: MEDLINE - Academic
  url: https://search.proquest.com/medline
  sourceTypes: Aggregation Database
DeliveryMethod no_fulltext_linktorsrc
Discipline Chemistry
EISSN 1549-9626
ExternalDocumentID 39422609
Genre Journal Article
ISICitedReferencesCount 1
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001338407600001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1549-9626
IngestDate Fri Jul 11 12:26:25 EDT 2025
Mon Jul 21 05:54:48 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 21
Language English
LinkModel DirectLink
ORCID 0000-0001-5109-4279
PMID 39422609
PQID 3117998362
PQPubID 23479
ParticipantIDs proquest_miscellaneous_3117998362
pubmed_primary_39422609
PublicationCentury 2000
PublicationDate 2024-11-12
PublicationDateYYYYMMDD 2024-11-12
PublicationDate_xml – month: 11
  year: 2024
  text: 2024-11-12
  day: 12
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle Journal of chemical theory and computation
PublicationTitleAlternate J Chem Theory Comput
PublicationYear 2024
SourceID proquest
pubmed
SourceType Aggregation Database
Index Database
StartPage 9394
Title An Efficient RI-MP2 Algorithm for Distributed Many-GPU Architectures
URI https://www.ncbi.nlm.nih.gov/pubmed/39422609
https://www.proquest.com/docview/3117998362
Volume 20
WOSCitedRecordID wos001338407600001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D