Massively parallel algorithm and implementation of RI-MP2 energy calculation for peta-scale many-core supercomputers

Bibliographic Details
Published in:	Journal of Computational Chemistry, Vol. 37, No. 30, pp. 2623-2633
Main Authors:	Katouda, Michio; Naruse, Akira; Hirano, Yukihiko; Nakajima, Takahito
Format:	Journal Article
Language:	English
Published:	United States: Blackwell Publishing Ltd; Wiley Subscription Services, Inc, 15 November 2016
Subjects:	Algorithms; Central processing units; CPUs; electron correlation theory; Energy; GPGPU; K computer; massively parallel algorithm; NTChem; second-order Møller-Plesset perturbation theory; Supercomputers; TSUBAME 2.5
ISSN:	0192-8651, 1096-987X

Description
Summary:	A new parallel algorithm and its implementation for the RI‐MP2 energy calculation on peta‐flop‐class many‐core supercomputers are presented. Two improvements over the previous algorithm (J. Chem. Theory Comput. 2013, 9, 5373) are introduced: (1) a dual‐level hierarchical parallelization scheme that enables the use of more than 10,000 Message Passing Interface (MPI) processes and (2) a new data communication scheme that reduces network communication overhead. A multi‐node and multi‐GPU implementation of the present algorithm is presented for calculations on a central processing unit (CPU)/graphics processing unit (GPU) hybrid supercomputer. Benchmark results of the new algorithm and its implementation on the K computer (CPU clustering system) and TSUBAME 2.5 (CPU/GPU hybrid system) demonstrate high parallel efficiency. A peak performance of 3.1 PFLOPS is attained using 80,199 nodes of the K computer. The peak performance of the multi‐node and multi‐GPU implementation is 514 TFLOPS using 1349 nodes and 4047 GPUs of TSUBAME 2.5. © 2016 Wiley Periodicals, Inc.
Bibliography:	ArticleID: JCC24491; Next-Generation Supercomputer project (the K computer project) by MEXT, Japan; TSUBAME grand challenge program, category A by Tokyo Institute of Technology; ark:/67375/WNG-C2W2TB8R-Z; istex:2E001E55C6A54F0B72DC68896DA63DCAB7383383
DOI:	10.1002/jcc.24491
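
For readers unfamiliar with the quantity named in the summary, the following is a minimal serial sketch of the textbook closed-shell RI-MP2 correlation energy, not the parallel algorithm of the article. It assumes a pre-computed three-index array `B[P, i, a]` such that the two-electron integral (ia|jb) is approximated by the sum over P of `B[P, i, a] * B[P, j, b]`. The function name `rimp2_energy`, the array layout, and the random test data are all hypothetical choices made for illustration. The loop over occupied pairs (i, j) is written out explicitly because each pair is an independent unit of work; how the article distributes work across MPI processes and GPUs is not reproduced here.

```python
import numpy as np


def rimp2_energy(B, e_occ, e_vir):
    """Closed-shell RI-MP2 correlation energy (illustrative sketch).

    B     : array (naux, nocc, nvir), assumed RI three-index intermediates
    e_occ : array (nocc,), occupied orbital energies
    e_vir : array (nvir,), virtual orbital energies
    """
    nocc = e_occ.shape[0]
    energy = 0.0
    for i in range(nocc):
        for j in range(i + 1):
            # K[a, b] approximates (ia|jb) for this fixed pair (i, j).
            K = B[:, i, :].T @ B[:, j, :]
            # Denominator e_i + e_j - e_a - e_b for all (a, b).
            denom = e_occ[i] + e_occ[j] - e_vir[:, None] - e_vir[None, :]
            # K.T[a, b] is the exchange-type integral (ib|ja).
            pair = np.sum(K * (2.0 * K - K.T) / denom)
            # Pairs (i, j) and (j, i) contribute equally, so count i != j twice.
            energy += pair if i == j else 2.0 * pair
    return energy


if __name__ == "__main__":
    # Random, physically meaningless data, only to exercise the function.
    rng = np.random.default_rng(0)
    B = rng.standard_normal((7, 3, 4))
    e_occ = np.array([-1.5, -1.0, -0.5])
    e_vir = np.array([0.5, 1.0, 1.5, 2.0])
    print(rimp2_energy(B, e_occ, e_vir))
```

In this form the dominant cost is the matrix product that builds `K` for each occupied pair, which is the kind of dense linear algebra that maps well onto many-core CPUs and GPUs.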