Porting ONETEP to graphical processing unit-based coprocessors. 1. FFT box operations
| Published in: | Journal of Computational Chemistry, Vol. 34, No. 28, pp. 2446-2459 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | United States; Blackwell Publishing Ltd; 30 October 2013; Wiley Subscription Services, Inc |
| Subjects: | |
| ISSN: | 0192-8651, 1096-987X |
| Abstract: | We present the first graphical processing unit (GPU) coprocessor-enabled version of the Order-N Electronic Total Energy Package (ONETEP) code for linear-scaling first principles quantum mechanical calculations on materials. This work focuses on porting to the GPU the parts of the code that involve atom-localized fast Fourier transform (FFT) operations. These are among the most computationally intensive parts of the code and are used in core algorithms such as the calculation of the charge density, the local potential integrals, the kinetic energy integrals, and the nonorthogonal generalized Wannier function gradient. We found that directly porting the isolated FFT operations provided no benefit. Instead, it was necessary to tailor the port to each of the aforementioned algorithms to optimize data transfer to and from the GPU. A detailed discussion of the methods used and tests of the resulting performance are presented, which show that individual steps in the relevant algorithms are accelerated by a significant amount. However, the transfer of data between the GPU and the host machine is a significant bottleneck in the reported version of the code. In addition, an initial investigation into a dynamic precision scheme for the ONETEP energy calculation has been performed to take advantage of the enhanced single-precision capabilities of GPUs. The methods used here result in no disruption to the existing code base. Furthermore, as the developments reported here concern the core algorithms, they will benefit the full range of ONETEP functionality. Our use of a directive-based programming model ensures portability to other forms of coprocessors and will allow this work to form the basis of future developments to the code designed to support emerging high-performance computing platforms. Copyright © 2013 Wiley Periodicals, Inc. The Order-N Electronic Total Energy Package (ONETEP) linear-scaling quantum chemistry code is ported to GPU coprocessor-based architectures in a manner that is highly portable, while maintaining the full functionality of the code. |
|---|---|
| Bibliography: | Funding: Engineering and Physical Sciences Research Council, No. EP/I006613/1; Royal Society, University Research Fellowship. Identifiers: ark:/67375/WNG-2F1QP0MM-C; istex:9A8C26009D2B4FB2ED4982BCA6ED853C9766F436; ArticleID:JCC23410 |
| DOI: | 10.1002/jcc.23410 |
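
The abstract's observation that offloading the isolated FFT operations gave no benefit, while tailoring the port to each algorithm did, can be illustrated with a simple per-box cost model. The sketch below is only an arithmetic illustration: the `Costs` class, the three functions, and every timing value are hypothetical and are not taken from the article or from ONETEP, which is not written in Python and uses a directive-based model.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Hypothetical per-FFT-box timings in microseconds (illustrative only)."""
    cpu_fft: float = 100.0        # one FFT on the host
    gpu_fft: float = 10.0         # the same FFT on the device
    cpu_other: float = 150.0      # surrounding per-box work on the host
    gpu_other: float = 20.0       # the same surrounding work on the device
    box_transfer: float = 60.0    # one host<->device copy of a full FFT box
    small_transfer: float = 15.0  # one copy of compact input or output data


def host_only(c: Costs, n_boxes: int) -> float:
    """Everything runs on the host; no transfers."""
    return n_boxes * (c.cpu_fft + c.cpu_other)


def isolated_fft_offload(c: Costs, n_boxes: int) -> float:
    """Only the FFT runs on the device, so each full box is copied in and out."""
    return n_boxes * (2 * c.box_transfer + c.gpu_fft + c.cpu_other)


def tailored_offload(c: Costs, n_boxes: int) -> float:
    """The whole per-box step runs on the device; only compact data is copied."""
    return n_boxes * (2 * c.small_transfer + c.gpu_fft + c.gpu_other)


if __name__ == "__main__":
    costs = Costs()
    for name, model in [
        ("host only", host_only),
        ("isolated FFT offload", isolated_fft_offload),
        ("tailored offload", tailored_offload),
    ]:
        print(f"{name:>22}: {model(costs, 1000) / 1e3:.1f} ms")
```

Under these assumed numbers the isolated offload is slower than the host-only run because two full-box copies cost more than the FFT time saved, whereas keeping the surrounding work on the device reduces both compute and transfer time. Different hardware or box sizes would change the balance, so the values should be replaced with measured ones before drawing conclusions.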