Accelerating Multi-Process Communication for Parallel 3-D FFT
| Published in: | 2021 Workshop on Exascale MPI (ExaMPI) pp. 46 - 53 |
|---|---|
| Main Authors: | Ayala, Alan; Tomov, Stan; Stoyanov, Miroslav; Haidar, Azzam; Dongarra, Jack |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.11.2021 |
| Abstract | Today, the largest and most powerful supercomputers in the world are built on heterogeneous platforms, and using the combined power of multi-core CPUs and GPUs has had a great impact on accelerating large-scale applications. However, on these architectures, parallel algorithms such as the Fast Fourier Transform (FFT) find that inter-processor communication becomes a bottleneck and limits their scalability. In this paper, we present techniques for reducing the multi-process communication cost during the computation of FFTs, considering hybrid network connections such as those expected on upcoming exascale machines. Among our techniques, we present algorithmic tuning, making use of phase diagrams; parametric tuning, using different FFT settings; and MPI distribution tuning based on FFT size and the computational resources available. We present several experiments performed on the Summit supercomputer at Oak Ridge National Laboratory, using up to 40,960 IBM Power9 cores and 6,144 NVIDIA V100 GPUs. |
|---|---|
| Author | Haidar, Azzam; Ayala, Alan; Dongarra, Jack; Stoyanov, Miroslav; Tomov, Stan |
| Author_xml | – sequence: 1 givenname: Alan surname: Ayala fullname: Ayala, Alan organization: University of Tennessee,Knoxville,TN,USA – sequence: 2 givenname: Stan surname: Tomov fullname: Tomov, Stan organization: University of Tennessee,Knoxville,TN,USA – sequence: 3 givenname: Miroslav surname: Stoyanov fullname: Stoyanov, Miroslav organization: Oak Ridge National Laboratory,Oak Ridge,TN,USA – sequence: 4 givenname: Azzam surname: Haidar fullname: Haidar, Azzam organization: Nvidia Corporation,Santa Clara,CA,USA – sequence: 5 givenname: Jack surname: Dongarra fullname: Dongarra, Jack organization: University of Tennessee,Knoxville,TN,USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/ExaMPI54564.2021.00011 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9781665411080 1665411082 |
| EndPage | 53 |
| ExternalDocumentID | 9652837 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: Office of Science and the National Nuclear Security Administration funderid: 10.13039/100006168 |
| GroupedDBID | 6IE 6IL CBEJK RIE RIL |
| ID | FETCH-LOGICAL-i203t-287d9290ed176e0a2f18cf61fc51eac9d77cb8ca76c6d5d6d65129cf667f139f3 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 4 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000758726600006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Thu Jun 29 18:37:46 EDT 2023 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i203t-287d9290ed176e0a2f18cf61fc51eac9d77cb8ca76c6d5d6d65129cf667f139f3 |
| PageCount | 8 |
| ParticipantIDs | ieee_primary_9652837 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-Nov. |
| PublicationDateYYYYMMDD | 2021-11-01 |
| PublicationDate_xml | – month: 11 year: 2021 text: 2021-Nov. |
| PublicationDecade | 2020 |
| PublicationTitle | 2021 Workshop on Exascale MPI (ExaMPI) |
| PublicationTitleAbbrev | EXAMPI |
| PublicationYear | 2021 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| Score | 1.8030696 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 46 |
| SubjectTerms | Exascale; FFT; Fast Fourier transforms; Graphics processing units; Hybrid systems; Libraries; MPI tuning; Scalability; Slabs; Software; Supercomputers |
| Title | Accelerating Multi-Process Communication for Parallel 3-D FFT |
| URI | https://ieeexplore.ieee.org/document/9652837 |
| WOSCitedRecordID | wos000758726600006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |