Exascale Deep Learning for Climate Analytics
We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit system...
| Published in: | SC18: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 649-660 |
|---|---|
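| Main authors: | Kurth, Thorsten; Treichler, Sean; Romero, Joshua; Mudigonda, Mayur; Luehr, Nathan; Phillips, Everett; Mahesh, Ankur; Matheson, Michael; Deslippe, Jack; Fatica, Massimiliano; Prabhat, Prabhat; Houston, Michael |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 1 November 2018 |
| Subjects: | Computational modeling; Computer architecture; Convolutional codes; Deep learning; Meteorology; Technological innovation; Training |
| Online access: | Full text |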
| Abstract | We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems. The Tiramisu network scales to 5300 P100 GPUs with a sustained throughput of 21.0 PF/s and parallel efficiency of 79.0%. DeepLabv3+ scales up to 27360 V100 GPUs with a sustained throughput of 325.8 PF/s and a parallel efficiency of 90.7% in single precision. By taking advantage of the FP16 Tensor Cores, a half-precision version of the DeepLabv3+ network achieves a peak and sustained throughput of 1.13 EF/s and 999.0 PF/s respectively. |
|---|---|
| Author | Romero, Joshua; Fatica, Massimiliano; Deslippe, Jack; Prabhat, Prabhat; Kurth, Thorsten; Phillips, Everett; Treichler, Sean; Mudigonda, Mayur; Mahesh, Ankur; Houston, Michael; Luehr, Nathan; Matheson, Michael |
| Author_xml | 1. Kurth, Thorsten; 2. Treichler, Sean; 3. Romero, Joshua; 4. Mudigonda, Mayur; 5. Luehr, Nathan; 6. Phillips, Everett; 7. Mahesh, Ankur; 8. Matheson, Michael; 9. Deslippe, Jack; 10. Fatica, Massimiliano; 11. Prabhat, Prabhat; 12. Houston, Michael |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/SC.2018.00054 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 1538683849; 9781538683842 |
| EndPage | 660 |
| ExternalDocumentID | 8665799 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IL CBEJK RIE RIL |
| ID | FETCH-LOGICAL-a158t-783dfbda612a31b99936f3bd12fb81233b940c92b901b196eba92a8b170764e63 |
| IEDL.DBID | RIE |
| IngestDate | Thu Jun 29 18:39:01 EDT 2023 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a158t-783dfbda612a31b99936f3bd12fb81233b940c92b901b196eba92a8b170764e63 |
| PageCount | 12 |
| ParticipantIDs | ieee_primary_8665799 |
| PublicationCentury | 2000 |
| PublicationDate | 2018-Nov |
| PublicationDateYYYYMMDD | 2018-11-01 |
| PublicationDate_xml | – month: 11 year: 2018 text: 2018-Nov |
| PublicationDecade | 2010 |
| PublicationTitle | SC18: International Conference for High Performance Computing, Networking, Storage and Analysis |
| PublicationTitleAbbrev | SC |
| PublicationYear | 2018 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| Score | 2.239504 |
| Snippet | We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software... |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 649 |
| SubjectTerms | Computational modeling; Computer architecture; Convolutional codes; Deep learning; Meteorology; Technological innovation; Training |
| Title | Exascale Deep Learning for Climate Analytics |
| URI | https://ieeexplore.ieee.org/document/8665799 |
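
The throughput and efficiency figures quoted in the abstract can be put on a per-GPU footing with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "parallel efficiency" means sustained throughput divided by ideal linear-scaling throughput, a common definition that this record does not confirm. It uses only the numbers stated in the abstract.

```python
# Figures quoted in the abstract; throughput is in PF/s.
RUNS = {
    "Tiramisu, P100": {"gpus": 5300, "sustained_pf": 21.0, "efficiency": 0.790},
    "DeepLabv3+, V100, single precision": {
        "gpus": 27360,
        "sustained_pf": 325.8,
        "efficiency": 0.907,
    },
}

# Half-precision DeepLabv3+ figures from the abstract: 1.13 EF/s peak, 999.0 PF/s sustained.
FP16_PEAK_PF = 1130.0
FP16_SUSTAINED_PF = 999.0


def per_gpu_tflops(sustained_pf: float, gpus: int) -> float:
    """Average sustained throughput per GPU in TF/s (1 PF/s = 1000 TF/s)."""
    return sustained_pf * 1000.0 / gpus


def ideal_throughput_pf(sustained_pf: float, efficiency: float) -> float:
    """Throughput under perfect linear scaling.

    Assumes efficiency = sustained / ideal, which is an assumption
    about how the paper defines parallel efficiency.
    """
    return sustained_pf / efficiency


def sustained_to_peak(sustained_pf: float, peak_pf: float) -> float:
    """Fraction of the peak throughput that was sustained."""
    return sustained_pf / peak_pf


if __name__ == "__main__":
    for name, run in RUNS.items():
        tf = per_gpu_tflops(run["sustained_pf"], run["gpus"])
        ideal = ideal_throughput_pf(run["sustained_pf"], run["efficiency"])
        print(f"{name}: {tf:.2f} TF/s per GPU, ideal {ideal:.1f} PF/s")
    ratio = sustained_to_peak(FP16_SUSTAINED_PF, FP16_PEAK_PF)
    print(f"DeepLabv3+ FP16 sustained/peak: {ratio:.1%}")
```

The abstract does not state the GPU count for the half-precision run, so the sketch reports only the sustained-to-peak ratio for it rather than a per-GPU figure.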