MPI detach — Towards automatic asynchronous local completion
| Published in: | Parallel computing Vol. 109; p. 102859 |
|---|---|
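| Main Authors: | Protze, Joachim; Hermanns, Marc-André; Müller, Matthias S.; Nguyen, Van Man; Jaeger, Julien; Saillard, Emmanuelle; Carribault, Patrick; Barthou, Denis |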
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 01.03.2022 |
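| Subjects: | Asynchronous communication; OpenMP tasking; Hybrid parallelism; Message Passing Interface; Code transformation; Static analysis |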
| ISSN: | 0167-8191, 1872-7336 |
| Abstract | In large-scale parallel computing, waiting time caused by network latency, synchronization, and load imbalance is the primary obstacle to high parallel efficiency. A common approach to hiding latency behind computation is non-blocking communication. In the presence of a consistent load imbalance, synchronization cost is just the visible symptom of that imbalance. Tasking approaches such as OpenMP, TBB, OmpSs, or C++20 coroutines promise to expose a higher degree of concurrency, which can be distributed across the available execution units and significantly improve load balance. The non-blocking functionality currently available in MPI does not integrate seamlessly into such task-based parallelization. In this work, we present a slim extension of the MPI interface that allows seamless integration of non-blocking communication with the existing concepts of asynchronous execution in OpenMP and C++. With our concept, task dependency graphs for asynchronous execution can span the full distributed-memory application. Furthermore, we investigate the compile-time analysis necessary to transform an application that uses blocking MPI communication into one that integrates OpenMP tasks with our proposed MPI interface extension.
• MPI interface extensions to transfer request completion back to the MPI library. • Callback-driven notification of asynchronous completion back to the application. • Prototype implementation of the interface independent of the MPI implementation. • Integration of MPI communication into OpenMP task programming. • Compile-time analysis to convert blocking communication into non-blocking communication. |
|---|---|
| ArticleNumber | 102859 |
| Author | Protze, Joachim; Müller, Matthias S.; Barthou, Denis; Hermanns, Marc-André; Carribault, Patrick; Saillard, Emmanuelle; Nguyen, Van Man; Jaeger, Julien |
| Author_xml | – sequence: 1 givenname: Joachim orcidid: 0000-0003-0640-8966 surname: Protze fullname: Protze, Joachim email: protze@itc.rwth-aachen.de organization: RWTH Aachen University, ITC, Seffenter Weg 23, Aachen, 52074, Germany – sequence: 2 givenname: Marc-André surname: Hermanns fullname: Hermanns, Marc-André email: hermanns@itc.rwth-aachen.de organization: RWTH Aachen University, ITC, Seffenter Weg 23, Aachen, 52074, Germany – sequence: 3 givenname: Matthias S. surname: Müller fullname: Müller, Matthias S. email: mueller@itc.rwth-aachen.de organization: RWTH Aachen University, ITC, Seffenter Weg 23, Aachen, 52074, Germany – sequence: 4 givenname: Van Man surname: Nguyen fullname: Nguyen, Van Man email: van-man.nguyen.ocre@cea.fr organization: CEA, DAM, DIF, Arpajon, F-91297, France – sequence: 5 givenname: Julien orcidid: 0000-0003-0084-1574 surname: Jaeger fullname: Jaeger, Julien email: julien.jaeger@cea.fr organization: CEA, DAM, DIF, Arpajon, F-91297, France – sequence: 6 givenname: Emmanuelle surname: Saillard fullname: Saillard, Emmanuelle email: emmanuelle.saillard@inria.fr organization: Inria, 200 avenue de la vieille tour, Talence, 33400, France – sequence: 7 givenname: Patrick surname: Carribault fullname: Carribault, Patrick email: patrick.carribault@cea.fr organization: CEA, DAM, DIF, Arpajon, F-91297, France – sequence: 8 givenname: Denis surname: Barthou fullname: Barthou, Denis email: denis.barthou@inria.fr organization: Inria, 200 avenue de la vieille tour, Talence, 33400, France |
| BackLink | https://cea.hal.science/cea-03537990$$DView record in HAL |
| Cites_doi | 10.1177/1094342014548772 10.1016/j.jpdc.2019.12.005 10.1145/3127024.3127033 |
| ContentType | Journal Article |
| Copyright | 2021 Elsevier B.V. Distributed under a Creative Commons Attribution 4.0 International License |
| Copyright_xml | – notice: 2021 Elsevier B.V. – notice: Distributed under a Creative Commons Attribution 4.0 International License |
| DBID | AAYXX CITATION 1XC VOOES |
| DOI | 10.1016/j.parco.2021.102859 |
| DatabaseName | CrossRef Hyper Article en Ligne (HAL) Hyper Article en Ligne (HAL) (Open Access) |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 1872-7336 |
| ExternalDocumentID | oai:HAL:cea-03537990v1 10_1016_j_parco_2021_102859 S0167819121001022 |
| ISICitedReferencesCount | 3 |
| ISSN | 0167-8191 |
| IngestDate | Tue Oct 14 20:44:30 EDT 2025 Sat Nov 29 07:22:52 EST 2025 Fri Feb 23 02:41:56 EST 2024 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Asynchronous communication; OpenMP tasking; Hybrid parallelism; Message Passing Interface; Code transformation; Static analysis |
| Language | English |
| License | Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0 |
| LinkModel | OpenURL |
| ORCID | 0000-0003-0084-1574 0000-0003-0640-8966 |
| OpenAccessLink | https://cea.hal.science/cea-03537990 |
| ParticipantIDs | hal_primary_oai_HAL_cea_03537990v1 crossref_primary_10_1016_j_parco_2021_102859 elsevier_sciencedirect_doi_10_1016_j_parco_2021_102859 |
| PublicationCentury | 2000 |
| PublicationDate | March 2022 2022-03-00 2022-03 |
| PublicationDateYYYYMMDD | 2022-03-01 |
| PublicationDate_xml | – month: 03 year: 2022 text: March 2022 |
| PublicationDecade | 2020 |
| PublicationTitle | Parallel computing |
| PublicationYear | 2022 |
| Publisher | Elsevier B.V Elsevier |
| Publisher_xml | – name: Elsevier B.V – name: Elsevier |
| SSID | ssj0006480 |
| Score | 2.3321128 |
| SourceID | hal crossref elsevier |
| SourceType | Open Access Repository Index Database Publisher |
| StartPage | 102859 |
| SubjectTerms | Asynchronous communication; Code transformation; Computer Science; Distributed, Parallel, and Cluster Computing; Hybrid parallelism; Message Passing Interface; OpenMP tasking; Static analysis |
| Title | MPI detach — Towards automatic asynchronous local completion |
| URI | https://dx.doi.org/10.1016/j.parco.2021.102859 https://cea.hal.science/cea-03537990 |
| Volume | 109 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVESC databaseName: Elsevier SD Freedom Collection Journals 2021 customDbUrl: eissn: 1872-7336 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0006480 issn: 0167-8191 databaseCode: AIEXJ dateStart: 19950101 isFulltext: true titleUrlDefault: https://www.sciencedirect.com providerName: Elsevier |
| linkProvider | Elsevier |
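
The callback-driven completion summarized in the abstract can be illustrated with a small, library-free sketch. It is an illustration under stated assumptions, not the interface proposed in the article: `Request`, `ProgressEngine`, `detach`, `progress`, and `demo` are hypothetical names, and no MPI or OpenMP calls are involved. The idea shown is that the application hands a pending request back to a library-like engine together with a callback, and the engine invokes that callback once the request has completed, so the application never has to wait on or test the request itself.

```python
import threading
from typing import Any, Callable, List, Tuple


class Request:
    """Hypothetical stand-in for a non-blocking communication request."""

    def __init__(self) -> None:
        self._done = threading.Event()

    def complete(self) -> None:
        # In a real library the network layer would mark completion.
        self._done.set()

    def test(self) -> bool:
        return self._done.is_set()


class ProgressEngine:
    """Hypothetical engine that owns detached requests and notifies via callbacks."""

    def __init__(self) -> None:
        self._pending: List[Tuple[Request, Callable[[Any], None], Any]] = []
        self._lock = threading.Lock()

    def detach(self, request: Request, callback: Callable[[Any], None], data: Any) -> None:
        # The caller gives up responsibility for completing the request.
        with self._lock:
            self._pending.append((request, callback, data))

    def progress(self) -> int:
        # Partition the pending list once, then run callbacks outside the lock.
        ready, waiting = [], []
        with self._lock:
            for entry in self._pending:
                (ready if entry[0].test() else waiting).append(entry)
            self._pending = waiting
        for _, callback, data in ready:
            callback(data)
        return len(ready)


def demo() -> List[str]:
    engine = ProgressEngine()
    log: List[str] = []
    request = Request()
    engine.detach(request, log.append, "halo exchange done")
    engine.progress()   # request still pending, so no callback runs
    request.complete()  # simulate the communication finishing
    engine.progress()   # the callback runs now
    return log
```

In a tasking runtime, the callback would typically release dependent work instead of appending to a list. With OpenMP 5.0, for example, a callback could fulfill the event of a task created with the `detach` clause. That pairing is offered here as a plausible use, not as a description of the authors' implementation.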