Argobots: A Lightweight Low-Level Threading and Tasking Framework.

Detailed bibliography
Title: Argobots: A Lightweight Low-Level Threading and Tasking Framework.
Authors: Seo, Sangmin; Amer, Abdelhalim; Balaji, Pavan; Bordage, Cyril; Bosilca, George; Brooks, Alex; Carns, Philip; Castello, Adrian; Genet, Damien; Herault, Thomas; Iwasaki, Shintaro; Jindal, Prateek; Kale, Laxmikant V.; Krishnamoorthy, Sriram; Lifflander, Jonathan; Lu, Huiwei; Meneses, Esteban; Snir, Marc; Sun, Yanhua; Taura, Kenjiro
Source: IEEE Transactions on Parallel & Distributed Systems. Mar 2018, Vol. 29 Issue 3, p512-526. 15p.
Subjects: *PORTABLE computers, *COMPUTER programming, THREADS (Computer programs), SYNCHRONIZATION, INTERFERENCE (Telecommunication)
Abstract: In the past few decades, a number of user-level threading and tasking models have been proposed in the literature to address the shortcomings of OS-level threads, primarily with respect to cost and flexibility. Current state-of-the-art user-level threading and tasking models, however, either are too specific to applications or architectures or are not as powerful or flexible. In this paper, we present Argobots, a lightweight, low-level threading and tasking framework that is designed as a portable and performant substrate for high-level programming models or runtime systems. Argobots offers a carefully designed execution model that balances generality of functionality with providing a rich set of controls to allow specialization by end users or high-level programming models. We describe the design, implementation, and performance characterization of Argobots and present integrations with three high-level models: OpenMP, MPI, and colocated I/O services. Evaluations show that (1) Argobots, while providing richer capabilities, is competitive with existing simpler generic threading runtimes; (2) our OpenMP runtime offers more efficient interoperability capabilities than production OpenMP runtimes do; (3) when MPI interoperates with Argobots instead of Pthreads, it enjoys reduced synchronization costs and better latency-hiding capabilities; and (4) I/O services with Argobots reduce interference with colocated applications while achieving performance competitive with that of a Pthreads approach. [ABSTRACT FROM AUTHOR]
Copyright of IEEE Transactions on Parallel & Distributed Systems is the property of IEEE and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Business Source Index
ISSN: 1045-9219
DOI: 10.1109/TPDS.2017.2766062