HeTM: Transactional Memory for Heterogeneous Systems

Bibliographic details
Published in:	Proceedings / International Conference on Parallel Architectures and Compilation Techniques, pp. 232-244
Main authors:	Castro, Daniel; Romano, Paolo; Ilic, Aleksandar; Khan, Amin M.
Format:	Conference paper
Language:	English
Published:	IEEE, 1 September 2019
Subjects:	Computer architecture; computing; CPU; GPU; Graphics processing units; heterogeneous; memory; Performance evaluation; Programming; Synchronization; system; Task analysis; transaction
ISSN:	2641-7936

Description
Summary:	Modern heterogeneous computing architectures, which couple multi-core CPUs with discrete many-core GPUs (or other specialized hardware accelerators), enable unprecedented levels of peak performance and energy efficiency. However, developing applications that can take full advantage of the potential of heterogeneous systems is a notoriously hard task. This work takes a step towards reducing the complexity of programming heterogeneous systems by introducing the abstraction of Heterogeneous Transactional Memory (HeTM). HeTM provides programmers with the illusion of a single memory region, shared among the CPUs and the (discrete) GPU(s) of a heterogeneous system, with support for atomic transactions. Besides introducing the abstract semantics and programming model of HeTM, we present the design and evaluation of a concrete implementation of the proposed abstraction, referred to herein as Speculative HeTM (SHeTM). SHeTM uses a novel design that leverages speculative techniques to hide the inherently large communication latency between CPUs and discrete GPUs and to minimize inter-device synchronization overhead. We demonstrate the efficiency of SHeTM via an extensive quantitative study based on both synthetic benchmarks and a popular object caching system.
DOI:	10.1109/PACT.2019.00026