SPGPU: Spatially Programmed GPU

Bibliographic Details
Published in: IEEE Computer Architecture Letters, Vol. 23, No. 2, pp. 223-226
Main Authors: Zhu, Shizhuo; Shkirko, Illia; Levinson, Jacob; Wang, Zhengrong; Nowatzki, Tony
Format: Journal Article
Language: English
Published: IEEE, 1 July 2024
ISSN: 1556-6056, 1556-6064
Description
Summary: Communication is a critical bottleneck for GPUs, manifesting as energy and performance overheads due to network-on-chip (NoC) delay and congestion. While many algorithms exhibit locality among thread blocks and accessed data, modern GPUs lack the interface to exploit this locality: GPU thread blocks are mapped to cores obliviously. In this work, we explore a simple extension to the conventional GPU programming interface to enable control over the spatial placement of data and threads, yielding new opportunities for aggressive locality optimizations within a GPU kernel. Across 7 workloads that can take advantage of these optimizations, on a 32-SM (or 128-SM) GPU, we achieve a 1.28× (1.54×) speedup and a 35% (44%) reduction in NoC traffic compared to baseline non-spatial GPUs.
DOI: 10.1109/LCA.2024.3499339