Gravel fine-grain GPU-initiated network messages
Distributed systems incorporate GPUs because they provide massive parallelism in an energy-efficient manner. Unfortunately, existing programming models make it difficult to route a GPU-initiated network message. The traditional coprocessor model forces programmers to manually route messages through...
Saved in:
| Published in: | International Conference for High Performance Computing, Networking, Storage and Analysis (Online) pp. 1 - 12 |
|---|---|
| Main Authors: | , , , , , |
| Format: | Conference Proceeding |
| Language: | English |
| Published: |
New York, NY, USA
ACM
12.11.2017
|
| Series: | ACM Conferences |
| Subjects: | |
| ISBN: | 9781450351140, 145035114X |
| ISSN: | 2167-4337 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Distributed systems incorporate GPUs because they provide massive parallelism in an energy-efficient manner. Unfortunately, existing programming models make it difficult to route a GPU-initiated network message. The traditional coprocessor model forces programmers to manually route messages through the host CPU. Other models allow GPU-initiated communication, but are inefficient for small messages.
To enable fine-grain PGAS-style communication between threads executing on different GPUs, we introduce Gravel. GPU-initiated messages are offloaded through a GPU-efficient concurrent queue to an aggregator (implemented with CPU threads), which combines messages targeting to the same destination. Gravel leverages diverged work-group-level semantics to amortize synchronization across the GPU's data-parallel lanes.
Using Gravel, we can distribute six applications, each with frequent small messages, across a cluster of eight GPU-accelerated nodes. Compared to one node, these applications run 5.3x faster, on average. Furthermore, we show Gravel is more programmable and usually performs better than prior GPU networking models. |
|---|---|
| ISBN: | 9781450351140 145035114X |
| ISSN: | 2167-4337 |
| DOI: | 10.1145/3126908.3126914 |

