BCSR on GPU: A Way Forward Extreme-scale Graph Processing on Accelerator-enabled Frontier Supercomputer
| Published in: | SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 280-289 |
|---|---|
| Main authors: | , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 17 November 2024 |
| Summary: | Handling large graphs in a distributed environment requires effective partitioning across processors and efficient management of local partitions. In 2D partitioning, local graphs often become too sparse, making memory-efficient data structures crucial. The Compressed Sparse Row (CSR) format wastes space on such sparse graphs, where more than 83% of vertices have no edges. This study explores bit-CSR (BCSR), a modified CSR representation, on GPUs to reduce memory usage in graph computations. We achieved 16.67% memory savings on a sparse rmat dataset with 268 million vertices and 357 million edges, without performance degradation, supported by both theoretical and experimental storage savings of 33%. However, we observed a 1.7× slowdown in degree lookup times due to bitwise operations on AMD CPUs. This analysis highlights the potential of BCSR on GPUs for improving Graph500 benchmark performance on GPU-accelerated systems, such as the Frontier supercomputer. |
|---|---|
| DOI: | 10.1109/SCW63240.2024.00044 |
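
This record does not describe the BCSR layout itself, so the following is only a minimal sketch of one generic bitmap-compressed CSR scheme, not the authors' implementation. It assumes a bitmap with one bit per vertex marking non-empty rows, a row-offset array that stores entries only for those rows, and a per-word rank table; the names `build_csr`, `build_bitmap_csr`, `degree`, and `WORD_BITS` are hypothetical. The sketch illustrates why dropping offsets for empty vertices can save space and why a degree lookup then needs extra bitwise work (a bit test plus a population count).

```python
WORD_BITS = 64  # assumed bitmap word width (illustrative choice)


def build_csr(num_vertices, edges):
    """Plain CSR: row_ptr has num_vertices + 1 entries, even for empty rows."""
    counts = [0] * num_vertices
    for src, _ in edges:
        counts[src] += 1
    row_ptr = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        row_ptr[v + 1] = row_ptr[v] + counts[v]
    col_idx = [0] * len(edges)
    cursor = row_ptr[:-1]  # slicing copies, so row_ptr is not modified
    for src, dst in edges:
        col_idx[cursor[src]] = dst
        cursor[src] += 1
    return row_ptr, col_idx


def build_bitmap_csr(row_ptr):
    """Keep offsets only for non-empty vertices, plus a bitmap and rank table."""
    num_vertices = len(row_ptr) - 1
    num_words = (num_vertices + WORD_BITS - 1) // WORD_BITS
    bitmap = [0] * num_words
    word_rank = [0] * num_words  # non-empty vertices before each bitmap word
    compact_ptr = []
    for v in range(num_vertices):
        w, b = divmod(v, WORD_BITS)
        if b == 0:
            word_rank[w] = len(compact_ptr)
        if row_ptr[v + 1] > row_ptr[v]:
            bitmap[w] |= 1 << b
            compact_ptr.append(row_ptr[v])
    compact_ptr.append(row_ptr[-1])  # sentinel: total number of edges
    return bitmap, word_rank, compact_ptr


def degree(bitmap, word_rank, compact_ptr, v):
    """Degree lookup: bit test, then rank via popcount of the lower bits."""
    w, b = divmod(v, WORD_BITS)
    if not (bitmap[w] >> b) & 1:
        return 0  # empty vertex: no offset is stored for it
    lower_bits = bitmap[w] & ((1 << b) - 1)
    rank = word_rank[w] + bin(lower_bits).count("1")
    return compact_ptr[rank + 1] - compact_ptr[rank]
```

In this sketch the adjacency array `col_idx` is unchanged; only the offset array shrinks, from one entry per vertex to one entry per non-empty vertex plus the bitmap and rank table. Actual savings depend on index widths and the fraction of empty vertices, which this record does not specify.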