Automatic Code Generation for High-Performance Graph Algorithms

Bibliographic details
Published in: 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 14-26
Main authors: Peng, Zhen; Ashraf, Rizwan A.; Guo, Luanzheng; Tian, Ruiqin; Kestor, Gokcen
Format: Conference paper
Language: English
Published: IEEE, 21 October 2023
Description
Summary: Graph problems are common across many fields, from scientific computing to the social sciences. Despite their importance and the attention they have received, implementing graph algorithms effectively on modern computing systems remains a challenging task that requires significant programming effort and generally results in customized implementations. Current computing and memory hierarchies are not architected for irregular computations, resulting in performance that is far from the theoretical architectural peak. In this paper, we propose a compiler framework to simplify the development of graph algorithm implementations that can achieve high performance on modern computing systems. We provide a high-level domain-specific language (DSL) to represent graph algorithms through sparse linear algebra expressions and graph primitives, including semirings and masking. The compiler leverages the semantic information expressed through the DSL during the optimization and code transformation passes, resulting in more efficient IR being passed to the compiler backend. In particular, we introduce an Index Tree Dialect that preserves the semantic information of the graph algorithm to perform high-level, domain-specific optimizations, including workspace transformation, two-phase computation, and automatic parallelization. We demonstrate that this work outperforms the state-of-the-art graph library LAGraph, with speedups of up to 3.7× in semiring operations, 2.19× in an important sparse computational kernel, and 9.05× in graph processing algorithms.
DOI: 10.1109/PACT58117.2023.00010