Code Generation and Optimization of Distributed-memory Dense Linear Algebra Kernels

Design by Transformation (DxT) is an approach to software development that encodes domain-specific programs as graphs and expert design knowledge as graph transformations. The goal of DxT is to mechanize the generation of highly-optimized code. This paper demonstrates how DxT can be used to transfor...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:Procedia computer science Ročník 18; s. 1282 - 1291
Hlavní autoři: Marker, Bryan, Batory, Don, van de Geijn, Robert
Médium: Journal Article
Jazyk:angličtina
Vydáno: Elsevier B.V 2013
Témata:
ISSN:1877-0509, 1877-0509
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:Design by Transformation (DxT) is an approach to software development that encodes domain-specific programs as graphs and expert design knowledge as graph transformations. The goal of DxT is to mechanize the generation of highly-optimized code. This paper demonstrates how DxT can be used to transform sequential specifications of an important set of Dense Linear Algebra (DLA) kernels, the level-3 Basic Linear Algebra Subprograms (BLAS3), into high-performing library routines targeting distributed-memory (cluster) architectures. Getting good BLAS3 performance for such platforms requires deep domain knowledge, so their implementations are manually coded by experts. Unfortunately, there are few such experts and developing the full variety of BLAS3 implementations takes a lot of repetitive effort. A prototype tool, DxTer, automates this tedious task. We explain how we build on previous work to represent loops and multiple loop-based algorithms in DxTer. Performance results on a BlueGene/P parallel supercomputer show that the generated code meets or beats implementations that are hand-coded by a human expert and outperforms the widely used ScaLAPACK library.
ISSN:1877-0509
1877-0509
DOI:10.1016/j.procs.2013.05.295