Combining analytical and empirical approaches in tuning matrix transposition


Bibliographic details
Published in: PACT 2006 : proceedings of the Fifteenth International Conference on Parallel Architectures and Compilation Techniques : September 16-20, 2006, Seattle, Washington, USA, pp. 233-242
Main authors: Lu, Qingda; Krishnamoorthy, Sriram; Sadayappan, P.
Format: Conference paper
Language: English
Published: ACM, 1 September 2006
Description
Summary: Matrix transposition is an important kernel used in many applications. Even though its optimization has been the subject of many studies, an optimization procedure that targets the characteristics of current processor architectures has not been developed. In this paper, we develop an integrated optimization framework that addresses a number of issues, including tiling for the memory hierarchy, effective handling of memory misalignment, utilizing memory subsystem characteristics, and the exploitation of the parallelism provided by the vector instruction sets in current processors. A judicious combination of analytical and empirical approaches is used to determine the most appropriate optimizations. The absence of problem information until execution time is handled by generating multiple versions of the code - the best version is chosen at runtime, with assistance from minimal-overhead inspectors. The approach highlights aspects of empirical optimization that are important for similar computations with little temporal reuse. Experimental results on PowerPC G5 and Intel Pentium 4 demonstrate the effectiveness of the developed framework.
DOI: 10.1145/1152154.1152190
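
Illustrative note: two ideas from the summary, tiling and choosing among several code versions at runtime, can be sketched roughly as follows. This is not the paper's framework. The function names, the tile size of 32, and the size threshold in `choose_version` are all assumptions made for illustration, and pure Python shows only the control structure; the cache benefit of tiling would appear in compiled code.

```python
def transpose_naive(src, rows, cols):
    """Reference version: src is a flat row-major list of rows * cols items."""
    dst = [0] * (rows * cols)
    for i in range(rows):
        for j in range(cols):
            dst[j * rows + i] = src[i * cols + j]
    return dst


def transpose_tiled(src, rows, cols, tile=32):
    """Tiled version: walk the matrix in tile x tile blocks so that, in
    compiled code, the rows read and the columns written can both stay
    in cache. The default tile size is an assumed value, not a tuned one."""
    dst = [0] * (rows * cols)
    for ii in range(0, rows, tile):
        for jj in range(0, cols, tile):
            # min() clips the last block when the size is not a tile multiple.
            for i in range(ii, min(ii + tile, rows)):
                for j in range(jj, min(jj + tile, cols)):
                    dst[j * rows + i] = src[i * cols + j]
    return dst


def choose_version(rows, cols, small_limit=64 * 64):
    """Hypothetical inspector: a cheap check made at call time, once the
    problem size is known. The threshold is an assumption."""
    if rows * cols <= small_limit:
        return transpose_naive
    return transpose_tiled


def transpose(src, rows, cols):
    """Dispatch to the version picked by the inspector."""
    return choose_version(rows, cols)(src, rows, cols)
```

For example, `transpose(list(range(6)), 2, 3)` treats the input as a 2 x 3 matrix and returns the flat 3 x 2 result. In an empirically tuned setting, the tile size and threshold would be found by measurement on the target machine rather than fixed in advance.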