Adaptive Strassen and ATLAS's DGEMM: a fast square-matrix multiply for modern high-performance systems

Bibliographic Details
Published in: Eighth International Conference on High Performance Computing in Asia Pacific Region : proceedings : 30 November - 3 December, Beijing, China, p. 8 pp. - 52
Main authors: D'Alberto, P., Nicolau, A.
Format: Conference proceeding
Language: English
Published: IEEE 2005
ISBN:9780769524863, 0769524869
Online access: Full text
Description
Summary: Strassen's algorithm has practical performance benefits for architectures with simple memory hierarchies, because it trades computationally expensive matrix multiplications (MM) for cheaper matrix additions (MA). However, it presents no advantages for high-performance architectures with deep memory hierarchies, because MAs exploit limited data reuse. We present an easy-to-use adaptive algorithm combining Strassen's recursion and a highly tuned version of ATLAS MM. In fact, we introduce a last step in the ATLAS installation process that determines whether Strassen's algorithm may achieve any speedup. We present a recursive algorithm achieving up to 30% speedup versus ATLAS alone. We show experimental results for 14 different systems.
DOI:10.1109/HPCASIA.2005.18
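
Illustrative sketch (not taken from the paper): the summary describes Strassen's recursion layered on top of a tuned MM kernel, with recursion used only where it pays off. A minimal pure-Python version of that idea might look as follows. The names `base_mm`, `strassen`, and the `cutoff` parameter are assumptions made for illustration; `base_mm` merely stands in for a tuned routine such as ATLAS's DGEMM, and the fixed `cutoff` stands in for whatever recursion point an installation-time measurement would select.

```python
def mat_add(X, Y):
    """Element-wise sum of two equally sized matrices (lists of rows)."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def base_mm(A, B):
    """Classic O(n^3) multiply; a stand-in for a tuned DGEMM kernel."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def split(M):
    """Split a square matrix of even order into four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=64):
    """Multiply square matrices A and B.

    Recurses with Strassen's seven products while the order is even and
    larger than `cutoff` (a hypothetical tuning parameter); otherwise it
    hands the problem to base_mm.
    """
    n = len(A)
    if n <= cutoff or n % 2:
        return base_mm(A, B)

    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    # Seven recursive multiplications instead of eight.
    M1 = strassen(mat_add(A11, A22), mat_add(B11, B22), cutoff)
    M2 = strassen(mat_add(A21, A22), B11, cutoff)
    M3 = strassen(A11, mat_sub(B12, B22), cutoff)
    M4 = strassen(A22, mat_sub(B21, B11), cutoff)
    M5 = strassen(mat_add(A11, A12), B22, cutoff)
    M6 = strassen(mat_sub(A21, A11), mat_add(B11, B12), cutoff)
    M7 = strassen(mat_sub(A12, A22), mat_add(B21, B22), cutoff)

    # Recombine the quadrants of the result with matrix additions.
    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)

    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

Each recursion level replaces one of eight block multiplications with extra matrix additions, which is the MM-for-MA trade the summary refers to. Setting `cutoff` at or above the matrix order disables the recursion entirely, which corresponds to the case where Strassen's algorithm offers no speedup on a given system.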