Improving communication performance in dense linear algebra via topology aware collectives

Recent results have shown that topology aware mapping reduces network contention in communication-intensive kernels on massively parallel machines. We demonstrate that on mesh interconnects, topology aware mapping also allows for the utilization of highly-efficient topology aware collectives. We map...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) S. 1 - 11
Hauptverfasser: Solomonik, Edgar, Bhatele, Abhinav, Demmel, James
Format: Tagungsbericht
Sprache:Englisch
Veröffentlicht: New York, NY, USA ACM 12.11.2011
IEEE
Schriftenreihe:ACM Conferences
Schlagworte:
ISBN:145030771X, 9781450307710
ISSN:2167-4329
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Recent results have shown that topology aware mapping reduces network contention in communication-intensive kernels on massively parallel machines. We demonstrate that on mesh interconnects, topology aware mapping also allows for the utilization of highly-efficient topology aware collectives. We map novel 2.5D dense linear algebra algorithms to exploit rectangular collectives on cuboid partitions allocated by a Blue Gene/P supercomputer. Our mappings allow the algorithms to exploit optimized line multicasts and reductions. Commonly used 2D algorithms cannot be mapped in this fashion. On 16,384 nodes (65,536 cores) of Blue Gene/P, 2.5D algorithms that exploit rectangular collectives are significantly faster than 2D matrix multiplication (MM) and LU factorization, up to 8.7x and 2.1x, respectively. These speed-ups are due to communication reduction (up to 95.6% for 2.5D MM with respect to 2D MM). We also derive LogP-based novel performance models for rectangular broadcasts and reductions. Using those, we model the performance of matrix multiplication and LU factorization on a hypothetical exascale architecture.
ISBN:145030771X
9781450307710
ISSN:2167-4329
DOI:10.1145/2063384.2063487