Combining compile-time and run-time support for efficient software distributed shared memory

Bibliographic details
Published in: Proceedings of the IEEE, vol. 87, no. 3, pp. 476-486
Main authors: Dwarkadas, S., Honghui Lu, Cox, A.L., Rajamony, R., Zwaenepoel, W.
Format: Journal Article
Language: English
Published: IEEE 1999
ISSN: 0018-9219
Online access: Full text
Description
Summary: We describe an integrated compile time and run time system for efficient shared memory parallel computing on distributed memory machines. The combined system presents the user with a shared memory programming model. The run time system implements a consistent shared memory abstraction using memory access detection and automatic data caching. The compiler improves the efficiency of the shared memory implementation by directing the run time system to exploit the message passing capabilities of the underlying hardware. To do so, the compiler analyzes shared memory accesses and transforms the code to insert calls to the run time system that provide it with the access information computed by the compiler. The run time system is augmented with the appropriate entry points to use this information to implement bulk data transfer and to reduce the overhead of run time consistency maintenance. In those cases where the compiler analysis succeeds for the entire program, we demonstrate that the combined system achieves performance comparable to that produced by compilers that directly target message passing. If the compiler analysis is successful only for parts of the program, for instance, because of irregular accesses to some of the arrays, the resulting optimizations can be applied to those parts for which the analysis succeeds. If the compiler analysis fails entirely, we rely on the run time maintenance of shared memory and thereby avoid the complexity and the limitations of compilers that directly target message passing. The result is a single system that combines efficient support for both regular and irregular memory access patterns.
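Illustration: the summary's central idea, a compiler-inserted call that tells the run time which data a loop will touch so it can be fetched in bulk, can be pictured with a toy model. The sketch below is not the authors' system or its API. The class ToyDsmNode, the method validate, the page size, and the message counter are all hypothetical names chosen for illustration, and a plain Python list stands in for memory owned by a remote node.

```python
PAGE_SIZE = 4  # elements per page; deliberately tiny, an assumption for illustration


class ToyDsmNode:
    """Hypothetical model of one node's cached view of remote shared data."""

    def __init__(self, home):
        self.home = home      # list standing in for data owned by another node
        self.cache = {}       # page number -> local copy of that page
        self.messages = 0     # number of request messages sent so far

    def _fetch(self, pages):
        # One request message brings over every page listed in `pages`.
        self.messages += 1
        for p in pages:
            start = p * PAGE_SIZE
            self.cache[p] = self.home[start:start + PAGE_SIZE]

    def read(self, i):
        # Stand-in for run-time access detection: a miss fetches one page.
        p = i // PAGE_SIZE
        if p not in self.cache:
            self._fetch([p])
        return self.cache[p][i % PAGE_SIZE]

    def validate(self, lo, hi):
        # Hypothetical entry point a compiler could insert before a loop that
        # reads indices lo..hi-1: fetch all missing pages in a single message.
        first, last = lo // PAGE_SIZE, (hi - 1) // PAGE_SIZE
        missing = [p for p in range(first, last + 1) if p not in self.cache]
        if missing:
            self._fetch(missing)


def sum_range(node, lo, hi, hinted):
    """Sum elements lo..hi-1, optionally announcing the access range first."""
    if hinted:
        node.validate(lo, hi)
    return sum(node.read(i) for i in range(lo, hi))
```

In this toy model both variants compute the same sum; the hinted variant sends one request for the whole range, while the unhinted variant sends one request per page on first touch. When no hint is available, for example for irregular accesses, read still works on its own, which mirrors the fallback the summary describes.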
DOI: 10.1109/5.747868