PARDA: A Fast Parallel Reuse Distance Analysis Algorithm

Bibliographic Details
Published in:	2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp. 1284-1294
Main Authors:	Qingpeng Niu, Dinan, J., Qingda Lu, Sadayappan, P.
Format:	Conference Proceeding
Language:	English
Published:	IEEE, 1 May 2012
Subjects:	Algorithm design and analysis; Analytical models; Arrays; Caching; Data Locality; Histograms; LRU Stack Distance; Optimization; Parallel algorithms; Performance Analysis; Program processors; Reuse Distance
ISBN:	1467309753, 9781467309752
ISSN:	1530-2075
Online Access:	Full text

Description
Abstract:	Reuse distance is a well-established approach to characterizing data cache locality based on the stack histogram model. This analysis has so far been restricted to offline use because of its high cost, often several orders of magnitude larger than the execution time of the analyzed code. This paper presents the first parallel algorithm to compute accurate reuse distances by analysis of memory address traces. The algorithm uses a tunable parameter that enables faster analysis when the maximum needed reuse distance is limited by a cache size upper bound. Experimental evaluation using the SPEC CPU 2006 benchmark suite shows that, using 64 processors and a cache bound of 8 MB, it is possible to perform reuse distance analysis with full accuracy within a factor of 13 to 50 times the original execution times of the benchmarks.
DOI:	10.1109/IPDPS.2012.117
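
To make the quantity named in the abstract concrete, the following is a minimal sequential sketch of reuse distance computed with an LRU stack. It is not the parallel PARDA algorithm, whose details this record does not give. It assumes the usual definition (the number of distinct addresses accessed between two consecutive accesses to the same address) and a hypothetical `bound` parameter that stands in for the cache size upper bound mentioned in the abstract. The function names are illustrative only.

```python
from collections import Counter
from math import inf


def reuse_distance_histogram(trace, bound=None):
    """Map each reuse distance to the number of accesses that have it.

    trace: iterable of hashable addresses.
    bound: optional cap on tracked addresses (an assumption made here);
           distances of `bound` or more are recorded as infinity.
    """
    stack = []  # LRU stack: the most recently used address is last
    hist = Counter()
    for addr in trace:
        if addr in stack:
            pos = stack.index(addr)
            # Distinct addresses touched since the previous access to addr.
            dist = len(stack) - 1 - pos
            stack.pop(pos)
        else:
            # First access, or the address already fell out of the bound.
            dist = inf
        stack.append(addr)
        if bound is not None and len(stack) > bound:
            stack.pop(0)  # forget the least recently used address
        hist[dist] += 1
    return hist


def lru_misses(hist, capacity):
    """Misses of a fully associative LRU cache holding `capacity` addresses."""
    return sum(count for dist, count in hist.items() if dist >= capacity)


if __name__ == "__main__":
    hist = reuse_distance_histogram("abcabbda")
    print(dict(hist))
    print(lru_misses(hist, capacity=2))
```

The linear scan of the stack makes this sketch quadratic in the worst case, so it is suitable only for short traces. Capping the stack at `bound` entries keeps each lookup cheap while still giving the exact miss count for any cache no larger than the bound.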