UDP: Utility-Driven Fetch Directed Instruction Prefetching

Bibliographic details
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 1188-1201
Main authors: Oh, Surim; Xu, Mingsheng; Khan, Tanvir Ahmed; Kasikci, Baris; Litz, Heiner
Format: Conference proceeding
Language: English
Published: IEEE, 29 June 2024
Description
Summary: Datacenter applications exhibit large instruction footprints causing significant instruction cache misses and, as a result, frontend stalls. To address this issue, instruction prefetching mechanisms have been proposed, including state-of-the-art techniques such as fetch-directed instruction prefetching. However, our study shows that existing implementations still fall far short of an ideal system with a perfect instruction cache. In particular, up to 588.47% of potential IPC speedup of existing processors hides due to frontend stalls, and these frontend stalls are due to inaccurate and untimely instruction prefetches. We quantify the impact of these individual effects, observing that applications exhibit different characteristics that call for adaptive application-specific optimizations. Based on these insights, we propose two novel mechanisms, UDP and UFTQ, to improve the accuracy of FDIP without negatively affecting timeliness while leveraging prefetches on the wrong path. We evaluate our technique on 10 data center workloads showing a maximal IPC improvement of 16.1% and an average IPC improvement of 3.6%. Our techniques only introduce moderate hardware modifications and a storage cost of 8 KB.
DOI:10.1109/ISCA59077.2024.00089