Bouquet of Instruction Pointers: Instruction Pointer Classifier-based Spatial Hardware Prefetching


Published in: 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), pp. 118-131
Main authors: Pakalapati, Samuel; Panda, Biswabandan
Format: Conference paper
Language: English
Published: IEEE, 1 May 2020
Abstract Hardware prefetching is one of the common off-chip DRAM latency hiding techniques. Though hardware prefetchers are ubiquitous in the commercial machines and prefetching techniques are well studied in the computer architecture community, the "memory wall" problem still exists after decades of microarchitecture research and is considered to be an essential problem to solve. In this paper, we make a case for breaking the memory wall through data prefetching at the L1 cache. We propose a bouquet of hardware prefetchers that can handle a variety of access patterns driven by the control flow of an application. We name our proposal Instruction Pointer Classifier based spatial Prefetching (IPCP). We propose IPCP in two flavors: (i) an L1 spatial data prefetcher that classifies instruction pointers at the L1 cache level, and issues prefetch requests based on the classification, and (ii) a multi-level IPCP where the IPCP at the L1 communicates the classification information to the L2 IPCP so that it can kick-start prefetching based on this classification done at the L1. Overall, IPCP is a simple, lightweight, and modular framework for L1 and multi-level spatial prefetching. IPCP at the L1 and L2 incurs a storage overhead of 740 bytes and 155 bytes, respectively. Our empirical results show that, for memory-intensive single-threaded SPEC CPU 2017 benchmarks, compared to a baseline system with no prefetching, IPCP provides an average performance improvement of 45.1%. For the entire SPEC CPU 2017 suite, it provides an improvement of 22%. In the case of multicore systems, IPCP provides an improvement of 23.4% (evaluated over more than 1000 mixes). IPCP outperforms the already high-performing state-of-the-art prefetchers like SPP with PPF and Bingo by demanding 30X to 50X less storage.
Author Panda, Biswabandan
Pakalapati, Samuel
Author_xml – sequence: 1
  givenname: Samuel
  surname: Pakalapati
  fullname: Pakalapati, Samuel
  organization: Birla Institute of Technology and Science, Pilani, Intel Technology Private Limited, Hyderabad, India
– sequence: 2
  givenname: Biswabandan
  surname: Panda
  fullname: Panda, Biswabandan
  organization: Indian Institute of Technology Kanpur, Dept. of Computer Science and Engineering, Kanpur, India
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/ISCA45697.2020.00021
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 1728146615
9781728146614
EndPage 131
ExternalDocumentID 9138971
Genre orig-research
GroupedDBID 6IE
6IH
ACM
ALMA_UNASSIGNED_HOLDINGS
APO
CBEJK
GUFHI
LHSKQ
RIE
RIO
ID FETCH-LOGICAL-a351t-d2065316efea32c252a37aeda068b0cf591b617288e126367f8efce8285213ef3
IEDL.DBID RIE
ISICitedReferencesCount 72
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000617734800010&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 06 17:54:04 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a351t-d2065316efea32c252a37aeda068b0cf591b617288e126367f8efce8285213ef3
PageCount 14
ParticipantIDs ieee_primary_9138971
PublicationCentury 2000
PublicationDate 2020-05-01
PublicationDateYYYYMMDD 2020-05-01
PublicationDate_xml – month: 05
  year: 2020
  text: 2020-05-01
  day: 01
PublicationDecade 2020
PublicationTitle 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA)
PublicationTitleAbbrev ISCA
PublicationYear 2020
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssib041538211
Score 2.4294248
SourceID ieee
SourceType Publisher
StartPage 118
SubjectTerms Benchmark testing
Caching
Computer architecture
Hardware
Hardware Prefetching
Microarchitecture
Multicore processing
Prefetching
Proposals
Random access memory
Spatial databases
Title Bouquet of Instruction Pointers: Instruction Pointer Classifier-based Spatial Hardware Prefetching
URI https://ieeexplore.ieee.org/document/9138971
WOSCitedRecordID wos000617734800010&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE