IMP: Indirect memory prefetcher
Machine learning, graph analytics and sparse linear algebra-based applications are dominated by irregular memory accesses resulting from following edges in a graph or non-zero elements in a sparse matrix. These accesses have little temporal or spatial locality, and thus incur long memory stalls and...
| Published in: | 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 178-190 |
|---|---|
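| Main authors: | Xiangyao Yu, Christopher J. Hughes, Nadathur Satish, Srinivas Devadas |
| Format: | Conference proceeding |
| Language: | English |
| Published: | ACM, 2015-12-01 |
| Subjects: | Arrays; Bandwidth; Hardware; Indexes; Multicore processing; Prefetching; Sparse matrices |
| ISSN: | 2379-3155 |
| Online access: | Full text |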
| Abstract | Machine learning, graph analytics and sparse linear algebra-based applications are dominated by irregular memory accesses resulting from following edges in a graph or non-zero elements in a sparse matrix. These accesses have little temporal or spatial locality, and thus incur long memory stalls and large bandwidth requirements. A traditional streaming or striding prefetcher cannot capture these irregular access patterns. A majority of these irregular accesses come from indirect patterns of the form A[B[j]]. We propose an efficient hardware indirect memory prefetcher (IMP) to capture this access pattern and hide latency. We also propose a partial cacheline accessing mechanism for these prefetches to reduce the network and DRAM bandwidth pressure from the lack of spatial locality. Evaluated on 7 applications, IMP shows 56% speedup on average (up to 2.3×) compared to a baseline 64-core system with streaming prefetchers. This is within 23% of an idealized system. With partial cacheline accessing, we see another 9.4% speedup on average (up to 46.6%). |
|---|---|
| Author | Hughes, Christopher J.; Yu, Xiangyao; Satish, Nadathur; Devadas, Srinivas |
| Author_xml | – sequence: 1 surname: Xiangyao Yu fullname: Xiangyao Yu email: yxy@mit.edu organization: Massachusetts Inst. of Technol., Cambridge, MA, USA – sequence: 2 givenname: Christopher J. surname: Hughes fullname: Hughes, Christopher J. email: christopher.j.hughes@intel.com organization: Parallel Comput. Lab., Intel Labs., Santa Clara, CA, USA – sequence: 3 givenname: Nadathur surname: Satish fullname: Satish, Nadathur email: nadathur.rajagopalan.satish@intel.com organization: Parallel Comput. Lab., Intel Labs., Santa Clara, CA, USA – sequence: 4 givenname: Srinivas surname: Devadas fullname: Devadas, Srinivas email: devadas@mit.edu organization: Massachusetts Inst. of Technol., Cambridge, MA, USA |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1145/2830772.2830807 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Xplore; IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 1450340342 9781450340342 |
| EISSN | 2379-3155 |
| EndPage | 190 |
| ExternalDocumentID | 7856597 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IL ABLEC ALMA_UNASSIGNED_HOLDINGS CBEJK IEGSK RIE RIL |
| ID | FETCH-LOGICAL-a247t-83eb589f61aab521d427cb5a8f8d4cb99bb85ae0dd80ca07643decd78044072f3 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 112 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000393287300015&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Wed Aug 27 02:02:01 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a247t-83eb589f61aab521d427cb5a8f8d4cb99bb85ae0dd80ca07643decd78044072f3 |
| PageCount | 13 |
| ParticipantIDs | ieee_primary_7856597 |
| PublicationCentury | 2000 |
| PublicationDate | 2015-Dec. |
| PublicationDateYYYYMMDD | 2015-12-01 |
| PublicationDate_xml | – month: 12 year: 2015 text: 2015-Dec. |
| PublicationDecade | 2010 |
| PublicationTitle | 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) |
| PublicationTitleAbbrev | MICRO |
| PublicationYear | 2015 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| SSID | ssib030238632 ssib023363937 ssib042476800 |
| Score | 2.3818152 |
| Snippet | Machine learning, graph analytics and sparse linear algebra-based applications are dominated by irregular memory accesses resulting from following edges in a... |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 178 |
| SubjectTerms | Arrays; Bandwidth; Hardware; Indexes; Multicore processing; Prefetching; Sparse matrices |
| Title | IMP: Indirect memory prefetcher |
| URI | https://ieeexplore.ieee.org/document/7856597 |
| WOSCitedRecordID | wos000393287300015&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| linkProvider | IEEE |
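
The abstract attributes most irregular accesses to the indirect form A[B[j]] and notes that they lack spatial locality. The following is a minimal sketch of why that wastes bandwidth when whole cachelines are fetched; it is a toy software model, not the hardware mechanism the paper proposes. The element size, cacheline size, array sizes, and the helper names `lines_touched` and `useful_fraction` are all assumptions made for illustration and do not come from the record.

```python
# Toy model of the indirect access pattern A[B[j]] mentioned in the abstract.
# Every constant here is an illustrative assumption, not a figure from the paper.
import random

ELEM_BYTES = 8    # assumed size of one element of A
LINE_BYTES = 64   # assumed cacheline size


def lines_touched(indices, elem_bytes=ELEM_BYTES, line_bytes=LINE_BYTES):
    """Return the set of cacheline numbers touched by reading A[i] for each index i."""
    return {(i * elem_bytes) // line_bytes for i in indices}


def useful_fraction(indices, elem_bytes=ELEM_BYTES, line_bytes=LINE_BYTES):
    """Fraction of fetched bytes that are actually used, assuming each touched
    cacheline is fetched in full exactly once."""
    distinct = set(indices)
    fetched = len(lines_touched(distinct, elem_bytes, line_bytes)) * line_bytes
    used = len(distinct) * elem_bytes
    return used / fetched


rng = random.Random(0)
N_A = 1_000_000                                            # assumed length of A
B_streaming = list(range(1024))                            # B[j] = j: A is read sequentially
B_indirect = [rng.randrange(N_A) for _ in range(1024)]     # B[j] scattered across A

print("streaming:", useful_fraction(B_streaming))
print("indirect: ", useful_fraction(B_indirect))
```

Under these assumptions the sequential case uses every fetched byte, while the scattered case approaches the floor of `ELEM_BYTES / LINE_BYTES`, since nearly every access lands in its own cacheline. That gap is the kind of bandwidth pressure the abstract's partial cacheline accessing is described as reducing.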