Load value prediction via path-based address prediction: avoiding mispredictions due to conflicting stores

Bibliographic Details
Published in: MICRO-50 : the 50th annual IEEE/ACM International Symposium on Microarchitecture : proceedings : October 14-18, 2017, Cambridge, MA, pp. 423-435
Main Authors: Sheikh, Rami; Cain, Harold W.; Damodaran, Raguram
Format: Conference Proceeding
Language: English
Published: New York, NY, USA: ACM, 14.10.2017
Series: ACM Conferences
ISBN:1450349528, 9781450349529
ISSN:2379-3155
Abstract Current flagship processors excel at extracting instruction-level parallelism (ILP) by forming large instruction windows. Even then, extracting ILP is inherently limited by true data dependencies. Value prediction was proposed to address this limitation. Value prediction faces many challenges; in this work we focus on two of them. Challenge #1: store instructions change the values in memory, rendering the values in the value predictor stale and resulting in value mispredictions and a retraining penalty. Challenge #2: value mispredictions trigger costly pipeline flushes. To minimize the number of pipeline flushes, value predictors employ stringent, yet necessary, high-confidence requirements to guarantee high prediction accuracy. Such requirements can negatively impact training time and coverage. In this work, we propose Decoupled Load Value Prediction (DLVP), a technique that targets the value prediction challenges for load instructions. DLVP mitigates the stale state caused by stores by replacing value prediction with memory address prediction. Then, it opportunistically probes the data cache to retrieve the value(s) corresponding to the predicted address(es) early enough so value prediction can take place. Since the values captured in the data cache mirror the current program data (except for in-flight stores), this addresses the first challenge. Regarding the second challenge, DLVP reduces pipeline flushes by using a new context-based address prediction scheme that leverages load-path history to deliver high address prediction accuracy (over 99%) with relaxed confidence requirements. We call this address prediction scheme Path-based Address Prediction (PAP). With a modest 8KB prediction table, DLVP improves performance by up to 71%, and by 4.8% on average, without increasing the core energy consumption.
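The mechanism summarized in the abstract above can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the authors' design: the abstract does not specify the table organization, hash function, history update, or confidence thresholds, so `PathAddressPredictor`, `dlvp_predict_value`, `run_demo`, and every constant below are hypothetical. The data cache is modeled as a plain dictionary, and in-flight stores are ignored.

```python
from dataclasses import dataclass

# All sizes and thresholds are illustrative assumptions, not the paper's values.
TABLE_ENTRIES = 1024
HISTORY_BITS = 16
CONF_MAX = 3
CONF_THRESHOLD = 2


@dataclass
class Entry:
    tag: int = -1   # load PC that owns the entry (-1 means empty)
    addr: int = 0   # last address observed for this (PC, path) context
    conf: int = 0   # saturating confidence counter


class PathAddressPredictor:
    """Toy address predictor indexed by load PC and load-path history."""

    def __init__(self):
        self.table = [Entry() for _ in range(TABLE_ENTRIES)]
        self.history = 0  # hypothetical load-path history register

    def _index(self, pc):
        # Assumed hash: XOR the PC with the path history.
        return (pc ^ self.history) % TABLE_ENTRIES

    def predict(self, pc):
        entry = self.table[self._index(pc)]
        if entry.tag == pc and entry.conf >= CONF_THRESHOLD:
            return entry.addr
        return None  # not confident enough: make no prediction

    def train(self, pc, actual_addr):
        entry = self.table[self._index(pc)]
        if entry.tag == pc and entry.addr == actual_addr:
            entry.conf = min(entry.conf + 1, CONF_MAX)
        else:
            entry.tag, entry.addr, entry.conf = pc, actual_addr, 0
        # Assumed history update: shift in one bit of the load PC.
        bit = (pc >> 2) & 1
        self.history = ((self.history << 1) | bit) & ((1 << HISTORY_BITS) - 1)


def dlvp_predict_value(predictor, cache, pc):
    """Predict an address, then probe the cache model for the current value."""
    addr = predictor.predict(pc)
    if addr is None or addr not in cache:
        return None
    return cache[addr]


def run_demo():
    cache = {0x1000: 7, 0x2000: 9}                  # address -> value
    loads = [(0x400, 0x1000), (0x404, 0x2000)]      # (load PC, address)
    predictor = PathAddressPredictor()
    cold = dlvp_predict_value(predictor, cache, 0x400)
    for _ in range(50):                             # a loop with two loads
        for pc, addr in loads:
            dlvp_predict_value(predictor, cache, pc)
            predictor.train(pc, addr)
    warm = dlvp_predict_value(predictor, cache, 0x400)
    cache[0x1000] = 42                              # a store updates memory
    after_store = dlvp_predict_value(predictor, cache, 0x400)
    return cold, warm, after_store
```

In this model, `run_demo()` returns `(None, 7, 42)`: no prediction before training, the cached value once the entry is confident, and the updated value right after the store. The last case is the point of decoupling: because the predictor stores an address rather than a value, the store does not make its state stale and no retraining is needed.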
CCS CONCEPTS Computer systems organization → Superscalar architectures; Pipeline computing; Reduced instruction set computing
Author Sheikh, Rami
Damodaran, Raguram
Cain, Harold W.
Author_xml – sequence: 1
  givenname: Rami
  surname: Sheikh
  fullname: Sheikh, Rami
  email: ralsheik@qti.qualcomm.com
  organization: Qualcomm Technologies, Inc
– sequence: 2
  givenname: Harold W.
  surname: Cain
  fullname: Cain, Harold W.
  email: tcain@qti.qualcomm.com
  organization: Qualcomm Datacenter Technologies, Inc
– sequence: 3
  givenname: Raguram
  surname: Damodaran
  fullname: Damodaran, Raguram
  email: raguramd@qti.qualcomm.com
  organization: Qualcomm Technologies, Inc
ContentType Conference Proceeding
Copyright 2017 ACM
Copyright_xml – notice: 2017 ACM
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1145/3123939.3123951
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList

Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 1450349528
9781450349529
EISSN 2379-3155
EndPage 435
ExternalDocumentID 8686698
Genre orig-research
GroupedDBID 6IE
6IF
6IL
6IN
AAJGR
ABLEC
ACM
ADPZR
ALMA_UNASSIGNED_HOLDINGS
APO
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
GUFHI
IEGSK
OCL
RIB
RIC
RIE
RIL
AAWTH
LHSKQ
IEDL.DBID RIE
ISBN 1450349528
9781450349529
ISICitedReferencesCount 11
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000455679300032&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Thu May 29 05:57:38 EDT 2025
Wed Jan 31 06:40:42 EST 2024
IsPeerReviewed false
IsScholarly true
Keywords value prediction
address prediction
path-based predictor
microarchitecture
Language English
License Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org
LinkModel DirectLink
MeetingName MICRO-50: The 50th Annual IEEE/ACM International Symposium on Microarchitecture
PageCount 13
ParticipantIDs acm_books_10_1145_3123939_3123951_brief
acm_books_10_1145_3123939_3123951
ieee_primary_8686698
PublicationCentury 2000
PublicationDate 20171014
2017-Oct.
PublicationDateYYYYMMDD 2017-10-14
2017-10-01
PublicationDate_xml – month: 10
  year: 2017
  text: 20171014
  day: 14
PublicationDecade 2010
PublicationPlace New York, NY, USA
PublicationPlace_xml – name: New York, NY, USA
PublicationSeriesTitle ACM Conferences
PublicationTitle MICRO-50 : the 50th annual IEEE/ACM International Symposium on Microarchitecture : proceedings : October 14-18, 2017, Cambridge, MA
PublicationTitleAbbrev MICRO
PublicationYear 2017
Publisher ACM
Publisher_xml – name: ACM
SSID ssj0002179613
ssib030238632
ssib042476800
ssib023363937
Score 2.174444
Snippet Current flagship processors excel at extracting instruction-level-parallelism (ILP) by forming large instruction windows. Even then, extracting ILP is...
SourceID ieee
acm
SourceType Publisher
StartPage 423
SubjectTerms Address Prediction
Computer systems organization -- Architectures -- Serial architectures -- Pipeline computing
Computer systems organization -- Architectures -- Serial architectures -- Reduced instruction set computing
Computer systems organization -- Architectures -- Serial architectures -- Superscalar architectures
History
Machinery
Microarchitecture
Path-based Predictor
Pipelines
Prefetching
Registers
Value Prediction
Subtitle avoiding mispredictions due to conflicting stores
Title Load value prediction via path-based address prediction
URI https://ieeexplore.ieee.org/document/8686698
WOSCitedRecordID wos000455679300032&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint