Rubik: Fast analytical power management for latency-critical systems

Latency-critical workloads (e.g., web search), common in datacenters, require stable tail (e.g., 95th percentile) latencies of a few milliseconds. Servers running these workloads are kept lightly loaded to meet these stringent latency targets. This low utilization wastes billions of dollars in ener...

Detailed bibliography
Published in: 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 598-610
Main authors: Kasture, Harshad; Bartolini, Davide B.; Beckmann, Nathan; Sanchez, Daniel
Format: Conference paper
Language: English
Published: ACM, 1 December 2015
ISSN: 2379-3155
Abstract: Latency-critical workloads (e.g., web search), common in datacenters, require stable tail (e.g., 95th percentile) latencies of a few milliseconds. Servers running these workloads are kept lightly loaded to meet these stringent latency targets. This low utilization wastes billions of dollars in energy and equipment annually. Applying dynamic power management to latency-critical workloads is challenging. The fundamental issue is coping with their inherent short-term variability: requests arrive at unpredictable times and have variable lengths. Without knowledge of the future, prior techniques either adapt slowly and conservatively or rely on application-specific heuristics to maintain tail latency. We propose Rubik, a fine-grain DVFS scheme for latency-critical workloads. Rubik copes with variability through a novel, general, and efficient statistical performance model. This model allows Rubik to adjust frequencies at sub-millisecond granularity to save power while meeting the target tail latency. Rubik saves up to 66% of core power, widely outperforms prior techniques, and requires no application-specific tuning. Beyond saving core power, Rubik robustly adapts to sudden changes in load and system performance. We use this capability to design RubikColoc, a co-location scheme that uses Rubik to allow batch and latency-critical work to share hardware resources more aggressively than prior techniques. RubikColoc reduces datacenter power by up to 31% while using 41% fewer servers than a datacenter that segregates latency-critical and batch work, and achieves 100% core utilization.
Author contacts:
Kasture, Harshad (hkasture@csail.mit.edu)
Bartolini, Davide B. (db2@csail.mit.edu)
Beckmann, Nathan (beckmann@csail.mit.edu)
Sanchez, Daniel (sanchez@csail.mit.edu)
DOI: 10.1145/2830772.2830797
EISBN: 1450340342, 9781450340342
EISSN: 2379-3155
Subject terms: Adaptation models; colocation; Delays; DVFS; interference; isolation; latency-critical; Load modeling; power management; quality of service; Servers; tail latency; Time factors; Uncertainty; Voltage control
URI: https://ieeexplore.ieee.org/document/7856630