Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations

As much of the world's computing continues to move into the cloud, the overprovisioning of computing resources to ensure the performance isolation of latency-sensitive tasks, such as web search, in modern datacenters is a major contributor to low machine utilization. Being unable to accurately...

Full description

Saved in:
Bibliographic Details
Published in:MICRO 44 : Proceedings of the 44th Annual IEEE/ACM Symposium on Microarchitecture, December 4 - 7, 2011 Porto Alegre, RS - Brazil pp. 248 - 259
Main Authors: Mars, Jason, Lingjia Tang, Hundt, Robert, Skadron, Kevin, Soffa, Mary Lou
Format: Conference Proceeding
Language:English
Published: ACM 01.12.2011
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:As much of the world's computing continues to move into the cloud, the overprovisioning of computing resources to ensure the performance isolation of latency-sensitive tasks, such as web search, in modern datacenters is a major contributor to low machine utilization. Being unable to accurately predict performance degradation due to contention for shared resources on multicore systems has led to the heavy handed approach of simply disallowing the co-location of high-priority, latency-sensitive tasks with other tasks. Performing this precise prediction has been a challenging and unsolved problem. In this paper, we present Bubble-Up, a characterization methodology that enables the accurate prediction of the performance degradation that results from contention for shared resources in the memory subsystem. By using a bubble to apply a tunable amount of "pressure" to the memory subsystem on processors in production datacenters, our methodology can predict the performance interference between co-locate applications with an accuracy within 1% to 2% of the actual performance degradation. Using this methodology to arrive at "sensible" co-locations in Google's production datacenters with real-world large-scale applications, we can improve the utilization of a 500-machine cluster by 50% to 90% while guaranteeing a high quality of service of latency-sensitive applications.
DOI:10.1145/2155620.2155650