ReQoS
- 16 March 2013
- conference paper
- Published by Association for Computing Machinery (ACM)
- Vol. 48 (4), 89-100
- https://doi.org/10.1145/2451116.2451126
Abstract
As multicore processors with expanding core counts continue to dominate the server market, the overall utilization of the class of datacenters known as warehouse scale computers (WSCs) depends heavily on co-location of multiple workloads on each server to take advantage of the computational power provided by modern processors. However, many of the applications running in WSCs, such as websearch, are user-facing and have quality of service (QoS) requirements. When multiple applications are co-located on a multicore machine, contention for shared memory resources threatens application QoS, as severe cross-core performance interference may occur. WSC operators are left with two options: either disregard QoS to maximize WSC utilization, or disallow the co-location of high-priority user-facing applications with other applications, resulting in low machine utilization and millions of dollars wasted. This paper presents ReQoS, a static/dynamic compilation approach that enables low-priority applications to adaptively manipulate their own contentiousness to ensure the QoS of high-priority co-runners. ReQoS is composed of a profile-guided compilation technique that identifies and inserts markers in contentious code regions in low-priority applications, and a lightweight runtime that monitors the QoS of high-priority applications and reactively reduces the pressure low-priority applications place on the memory subsystem when cross-core interference is detected. In this work, we show that ReQoS can accurately diagnose contention and significantly reduce performance interference to ensure application QoS. Applying ReQoS to SPEC2006 and SmashBench workloads on real multicore machines, we are able to improve machine utilization by more than 70% in many cases, and more than 50% on average, while enforcing a 90% QoS threshold. We are also able to improve the energy efficiency of modern multicore machines by 47% on average over a policy of disallowing co-locations.
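The reactive loop described in the abstract (monitor the high-priority application's QoS, then reduce the low-priority application's memory pressure when interference appears) can be illustrated with a minimal sketch. This is not the paper's algorithm: the class name `ThrottleController`, the `nap_fraction` knob, the fixed `step` size, and the idea of reporting QoS as a fraction of solo performance are all assumptions made for illustration. Only the 90% threshold comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class ThrottleController:
    """Hypothetical reactive controller; not the policy used by ReQoS."""

    qos_target: float = 0.90   # 90% QoS threshold quoted in the abstract
    step: float = 0.10         # assumed adjustment granularity
    nap_fraction: float = 0.0  # assumed knob: share of time spent idle at markers

    def update(self, observed_qos: float) -> float:
        """Adjust throttling from one QoS sample (1.0 means no interference)."""
        if observed_qos < self.qos_target:
            # Interference detected: back off the low-priority application.
            self.nap_fraction = min(1.0, self.nap_fraction + self.step)
        else:
            # QoS target met: give throughput back one step at a time.
            self.nap_fraction = max(0.0, self.nap_fraction - self.step)
        # Round to avoid accumulating floating-point drift across updates.
        self.nap_fraction = round(self.nap_fraction, 2)
        return self.nap_fraction


def run(samples):
    """Feed a sequence of QoS samples to a fresh controller."""
    controller = ThrottleController()
    return [controller.update(q) for q in samples]
```

In this sketch, a low-priority application would consult `nap_fraction` whenever it reaches a compiler-inserted marker in a contentious region and idle for that share of the time. A fixed step is the simplest possible choice; the abstract does not say how the actual runtime decides how much to throttle.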