Quality of service shared cache management in chip multiprocessor architecture
Open Access
- 30 December 2010
- journal article
- research article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Architecture and Code Optimization
- Vol. 7 (3), 1-33
- https://doi.org/10.1145/1880037.1880039
Abstract
The trends in enterprise IT toward service-oriented computing, server consolidation, and virtual computing point to a future in which workloads are becoming increasingly diverse in terms of performance, reliability, and availability requirements. It can be expected that more and more applications with diverse requirements will run on a Chip Multi-Processor (CMP) and share platform resources such as the lowest level cache and off-chip bandwidth. In this environment, it is desirable to have microarchitecture and software support that can provide a guarantee of a certain level of performance, which we refer to as performance Quality of Service (QoS). In this article, we investigated the framework that would be needed to manage the shared cache resource for fully providing QoS in a CMP. We found that, in order to fully provide QoS, we need to specify an appropriate QoS target for each job and apply an admission control policy to accept jobs only when their QoS targets can be satisfied. We also found that providing strict QoS often leads to a significant reduction in throughput due to resource fragmentation. We proposed throughput optimization techniques that include: (1) exploiting various QoS execution modes, and (2) a microarchitecture technique, which we refer to as resource stealing, that detects and reallocates excess cache capacity from a job while preserving its QoS target. We designed and evaluated three algorithms for performing resource stealing, which differ in how aggressive they are in stealing excess cache capacity, and in the degree of confidence in meeting QoS targets. In addition, we proposed a mechanism to dynamically enable or disable resource stealing depending on whether other jobs can benefit from additional cache capacity. We evaluated our QoS framework with a full system simulation of a 4-core CMP and a recent version of the Linux Operating System.
We found that, compared to an unoptimized scheme, the throughput can be improved by up to 47%, making the throughput significantly closer to that of a non-QoS CMP.
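As a rough illustration of the admission control idea described in the abstract, the following Python sketch accepts a job only when its QoS target can still be satisfied by the shared cache. It is a minimal sketch under stated assumptions, not the article's implementation: the abstract does not say how a QoS target is expressed, so the sketch assumes it is a number of reserved cache ways, and the names `Job`, `SharedCache`, `try_admit`, and `unreserved_ways` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str
    ways_needed: int  # ASSUMPTION: QoS target expressed as reserved cache ways


@dataclass
class SharedCache:
    total_ways: int
    admitted: List[Job] = field(default_factory=list)

    def reserved_ways(self) -> int:
        # Capacity already promised to admitted jobs.
        return sum(job.ways_needed for job in self.admitted)

    def unreserved_ways(self) -> int:
        # Capacity that no admitted job has claimed.
        return self.total_ways - self.reserved_ways()

    def try_admit(self, job: Job) -> bool:
        # Admit only if the job's full QoS target fits in what is left.
        if job.ways_needed <= self.unreserved_ways():
            self.admitted.append(job)
            return True
        return False


cache = SharedCache(total_ways=16)
results = [
    cache.try_admit(Job("A", 7)),
    cache.try_admit(Job("B", 7)),
    cache.try_admit(Job("C", 4)),  # rejected: only 2 ways remain
]
leftover = cache.unreserved_ways()
```

In this hypothetical run, jobs A and B are admitted and job C is rejected, leaving two ways that no job can use. That idle capacity is one simple picture of the resource fragmentation the abstract says reduces throughput under strict QoS.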
Funding Information
- Division of Computer and Network Systems (CNS-0406306, CCF-0347425)
- Division of Computing and Communication Foundations (CNS-0406306, CCF-0347425)