Operating system support for improving data locality on CC-NUMA compute servers
- 1 September 1996
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGOPS Operating Systems Review
- Vol. 30 (5), 279-289
- https://doi.org/10.1145/248208.237205
Abstract
The dominant architecture for the next generation of shared-memory multiprocessors is CC-NUMA (cache-coherent non-uniform memory architecture). These machines are attractive as compute servers because they provide transparent access to local and remote memory. However, the access latency to remote memory is 3 to 5 times the latency to local memory. CC-NOW machines provide the benefits of cache coherence to networks of workstations, at the cost of even higher remote access latency. Given the large remote access latencies of these architectures, data locality is potentially the most important performance issue. Using realistic workloads, we study the performance improvements provided by OS supported dynamic page migration and replication. Analyzing our kernel-based implementation, we provide a detailed breakdown of the costs. We show that sampling of cache misses can be used to reduce cost without compromising performance, and that TLB misses may not be a consistent approximation for cache misses. Finally, our experiments show that dynamic page migration and replication can substantially increase application performance, as much as 30%, and reduce contention for resources in the NUMA memory system.
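The abstract's locality argument can be illustrated with a back-of-the-envelope model. The sketch below is not taken from the paper: it assumes a simple weighted average of local and remote miss latency, and the function name `average_latency`, its parameters, and the 50% and 90% local-miss fractions are all hypothetical values chosen for illustration. Only the 3x to 5x remote-to-local ratio comes from the abstract.

```python
def average_latency(local_fraction, local_latency=1.0, remote_ratio=4.0):
    """Mean miss latency when `local_fraction` of misses are served locally.

    Assumption (not from the paper): latency is a plain weighted average,
    with remote latency = remote_ratio * local_latency.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be in [0, 1]")
    remote_latency = remote_ratio * local_latency
    return (local_fraction * local_latency
            + (1.0 - local_fraction) * remote_latency)


if __name__ == "__main__":
    # The abstract gives a remote/local ratio of 3 to 5; try both ends.
    for ratio in (3.0, 5.0):
        # Hypothetical locality before and after migration/replication.
        before = average_latency(0.5, remote_ratio=ratio)
        after = average_latency(0.9, remote_ratio=ratio)
        print(f"ratio {ratio}: {before:.2f} -> {after:.2f} "
              "(in units of local latency)")
```

Under these assumed numbers, moving more misses to local memory cuts the average miss latency noticeably, which is the kind of effect page migration and replication aim for. The model ignores the cost of the migration and replication operations themselves, which the paper analyzes separately.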