Hybrid TLB Coalescing

24 June 2017

journal article
conference paper
Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News

Vol. 45 (2), 444-456
https://doi.org/10.1145/3140659.3080217

Abstract

To mitigate excessive TLB misses in large-memory applications, techniques such as large pages, variable-length segments, and hardware (HW) coalescing increase the coverage of the limited hardware translation entries by exploiting contiguous memory allocation. However, recent studies show that in non-uniform memory systems, using large pages often degrades performance, and that memory fragmentation makes allocating large chunks of memory more difficult. Although each prior technique favors its own best chunk size, the diverse contiguity of memory allocation in real systems cannot always provide the optimal chunk size for each technique. For such fragmented and diverse memory allocations, this paper proposes a novel HW-SW hybrid translation architecture that adapts efficiently to different memory mappings. In the proposed hybrid coalescing technique, the operating system encodes memory contiguity information in a subset of page table entries, called anchor entries. During address translation through the TLBs, an anchor entry provides the translation for the contiguous pages that follow it. Because a small number of anchor entries can cover a large portion of the virtual address space, TLB efficiency can improve significantly. The most important benefit of hybrid coalescing is its ability to change the coverage of an anchor entry dynamically to reflect the current contiguity of allocated memory. By using contiguity information set directly by the operating system, the technique provides scalable translation coverage improvements with minor hardware changes while preserving flexibility in memory allocation. Our experimental results show that, across diverse allocation scenarios with different distributions of contiguous memory chunks, the proposed scheme effectively reaps the potential translation coverage improvement from the existing contiguity.
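
The anchor-entry idea in the abstract can be illustrated with a small Python sketch. This is not the paper's implementation; it is a toy model under several assumptions that the abstract does not state: anchors are taken to sit at every `distance`-th virtual page (with `distance` a power of two), each anchor's contiguity count is capped at `distance`, and the names `AnchorEntry`, `build_anchors`, and `translate` are hypothetical. Page tables are modeled as plain dictionaries from virtual page number (VPN) to physical page number (PPN).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AnchorEntry:
    ppn: int         # physical page number mapped by the anchor page itself
    contiguity: int  # pages, starting at the anchor, that are mapped contiguously


def anchor_vpn(vpn: int, distance: int) -> int:
    """Round vpn down to its anchor (assumes distance is a power of two)."""
    return vpn & ~(distance - 1)


def build_anchors(page_table: Dict[int, int], distance: int) -> Dict[int, AnchorEntry]:
    """OS side (assumed): record how far contiguity extends after each anchor."""
    anchors: Dict[int, AnchorEntry] = {}
    for vpn, ppn in page_table.items():
        if vpn % distance != 0:
            continue  # only anchor-aligned pages carry contiguity information
        run = 1
        # Extend the run while the next virtual page maps to the next physical page.
        while run < distance and page_table.get(vpn + run) == ppn + run:
            run += 1
        anchors[vpn] = AnchorEntry(ppn=ppn, contiguity=run)
    return anchors


def translate(vpn: int,
              anchors: Dict[int, AnchorEntry],
              page_table: Dict[int, int],
              distance: int) -> Optional[int]:
    """Translation side (assumed): try the anchor first, else use the regular entry."""
    base = anchor_vpn(vpn, distance)
    anchor = anchors.get(base)
    if anchor is not None:
        offset = vpn - base
        if offset < anchor.contiguity:
            # The page lies inside the anchor's contiguous run.
            return anchor.ppn + offset
    # Outside the run (or no anchor): fall back to the per-page mapping.
    return page_table.get(vpn)
```

In this model, a single anchor entry serves every page inside its contiguous run, so fewer cached entries are needed for well-allocated regions. Changing `distance` and rebuilding the anchors is a rough stand-in for the dynamic coverage adjustment the abstract describes.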
