Hybrid TLB Coalescing
- 24 June 2017
- journal article
- conference paper
- Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News
- Vol. 45 (2), 444-456
- https://doi.org/10.1145/3140659.3080217
Abstract
To mitigate excessive TLB misses in large-memory applications, techniques such as large pages, variable-length segments, and HW coalescing increase the coverage of limited hardware translation entries by exploiting contiguous memory allocation. However, recent studies show that in non-uniform memory systems, using large pages often leads to performance degradation, or allocating large chunks of memory becomes more difficult due to memory fragmentation. Although each of the prior techniques favors its own best chunk size, the diverse contiguity of memory allocation in real systems cannot always provide the optimal chunk size for each technique. Under such fragmented and diverse memory allocations, this paper proposes a novel HW-SW hybrid translation architecture, which can adapt to different memory mappings efficiently. In the proposed hybrid coalescing technique, the operating system encodes memory contiguity information in a subset of page table entries, called anchor entries. During address translation through TLBs, an anchor entry provides translation for the contiguous pages following the anchor entry. As a smaller number of anchor entries can cover a large portion of the virtual address space, the efficiency of the TLB can be significantly improved. The most important benefit of hybrid coalescing is its ability to change the coverage of the anchor entry dynamically, reflecting the current allocation contiguity status. By using the contiguity information directly set by the operating system, the technique can provide scalable translation coverage improvements with minor hardware changes, while allowing the flexibility of memory allocation. Our experimental results show that across diverse allocation scenarios with different distributions of contiguous memory chunks, the proposed scheme can effectively reap the potential translation coverage improvement from the existing contiguity.
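The anchor-entry idea in the abstract can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the paper's implementation: it assumes anchors sit at virtual page numbers aligned to a fixed spacing (the hypothetical `ANCHOR_DISTANCE`), that each anchor stores how many pages starting at the anchor are mapped to consecutive physical pages, and that a plain dictionary stands in for the page table. The names `build_anchors` and `translate` are illustrative and do not come from the source.

```python
from typing import Dict, Optional

# Hypothetical spacing between anchors, in pages (an assumption for this sketch).
ANCHOR_DISTANCE = 8


def build_anchors(page_table: Dict[int, int],
                  distance: int = ANCHOR_DISTANCE) -> Dict[int, int]:
    """Model the OS side: for every mapped VPN aligned to `distance`,
    record how many pages starting there map to consecutive PPNs."""
    anchors: Dict[int, int] = {}
    for vpn, ppn in page_table.items():
        if vpn % distance != 0:
            continue  # only aligned entries act as anchors in this sketch
        run = 1
        # Extend the run while the next virtual page maps to the next physical page.
        while page_table.get(vpn + run) == ppn + run:
            run += 1
        anchors[vpn] = run
    return anchors


def translate(vpn: int,
              page_table: Dict[int, int],
              anchors: Dict[int, int],
              distance: int = ANCHOR_DISTANCE) -> Optional[int]:
    """Model the lookup side: try the nearest preceding anchor first,
    then fall back to the regular page table entry."""
    anchor_vpn = vpn - (vpn % distance)
    offset = vpn - anchor_vpn
    if offset < anchors.get(anchor_vpn, 0):
        # The anchor covers this page: PPN is the anchor's PPN plus the offset.
        return page_table[anchor_vpn] + offset
    return page_table.get(vpn)  # None models an unmapped page


# Pages 0-2 are contiguous, page 3 breaks the run, pages 8-9 are contiguous.
page_table = {0: 100, 1: 101, 2: 102, 3: 200, 8: 50, 9: 51}
anchors = build_anchors(page_table)
print(anchors)
print(translate(2, page_table, anchors))  # served through the anchor at VPN 0
print(translate(3, page_table, anchors))  # falls back to its own entry
```

In this model, a single cached anchor can stand in for every page in its contiguous run, which is the coverage benefit the abstract describes. Rebuilding the anchors with a different `distance` is one way to picture the dynamic coverage adjustment, although the actual policy for choosing it is not given in the abstract.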