HeMem
Open Access
- 26 October 2021
- conference paper
- Published by Association for Computing Machinery (ACM)
Abstract
High-capacity non-volatile memory (NVM) is a new main memory tier. Tiered DRAM+NVM servers increase total memory capacity by up to 8x, but can diminish memory bandwidth by up to 7x and inflate latency by up to 63% if not managed well. We study existing hardware and software tiered memory management systems on the recently available Intel Optane DC NVM with big data applications and find that no existing system maximizes application performance on real NVM. Based on our findings, we present HeMem, a tiered main memory management system designed from scratch for commercially available NVM and the big data applications that use it. HeMem manages tiered memory asynchronously, batching and amortizing memory access tracking, migration, and associated TLB synchronization overheads. HeMem monitors application memory use by sampling memory access via CPU events, rather than page tables. This allows HeMem to scale to terabytes of memory, keeping small and ephemeral data structures in fast memory, and allocating scarce, asymmetric NVM bandwidth according to access patterns. Finally, HeMem is flexible by placing per-application memory management policy at user-level. On a system with Intel Optane DC NVM, HeMem outperforms hardware, OS, and PL-based tiered memory management, providing up to 50% runtime reduction for the GAP graph processing benchmark, 13% higher throughput for TPC-C on the Silo in-memory database, 16% lower tail-latency under performance isolation for a key-value store, and up to 10x less NVM wear than the next best solution, without application modification.
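The sampling-and-batching idea in the abstract can be illustrated with a toy model. What follows is a minimal sketch under stated assumptions, not HeMem's implementation: the class name `TieredMemorySketch`, the 4 KiB page size, the `HOT_THRESHOLD` value, and the promote/demote policy are all hypothetical and chosen only for illustration. Sampled accesses are counted per page, and a separate batch step decides which pages belong in the fast tier.

```python
from collections import Counter

PAGE_SIZE = 4096      # assumed page size in bytes (hypothetical)
HOT_THRESHOLD = 2     # assumed: samples needed before a page counts as hot


class TieredMemorySketch:
    """Toy model: pages live in NVM unless a batch step places them in DRAM."""

    def __init__(self, dram_pages):
        self.dram_pages = dram_pages   # DRAM capacity, in pages
        self.in_dram = set()           # page numbers currently in DRAM
        self.samples = Counter()       # sampled access count per page

    def record_sample(self, address):
        # Stand-in for one sampled memory access reported by a CPU event.
        self.samples[address // PAGE_SIZE] += 1

    def migrate_batch(self):
        # Meant to run off the application's critical path: rank pages by
        # sample count, keep the hottest ones that fit in DRAM, and apply
        # all moves together so per-page overheads are amortized.
        hot = [p for p, n in self.samples.most_common() if n >= HOT_THRESHOLD]
        target = set(hot[: self.dram_pages])
        promoted = target - self.in_dram
        demoted = self.in_dram - target   # simplification: not hot means demoted
        self.in_dram = target
        self.samples.clear()              # start a fresh sampling interval
        return sorted(promoted), sorted(demoted)


mem = TieredMemorySketch(dram_pages=2)
for addr in [0, 100, 4096, 4096, 4096, 8192, 8192, 12288]:
    mem.record_sample(addr)
first_batch = mem.migrate_batch()
```

In this toy run, pages 0, 1, and 2 all reach the threshold, but only two fit in DRAM, so the most-sampled page and the first of the tied pages are promoted. A real system would also have to copy page contents and synchronize TLBs, which this sketch leaves out.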
Funding Information
- NSF (National Science Foundation) (1719061, 2008884)