Understanding and utilizing hardware transactional memory capacity
- 22 June 2021
- conference paper
- Published by Association for Computing Machinery (ACM) in Proceedings of the 2021 ACM SIGPLAN International Symposium on Memory Management
Abstract
Hardware transactional memory (HTM) provides a simpler programming model than lock-based synchronization. However, HTM has limits that mean that transactions may suffer costly capacity aborts. Understanding HTM capacity is therefore critical. Unfortunately, crucial implementation details are undisclosed. In practice HTM capacity can manifest in puzzling ways. It is therefore unsurprising that the literature reports results that appear to be highly contradictory, reporting capacities that vary by nearly three orders of magnitude. We conduct an in-depth study into the causes of HTM capacity aborts using four generations of Intel's Transactional Synchronization Extensions (TSX). We identify the apparent contradictions among prior work, and shed new light on the causes of HTM capacity aborts. In doing so, we reconcile the apparent contradictions. We focus on how replacement policies and the status of the cache can affect HTM capacity. One source of surprising behavior appears to be the cache replacement policies used by the processors we evaluated. Both invalidating the cache and warming it up with the transactional working set can significantly improve the read capacity of transactions across the microarchitectures we tested. A further complication is that a physically indexed LLC will typically yield only half the total LLC capacity. We found that methodological differences in the prior work led to different warmup states and thus to their apparently contradictory findings. This paper deepens our understanding of how the underlying implementation and cache behavior affect the apparent capacity of HTM. Our insights on how to increase the read capacity of transactions can be used to optimize HTM applications, particularly those with large read-mostly transactions, which are common in the context of optimistic parallelization.
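The remark about a physically indexed LLC can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's methodology. It assumes 64-byte lines, 4 KiB pages, a hypothetical 8192-set, 16-way cache, and that each virtual page lands on a uniformly random page color. It also assumes that the read set stops growing as soon as one group of sets would need more than `ways` pages. Slice hashing and replacement policy are ignored, and the function names are made up for this example.

```python
import random

LINE = 64    # bytes per cache line (assumed)
PAGE = 4096  # bytes per page (assumed 4 KiB pages)


def lines_before_overflow(num_sets, ways, rng):
    """Lines read from whole pages before some set group would overflow.

    Each page covers PAGE // LINE consecutive sets, so the cache has
    num_sets // (PAGE // LINE) page colors. A color holds at most `ways`
    pages; the next page mapped to a full color would evict a line that
    is already in the read set.
    """
    lines_per_page = PAGE // LINE
    colors = num_sets // lines_per_page
    load = [0] * colors
    pages = 0
    while True:
        color = rng.randrange(colors)  # random physical placement (assumed)
        if load[color] == ways:
            return pages * lines_per_page
        load[color] += 1
        pages += 1


def median_fraction(num_sets=8192, ways=16, trials=200, seed=1):
    """Median usable fraction of the nominal capacity over several trials."""
    rng = random.Random(seed)
    total_lines = num_sets * ways
    results = sorted(
        lines_before_overflow(num_sets, ways, rng) for _ in range(trials)
    )
    return results[trials // 2] / total_lines


if __name__ == "__main__":
    print(f"median usable fraction: {median_fraction():.2f}")
```

In this model the usable read set can never exceed the nominal capacity, and it ends at the first color that fills up. This gives some intuition for why measured capacity can fall well short of the full LLC size when page placement is outside the program's control.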
Funding Information
- Australian Research Council (DP190103367)
- National Science Foundation (XPS-1629126)
This publication has 33 references indexed in Scilit:
- Quantitative comparison of hardware transactional memory for Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8. Published by Association for Computing Machinery (ACM), 2015
- Performance and Energy Analysis of the Restricted Transactional Memory Implementation on Haswell. Published by Institute of Electrical and Electronics Engineers (IEEE), 2014
- Reverse engineering of cache replacement policies in Intel microprocessors and their evaluation. Published by Institute of Electrical and Electronics Engineers (IEEE), 2014
- Software Transactional Memory: Why Is It Only a Research Toy? Queue, 2008
- Concurrent GC leveraging transactional memory. Published by Association for Computing Machinery (ACM), 2008
- Making the fast case common and the uncommon case simple in unbounded transactional memory. Published by Association for Computing Machinery (ACM), 2007
- Hybrid transactional memory. Published by Association for Computing Machinery (ACM), 2006
- Performance evaluation of cache replacement policies for the SPEC CPU2000 benchmark suite. Published by Association for Computing Machinery (ACM), 2004
- Language support for lightweight transactions. Published by Association for Computing Machinery (ACM), 2003
- Expected Length of the Longest Probe Sequence in Hash Code Searching. Journal of the ACM, 1981