An HTM-based update-side synchronization for RCU on NUMA systems

Abstract
Read-copy update (RCU) can provide ideal scalability for read-mostly workloads, but some believe that it provides only poor performance for updates. This belief is due to the lack of RCU-centric update synchronization mechanisms. RCU instead works with a range of update-side mechanisms, such as locking. In fact, many developers embrace simplicity by using global locking. Logging, hardware transactional memory, or fine-grained locking can provide better scalability, but each of these approaches has limitations, such as imposing overhead on readers or poor scalability on non-uniform memory access (NUMA) systems, mainly due to their lack of NUMA-aware design principles. This paper introduces an RCU extension (RCX) that provides highly scalable RCU updates on NUMA systems while retaining RCU's read-side benefits. RCX is a software-based synchronization mechanism combining hardware transactional memory (HTM) and traditional locking based on our NUMA-aware design principles for RCU. Micro-bench-marks on a NUMA system having 144 hardware threads show RCX has up to 22.6 times better performance and up to 145 times lower HTM abort rates compared to a state-of-the-art RCU/HTM combination. To demonstrate the effectiveness and applicability of RCX, we have applied RCX to parallelize some of the Linux kernel memory management system and an in-memory database system. The optimized kernel and the database show up to 24 and 17 times better performance compared to the original version, respectively.
Funding Information
  • National Research Foundation of Korea (NRF) by the Korea government (MSIP) (NRF-2015M3C4A7065646)

This publication has 31 references indexed in Scilit: