Correct and efficient work-stealing for weak memory models

Abstract
International audienceChase and Lev's concurrent deque is a key data structure in shared- memory parallel programming and plays an essential role in work- stealing schedulers. We provide the first correctness proof of an optimized implementation of Chase and Lev's deque on top of the POWER and ARM architectures: these provide very relaxed mem- ory models, which we exploit to improve performance but consider- ably complicate the reasoning. We also study an optimized x86 and a portable C11 implementation, conducting systematic experiments to evaluate the impact of memory barrier optimizations. Our results demonstrate the benefits of hand tuning the deque code when run- ning on top of relaxed memory models

This publication has 9 references indexed in Scilit: