Correct and efficient work-stealing for weak memory models

23 February 2013

conference paper
conference paper
Published by Association for Computing Machinery (ACM) in Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '13

Vol. 48 (8), 69-80
https://doi.org/10.1145/2442516.2442524

Abstract

International audienceChase and Lev's concurrent deque is a key data structure in shared- memory parallel programming and plays an essential role in work- stealing schedulers. We provide the first correctness proof of an optimized implementation of Chase and Lev's deque on top of the POWER and ARM architectures: these provide very relaxed mem- ory models, which we exploit to improve performance but consider- ably complicate the reasoning. We also study an optimized x86 and a portable C11 implementation, conducting systematic experiments to evaluate the impact of memory barrier optimizations. Our results demonstrate the benefits of hand tuning the deque code when run- ning on top of relaxed memory models

Keywords

This publication has 9 references indexed in Scilit:

An Axiomatic Memory Model for POWER Multiprocessors
Lecture Notes in Computer Science, 2012
Understanding POWER multiprocessors
Published by Association for Computing Machinery (ACM) ,2011
x86-TSO
Communications of the ACM, 2010
Idempotent work stealing
Published by Association for Computing Machinery (ACM) ,2009
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
Lecture Notes in Computer Science, 2009
KAAPI
Published by Association for Computing Machinery (ACM) ,2007
Dynamic circular work-stealing deque
Published by Association for Computing Machinery (ACM) ,2005
Scheduling multithreaded computations by work stealing
Journal of the ACM, 1999
The implementation of the Cilk-5 multithreaded language
Published by Association for Computing Machinery (ACM) ,1998

Cited by 28 articles