mpi4py.futures: MPI-Based Asynchronous Task Execution for Python
- 29 November 2022
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Parallel and Distributed Systems
- Vol. 34 (2), 611-622
- https://doi.org/10.1109/tpds.2022.3225481
Abstract
We present mpi4py.futures, a lightweight, asynchronous task execution framework targeting the Python programming language and using the Message Passing Interface (MPI) for interprocess communication. mpi4py.futures follows the interface of the concurrent.futures package from the Python standard library and can be used as its drop-in replacement, while allowing applications to scale over multiple compute nodes. We discuss the design, implementation, and feature set of mpi4py.futures and compare its performance to other solutions on both shared and distributed memory architectures. On a shared-memory system, we show mpi4py.futures to consistently outperform Python's concurrent.futures with speedup ratios between 1.4X and 3.7X in throughput (tasks per second) and between 1.9X and 2.9X in bandwidth. On a Cray XC40 system, we compare mpi4py.futures to Dask, a well-known Python parallel computing package. Although we note more varied results, we show mpi4py.futures to outperform Dask in most scenarios.
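The drop-in claim in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the helper `run_squares` and the task function `square` are hypothetical names chosen for illustration. The helper accepts any executor class that follows the concurrent.futures interface, so the same code should run with a standard-library executor or, assuming mpi4py is installed, with `MPIPoolExecutor` from mpi4py.futures.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Hypothetical task; defined at module level so worker processes can import it."""
    return x * x


def run_squares(executor_cls, values, **kwargs):
    """Run `square` through any executor class with the concurrent.futures interface.

    Returns the result of one submitted task and the list of mapped results.
    """
    with executor_cls(**kwargs) as executor:
        # submit() returns a Future; result() blocks until the task has finished.
        future = executor.submit(square, 3)
        # map() yields results in the order of the inputs.
        mapped = list(executor.map(square, values))
        return future.result(), mapped


if __name__ == "__main__":
    # Standard-library executor, runnable without MPI.
    print(run_squares(ThreadPoolExecutor, range(5), max_workers=2))
    # prints (9, [0, 1, 4, 9, 16])

    # Assumption: with mpi4py installed, only the executor class changes:
    #   from mpi4py.futures import MPIPoolExecutor
    #   print(run_squares(MPIPoolExecutor, range(5)))
```

When the MPI executor is used, the mpi4py documentation describes launching the script through the package's command-line entry point, for example `mpiexec -n 4 python -m mpi4py.futures script.py`, where `script.py` is a placeholder file name; the exact launch mode depends on the MPI installation.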
Funding Information
- King Abdullah University of Science and Technology