A performance portability framework for Python
- 3 June 2021
- conference paper
- Published by Association for Computing Machinery (ACM) in Proceedings of the ACM International Conference on Supercomputing
Abstract
Kokkos is a programming model for writing performance portable applications for all major high performance computing platforms. It provides abstractions for data management and common parallel operations, allowing developers to write portable high performance code with minimal knowledge of architecture-specific details. Kokkos is implemented as a heavily templated C++ library. However, C++ is not ideal for rapid prototyping and quick algorithmic exploration. An increasing number of developers use Python for scientific computing, machine learning, and data analytics. In this paper, we present a new Python framework, dubbed PyKokkos, for writing performance portable applications entirely in Python. PyKokkos provides Kokkos-like abstractions that are easier to use and more concise than the C++ interface. We implemented PyKokkos by building a translator from a subset of Python to C++ Kokkos and bridging necessary function calls via automatically generated Python bindings. PyKokkos is also compatible with NumPy, a widely used high performance Python library. By porting several existing Kokkos applications to PyKokkos, including ExaMiniMD (∼3k lines of code in C++), we show that PyKokkos can achieve efficient execution with low performance overhead.
Funding Information
- Department of Energy, National Nuclear Security Administration (DE-NA0003969)