OSKI: A library of automatically tuned sparse matrix kernels

Abstract
The Optimized Sparse Kernel Interface (OSKI) is a collection of low-level primitives that provide automatically tuned computational kernels on sparse matrices, for use by solver libraries and applications. These kernels include sparse matrix-vector multiply and sparse triangular solve, among others. The primary aim of this interface is to hide the complex decision-making process needed to tune the performance of a kernel implementation for a particular user's sparse matrix and machine, while also exposing the steps and potentially non-trivial costs of tuning at run-time. This paper provides an overview of OSKI, which is based on our research on automatically tuned sparse kernels for modern cache-based superscalar machines.

1. Goals and Motivation
We describe the Optimized Sparse Kernel Interface (OSKI), a collection of low-level primitives that provide automatically tuned computational kernels on sparse matrices, for use by solver libraries and applications. The kernels include sparse matrix-vector multiply (SpMV) and sparse triangular solve (SpTS), among others; "tuning" refers to the process of selecting the data structure and code transformations that lead to the fastest implementation of a kernel, given a machine and matrix. While conventional implementations of SpMV have historically run at 10% of machine peak or less, careful tuning can achieve up to 31% of peak and 4× speedups (1, Chap. 1). The challenge is that we must often defer tuning until run-time, since the matrix may be unknown until then. The need for run-time tuning differs from the case of dense kernels
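
To make the notion of a "conventional" SpMV concrete, the following is a minimal sketch of an untuned kernel computing y <- y + A*x. It assumes the matrix is stored in compressed sparse row (CSR) format, which the text above does not specify. The type `csr_matrix` and the function `csr_spmv` are hypothetical names chosen for illustration and are not part of the OSKI interface.

```c
#include <stddef.h>

/* Hypothetical CSR container (illustrative only, not an OSKI type).
 * Row i owns the nonzeros at positions row_ptr[i] .. row_ptr[i+1]-1. */
typedef struct {
    size_t n_rows;          /* number of matrix rows                    */
    const size_t *row_ptr;  /* length n_rows + 1                        */
    const size_t *col_idx;  /* column index of each stored nonzero      */
    const double *val;      /* value of each stored nonzero             */
} csr_matrix;

/* Untuned reference SpMV: y <- y + A*x.
 * x must have one entry per matrix column, y one entry per row. */
void csr_spmv(const csr_matrix *A, const double *x, double *y)
{
    for (size_t i = 0; i < A->n_rows; ++i) {
        double sum = y[i];
        /* One multiply-add per stored nonzero; x is read indirectly
         * through col_idx, so the access pattern depends on the matrix. */
        for (size_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
            sum += A->val[k] * x[A->col_idx[k]];
        y[i] = sum;
    }
}
```

In this sketch both the storage format and the loop structure are fixed when the code is written. Tuning, in the sense used above, would instead choose a data structure and code variant for the particular matrix and machine, which is why the choice may have to wait until the matrix is available at run-time.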
