moTuner

Abstract
Arithmetic operators are now used in a wide spectrum of domains, including artificial intelligence, data analytics, and scientific computing. Meanwhile, specialized hardware components that enable low-precision computing are increasingly deployed in GPUs and accelerators. While promising to boost performance, accelerating operators on such hardware requires manually tuning the mixed-precision knobs to balance performance and accuracy, which can be extremely challenging in practice. To address this issue, we present moTuner, an automatic framework for efficiently tuning mixed-precision operators. moTuner works at the compiler level to automatically enable mixed-precision computation, without requiring any manual modification of the source code and/or the operator library, thus significantly alleviating the programming burden. Because it is implemented in the compilation phase, moTuner is widely applicable with little library-specific effort. Further, moTuner adopts an optimized search strategy to effectively narrow down the configuration space during tuning. Evaluations on GEMM operators and real applications demonstrate that moTuner achieves performance improvements of up to 3.13x and 1.15x respectively, while guaranteeing considerably high accuracy.
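To illustrate the trade-off the abstract describes, below is a minimal, illustrative sketch in Python/NumPy of a mixed-precision "knob" for a GEMM operator: each candidate precision is tried from fastest to slowest and the first one whose error stays within a budget is kept. The names (`gemm_error`, `pick_precision`, `error_budget`) are hypothetical and not part of moTuner, which operates at the compiler level on GPU/accelerator low-precision units rather than via NumPy.

```python
import numpy as np

def gemm_error(a, b, dtype):
    """Run GEMM at the given precision and return the relative error vs. an fp64 reference."""
    ref = a.astype(np.float64) @ b.astype(np.float64)
    approx = (a.astype(dtype) @ b.astype(dtype)).astype(np.float64)
    return np.linalg.norm(approx - ref) / np.linalg.norm(ref)

def pick_precision(a, b, error_budget=1e-3):
    """Try candidate precisions from fastest (lowest) to slowest (highest)
    and keep the first one whose accuracy loss stays within the budget."""
    for dtype in (np.float16, np.float32, np.float64):
        if gemm_error(a, b, dtype) <= error_budget:
            return dtype
    return np.float64

rng = np.random.default_rng(0)
a = rng.standard_normal((512, 512))
b = rng.standard_normal((512, 512))
print(pick_precision(a, b))  # e.g. float32 for this error budget
```

A real tuner faces many such knobs at once (one per operator call site), which is why an optimized search strategy to prune the configuration space, as the abstract notes, matters.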
Funding Information
  • National Natural Science Foundation of China (62102465, U1811461)
  • Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211)
  • Guangdong Natural Science Foundation (2018B030312002)
  • Major Program of Guangdong Basic and Applied Research (2019B030302002)
  • CCF-Baidu Open Fund (CCF-BAIDU OF2021032)
