GEVO
- 25 November 2020
- journal article
- research article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Architecture and Code Optimization
- Vol. 17 (4), 1-28
- https://doi.org/10.1145/3418055
Abstract
GPUs are a key enabler of the revolution in machine learning and high-performance computing, functioning as de facto co-processors to accelerate large-scale computation. As the programming stack and tool support have matured, GPUs have also become accessible to programmers who may lack detailed knowledge of the underlying architecture and therefore fail to fully leverage the GPU’s computation power. GEVO (Gpu optimization using EVOlutionary computation) is a tool for automatically discovering optimization opportunities and tuning the performance of GPU kernels in the LLVM representation. GEVO uses population-based search to find edits to GPU code compiled to LLVM-IR that improve performance on desired criteria while retaining required functionality. We demonstrate that GEVO improves the execution time of general-purpose GPU programs and machine learning (ML) models on the NVIDIA Tesla P100. For the Rodinia benchmarks, GEVO improves GPU kernel runtime performance by an average of 49.48% and by as much as 412% over the fully compiler-optimized baseline. If kernel output accuracy is relaxed to tolerate up to 1% error, GEVO can find kernel variants that outperform the baseline by an average of 51.08%. For the ML workloads, GEVO achieves kernel performance improvement for SVM on the MNIST handwriting recognition (3.24×) and the a9a income prediction (2.93×) datasets with no loss of model accuracy. GEVO achieves 1.79× kernel performance improvement on image classification using ResNet18/CIFAR-10, with less than 1% model accuracy reduction.
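The population-based search described in the abstract can be illustrated with a toy sketch. This is not GEVO's implementation: the "kernel" here is a short list of arithmetic steps rather than LLVM-IR, the cost model (one unit per step) stands in for measured GPU runtime, and the names `BASELINE`, `run`, `cost`, `is_valid`, `mutate`, and `evolve` are hypothetical. The sketch only shows the general shape of the idea: mutate candidates, discard those that break required functionality (optionally within an error tolerance), and keep the cheapest survivors.

```python
import random

# Toy stand-in for a kernel: a list of (op, operand) steps applied to an input.
# GEVO edits LLVM-IR; this representation is an assumption for illustration.
BASELINE = [("add", 3), ("mul", 1), ("add", 0), ("mul", 2), ("add", -3), ("add", 3)]
TEST_INPUTS = [0.0, 1.0, 2.5, -4.0]


def run(program, x):
    """Execute the toy program on one input value."""
    for op, k in program:
        x = x + k if op == "add" else x * k
    return x


def cost(program):
    """Hypothetical cost model: one time unit per step."""
    return len(program)


def is_valid(program, tolerance=0.0):
    """Check that outputs match the baseline within a relative tolerance."""
    for x in TEST_INPUTS:
        want = run(BASELINE, x)
        got = run(program, x)
        if abs(got - want) > tolerance * max(1.0, abs(want)):
            return False
    return True


def mutate(program, rng):
    """Apply one random edit: delete a step, swap two steps, or copy a step."""
    child = list(program)
    kind = rng.choice(["delete", "swap", "copy"])
    if kind == "delete" and len(child) > 1:
        del child[rng.randrange(len(child))]
    elif kind == "swap" and len(child) > 1:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    else:
        step = child[rng.randrange(len(child))]
        child.insert(rng.randrange(len(child) + 1), step)
    return child


def evolve(generations=50, pop_size=20, tolerance=0.0, seed=0):
    """Elitist population search: keep the cheapest valid variants each round."""
    rng = random.Random(seed)
    population = [list(BASELINE) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        candidates = population + [c for c in children if is_valid(c, tolerance)]
        candidates.sort(key=cost)
        population = candidates[:pop_size]
    return population[0]


best = evolve()
print(cost(BASELINE), cost(best), best)
```

Because the population starts from the baseline and invalid children are discarded, the returned variant is never worse than the baseline under this cost model. Passing a nonzero `tolerance` (for example `0.01`) loosely mirrors the relaxed-accuracy setting mentioned in the abstract, again only as an analogy.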
Funding Information
- Defense Advanced Research Projects Agency (FA8750-19C-0003)
- National Science Foundation (CCF-1618039, SHF-1652132, CCF-1908633)
- Air Force Research Laboratory (FA8750-19-1-0501)