Towards neural architecture-aware exploration of compiler optimizations in a deep learning {graph} compiler
- 17 May 2022
- conference paper
- Published by Association for Computing Machinery (ACM) in Proceedings of the 19th ACM International Conference on Computing Frontiers
Abstract
Deep Neural Networks (DNNs) form the basis for many existing and emerging applications. Many DL compilers analyze the computation graphs and apply various optimizations at different stages. These high-level optimizations are applied using compiler passes before feeding the resultant computation graph for low-level and hardware-specific optimizations. With advancements in DNN architectures and backend hardware, the search space of compiler optimizations has grown manifold. Also, the inclusion of passes without knowledge of the computation graph leads to increased execution time with only a slight influence on the intermediate representation. This paper presents preliminary results on 1) the relevance of pass selection and ordering in a DL compiler, 2) neural architecture-aware selection of optimization passes, and 3) pruning the search space for the phase selection problem in a DL compiler. We use TVM as the compiler to demonstrate the experimental results on Nvidia A100 and GeForce RTX 2080 GPUs, establishing the relevance of neural architecture-aware selection of optimization passes for DNNs in DL compilers. Experimental evaluation with seven models categorized into four architecturally different classes demonstrated performance gains for most neural networks. For ResNets, the average throughput increased by 24% and 32% for the TensorFlow and PyTorch frameworks, respectively. Additionally, we observed an average 15% decrease in compilation time for ResNets, 45% for MobileNet, and 54% for SSD-based models without impacting throughput. BERT models showed a dramatic improvement, with a 92% reduction in compile time.
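The abstract does not spell out how passes are selected, so the following is only a minimal sketch of the general idea of architecture-aware pass selection, not the authors' method. It assumes a hypothetical table (`PASS_TARGETS`) that maps each pass name to the operator types it can affect; the pass names echo common graph-level optimizations, but the mapping itself and the helper names `select_passes` and `ordering_count` are assumptions made for illustration. Passes whose target operators are absent from the model's graph are dropped, which shrinks the number of candidate orderings.

```python
import math

# Hypothetical table: pass name -> operator types the pass can affect.
# An empty set marks a pass that is treated as always relevant.
# This mapping is an illustrative assumption, not taken from the paper.
PASS_TARGETS = {
    "FoldConstant": set(),
    "FuseOps": set(),
    "SimplifyInference": {"batch_norm", "dropout"},
    "FoldScaleAxis": {"conv2d"},
    "CombineParallelConv2D": {"conv2d"},
    "CombineParallelDense": {"dense"},
}


def select_passes(graph_ops, pass_targets=PASS_TARGETS):
    """Keep passes that are always relevant or that target an operator in the graph."""
    ops = set(graph_ops)
    return [
        name
        for name, targets in pass_targets.items()
        if not targets or targets & ops
    ]


def ordering_count(num_passes):
    """Number of distinct orderings when every selected pass is applied exactly once."""
    return math.factorial(num_passes)


if __name__ == "__main__":
    # Hypothetical operator sets for a convolutional and a transformer-style model.
    resnet_like = {"conv2d", "batch_norm", "relu", "dense"}
    bert_like = {"dense", "softmax", "layer_norm"}

    for label, ops in [("resnet-like", resnet_like), ("bert-like", bert_like)]:
        chosen = select_passes(ops)
        print(label, chosen, ordering_count(len(chosen)))
```

Under these assumed mappings, the transformer-style operator set keeps three of the six passes, so the number of orderings to explore drops from 720 to 6. In a real compiler, the relevance of a pass would have to be established by analysis or measurement rather than by a hand-written table.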
Funding Information
- U.S. Office of the Under Secretary of Defense for Research and Engineering (OUSD(R&E)) (FA8750-15-2-0119)