Exploiting Parallelism Opportunities with Deep Learning Frameworks
- 30 December 2020
- research article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Architecture and Code Optimization
- Vol. 18 (1), 1-23
- https://doi.org/10.1145/3431388
Abstract
State-of-the-art machine learning frameworks support a wide variety of design features to enable a flexible machine learning programming interface and to ease the programmability burden on machine learning developers. Identifying and using a performance-optimal setting in feature-rich frameworks, however, involves a non-trivial amount of performance profiling effort and often relies on domain-specific knowledge. This article takes a deep dive into analyzing the performance impact of key design features in a machine learning framework and quantifies the role of parallelism. The observations and insights distill into a simple set of guidelines that one can use to achieve much higher training and inference speedup. Across a diverse set of real-world deep learning models, the evaluation results show that the proposed performance tuning guidelines outperform the Intel and TensorFlow recommended settings by 1.30× and 1.38×, respectively.
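The parallelism settings the abstract refers to are typically exposed through environment variables read by TensorFlow and Intel's oneDNN/OpenMP runtime. The sketch below is a hypothetical helper illustrating those knobs; the default values here are placeholders for illustration, not the paper's recommended guidelines, which are workload-dependent.

```python
import os

def configure_parallelism(num_physical_cores, intra_op=None, inter_op=2):
    """Return (and apply) environment settings controlling framework parallelism.

    Hypothetical helper for illustration; the chosen values are placeholders,
    not the tuning guidelines proposed in the article.
    """
    # Intra-op parallelism: threads used inside a single operator's kernel.
    intra_op = intra_op or num_physical_cores
    settings = {
        # OpenMP thread count used by Intel oneDNN (formerly MKL-DNN) kernels.
        "OMP_NUM_THREADS": str(num_physical_cores),
        # Pin OpenMP threads to cores to reduce thread-migration overhead.
        "KMP_AFFINITY": "granularity=fine,compact,1,0",
        # Milliseconds a thread spins after a parallel region before sleeping.
        "KMP_BLOCKTIME": "1",
        # TensorFlow thread-pool sizes: intra-op (within one operator) and
        # inter-op (concurrent independent operators), read at startup.
        "TF_NUM_INTRAOP_THREADS": str(intra_op),
        "TF_NUM_INTEROP_THREADS": str(inter_op),
    }
    os.environ.update(settings)
    return settings

cfg = configure_parallelism(num_physical_cores=16)
```

These variables must be set before the framework initializes its thread pools; changing them after the first operator executes has no effect.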
Funding Information
- NSF (1533737)