Exploiting Parallelism Opportunities with Deep Learning Frameworks

30 December 2020

journal article
research article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Architecture and Code Optimization

Vol. 18 (1), 1-23
https://doi.org/10.1145/3431388

Abstract

State-of-the-art machine learning frameworks support a wide variety of design features that enable a flexible programming interface and ease the programmability burden on machine learning developers. Identifying and using a performance-optimal setting in such feature-rich frameworks, however, requires a non-trivial amount of performance profiling and often relies on domain-specific knowledge. This article analyzes in depth the performance impact of key design features in a machine learning framework and quantifies the role of parallelism. The observations and insights distill into a simple set of guidelines that one can use to speed up both training and inference. Across a diverse set of real-world deep learning models, the evaluation results show that the proposed performance tuning guidelines outperform the Intel and TensorFlow recommended settings by 1.30× and 1.38×, respectively.
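
To make the reported speedup factors concrete, the following is a minimal sketch of how one might time a workload under several framework settings and express the result as a speedup over a recommended baseline. It is not the article's methodology: the helper names (`time_setting`, `speedup`, `best_setting`), the setting labels, and the timings are all hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Dict


def time_setting(run: Callable[[], None], repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs of `run`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds: float, tuned_seconds: float) -> float:
    """Speedup of the tuned setting over the baseline (values > 1 mean faster)."""
    if baseline_seconds <= 0 or tuned_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / tuned_seconds


def best_setting(timings: Dict[str, float]) -> str:
    """Return the label of the setting with the smallest measured time."""
    return min(timings, key=timings.get)


# Hypothetical timings in seconds per training step; these are made-up
# numbers for illustration, not results from the article.
timings = {"recommended": 2.6, "candidate_a": 2.0, "candidate_b": 2.3}

winner = best_setting(timings)
factor = speedup(timings["recommended"], timings[winner])
print(f"{winner}: {factor:.2f}x over the recommended setting")
```

Under this definition, a factor of 1.30× means the tuned setting finishes the same work in 1/1.30 of the baseline time, or roughly 23% less time.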

Funding Information

NSF (1533737)
