DUET: Boosting Deep Neural Network Efficiency on Dual-Module Architecture

Abstract
Deep Neural Networks (DNNs) drive mainstream Machine Learning applications. However, deploying DNNs on modern hardware under stringent latency requirements and energy constraints is challenging because of the compute-intensive and memory-intensive execution patterns of various DNN models. We propose an algorithm-architecture co-design to boost DNN execution efficiency. Leveraging the noise resilience of the nonlinear activation functions in DNNs, we propose dual-module processing, which uses approximate modules learned from the original DNN layers to compute insensitive activations, saving the expensive computations and data accesses that accurate processing of those activations would otherwise require. We then design an Executor-Speculator dual-module architecture that supports balanced execution and memory-access reduction. With acceptable degradation in model inference quality, our accelerator design achieves a 2.24x speedup and a 1.97x energy-efficiency improvement for compute-bound Convolutional Neural Networks (CNNs) and memory-bound Recurrent Neural Networks (RNNs).
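To make the dual-module idea concrete, the following is a minimal sketch (not the paper's implementation) of speculative skipping at a single fully connected ReLU layer. A cheap speculator estimates every pre-activation; outputs whose estimates fall safely below the ReLU threshold are treated as insensitive and kept from the speculator, so the accurate executor only computes the remaining rows. The low-rank speculator W_lo, the margin tau, and the skip criterion are illustrative assumptions; in the paper the approximate module is learned from the original layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Accurate "executor" layer: full-precision weights.
W = rng.standard_normal((256, 256)).astype(np.float32)

# Approximate "speculator" module: a rank-16 proxy for W. The paper learns
# its approximate modules offline; a truncated SVD merely stands in here.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
rank = 16
W_lo = (U[:, :rank] * S[:rank]) @ Vt[:rank, :]

def dual_module_layer(x, tau=0.5):
    """Compute y = relu(W @ x) with speculative skipping.

    The speculator estimates all pre-activations cheaply; rows whose
    estimate is below -tau are assumed to be zeroed by ReLU (insensitive)
    and are skipped, so the executor computes only the sensitive rows.
    """
    approx = W_lo @ x                      # cheap speculative pass
    sensitive = approx > -tau              # rows the executor must compute
    y = np.zeros_like(approx)
    y[sensitive] = np.maximum(W[sensitive] @ x, 0.0)  # accurate executor
    return y, sensitive.mean()

x = rng.standard_normal(256).astype(np.float32)
y_exact = np.maximum(W @ x, 0.0)
y_duet, frac = dual_module_layer(x)
print(f"executed rows: {frac:.0%}, max abs error: {np.abs(y_duet - y_exact).max():.3f}")
```

Raising tau skips fewer rows and lowers the error; lowering it saves more executor work at the cost of accuracy, which mirrors the quality-efficiency trade-off described in the abstract.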
