DUET: Boosting Deep Neural Network Efficiency on Dual-Module Architecture
- 1 October 2020
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Deep Neural Networks (DNNs) drive mainstream Machine Learning applications. However, deploying DNNs on modern hardware under stringent latency requirements and energy constraints is challenging because many DNN models are compute-intensive and memory-intensive. We propose an algorithm-architecture co-design to boost DNN execution efficiency. Leveraging the noise resilience of nonlinear activation functions in DNNs, we propose dual-module processing that uses approximate modules learned from the original DNN layers to compute insensitive activations. This saves the expensive computations and data accesses that the original layers would otherwise spend on those insensitive activations. We then design an Executor-Speculator dual-module architecture with support for balanced execution and memory access reduction. With acceptable degradation in model inference quality, our accelerator design achieves a 2.24x speedup and a 1.97x energy efficiency improvement for compute-bound Convolutional Neural Networks (CNNs) and memory-bound Recurrent Neural Networks (RNNs).
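The dual-module idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the approximate module is a coarsely quantized copy of the weights (the paper learns its approximate modules), the activation is ReLU, and an output is treated as insensitive when its approximate pre-activation falls below a hypothetical `-threshold`. The names `quantize`, `dual_module_layer`, and `threshold` are all hypothetical.

```python
def relu(value):
    return value if value > 0.0 else 0.0


def matvec(rows, x):
    # Dense matrix-vector product on plain Python lists.
    return [sum(w * v for w, v in zip(row, x)) for row in rows]


def quantize(rows, step):
    # Hypothetical cheap approximation: snap each weight to a coarse grid.
    # This stands in for the learned approximate module and is an assumption.
    return [[round(w / step) * step for w in row] for row in rows]


def dual_module_layer(weights, approx_weights, x, threshold):
    """Return (outputs, number of outputs recomputed accurately)."""
    # Speculative pass: approximate pre-activations for every output.
    approx_pre = matvec(approx_weights, x)
    outputs = []
    recomputed = 0
    for row, approx_value in zip(weights, approx_pre):
        if approx_value < -threshold:
            # Treated as insensitive: keep the approximate result, which
            # ReLU maps to zero, and skip the accurate computation.
            outputs.append(relu(approx_value))
        else:
            # Treated as sensitive: recompute with the original weights.
            outputs.append(relu(sum(w * v for w, v in zip(row, x))))
            recomputed += 1
    return outputs, recomputed


# Toy inputs, chosen only for illustration.
weights = [[1.0, 2.0], [-3.0, 0.5]]
x = [1.0, 1.0]
outputs, recomputed = dual_module_layer(weights, quantize(weights, 1.0), x, 0.5)
```

In this toy case only one of the two outputs is recomputed with the original weights. A larger `threshold` makes the sketch more conservative: more outputs are recomputed accurately, trading saved work for fidelity.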