Proteus
- 1 June 2016
- conference paper
- Published by Association for Computing Machinery (ACM)
- pp. 23:1-23:12
- https://doi.org/10.1145/2925426.2926294
Abstract
This work exploits the tolerance of Deep Neural Networks (DNNs) to reduced-precision numerical representations and, specifically, their recently demonstrated ability to tolerate a different precision per layer while maintaining accuracy. This flexibility enables improvements over conventional DNN implementations that use a single, uniform representation. This work proposes Proteus, which reduces the data traffic and storage footprint needed by DNNs, resulting in reduced energy and improved area efficiency for DNN implementations. Proteus uses a different representation per layer for both the data (neurons) and the weights (synapses) processed by DNNs. Proteus is a layered extension over existing DNN implementations that converts between the numerical representation used by the DNN execution engines and the shorter, layer-specific fixed-point representation used when reading and writing data values to memory, be it on-chip buffers or off-chip memory. Proteus uses a novel memory layout for DNN data, enabling a simple, low-cost, and low-energy conversion unit. We evaluate Proteus as an extension to a state-of-the-art accelerator [7] which uses a uniform 16-bit fixed-point representation. On five popular DNNs, Proteus reduces data traffic among layers by 43% on average while maintaining accuracy within 1% even when compared to a single-precision floating-point implementation. As a result, Proteus improves energy by 15% with no performance loss. Proteus also reduces the data footprint by at least 38% and hence the amount of on-chip buffering needed, resulting in an implementation that requires 20% less area overall. These area savings can be used to improve cost by building smaller chips, to process larger DNNs for the same on-chip area, or to incorporate an additional three execution engines, increasing peak performance bandwidth by 18%.
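The core mechanism the abstract describes is converting at the memory boundary between the engine's uniform 16-bit fixed-point format and a shorter, layer-specific fixed-point format, with short values packed densely in memory. The sketch below is an illustrative reconstruction of that idea, not the paper's implementation: the bit widths, function names, and packing scheme are assumptions chosen for clarity.

```python
# Illustrative sketch of per-layer fixed-point storage (NOT the paper's
# actual conversion unit or memory layout). Each layer stores values in
# a short signed fixed-point format; values are packed densely in memory
# and expanded back to reals (standing in for the 16-bit engine format)
# when read.

def quantize(value, total_bits, frac_bits):
    """Round a real value to a signed fixed-point code of `total_bits`
    width with `frac_bits` fractional bits, saturating on overflow."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, round(value * (1 << frac_bits))))

def dequantize(code, frac_bits):
    """Convert a signed fixed-point code back to a real value."""
    return code / (1 << frac_bits)

def pack(codes, bits):
    """Pack signed `bits`-wide codes densely into one integer word,
    mimicking a compact memory layout for short values."""
    mask = (1 << bits) - 1
    word = 0
    for i, c in enumerate(codes):
        word |= (c & mask) << (i * bits)
    return word

def unpack(word, bits, count):
    """Inverse of pack: recover the signed codes, sign-extending."""
    mask = (1 << bits) - 1
    out = []
    for i in range(count):
        c = (word >> (i * bits)) & mask
        if c >= 1 << (bits - 1):
            c -= 1 << bits
        out.append(c)
    return out

# Hypothetical layer whose values tolerate 10 bits (8 fractional)
# instead of 16 -- a 37.5% traffic reduction for that layer's data.
layer_bits, layer_frac = 10, 8
vals = [0.5, -1.25, 0.0078125, 1.99]
codes = [quantize(v, layer_bits, layer_frac) for v in vals]
word = pack(codes, layer_bits)
restored = [dequantize(c, layer_frac)
            for c in unpack(word, layer_bits, len(vals))]
```

The round trip recovers each value to within the layer's quantization step (here 1/256), which is the kind of bounded error the paper shows DNN layers can tolerate.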
This publication has 19 references indexed in Scilit:
- ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, 2015
- Fixed point optimization of deep convolutional neural networks for object recognition. Published by Institute of Electrical and Electronics Engineers (IEEE), 2015
- DESTINY: A Tool for Modeling Emerging 3D NVM and eDRAM caches. Published by EDAA, 2015
- DianNao. ACM SIGPLAN Notices, 2014
- Mitosis Detection in Breast Cancer Histology Images with Deep Neural Networks. Published by Springer Science and Business Media LLC, 2013
- Base-delta-immediate compression. Published by Association for Computing Machinery (ACM), 2012
- A fixed point implementation of the backpropagation learning algorithm. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998
- Using simulations of reduced precision arithmetic to design a neuro-microprocessor. Journal of Signal Processing Systems, 1993
- Finite precision error analysis of neural network hardware implementations. International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1993