High-throughput Near-Memory Processing on CNNs with 3D HBM-like Memory
- 28 June 2021
- research article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Design Automation of Electronic Systems
- Vol. 26 (6), 1-20
- https://doi.org/10.1145/3460971
Abstract
This article presents a high-performance near-memory neural network (NN) accelerator architecture that utilizes the logic die of three-dimensional (3D) High Bandwidth Memory (HBM)-like memory. Because most previously reported 3D-memory-based near-memory NN accelerator designs used Hybrid Memory Cube (HMC) memory, we first identify the key differences between HBM and HMC from the perspective of near-memory NN accelerator design. One major difference is that HBM has centralized through-silicon-via (TSV) channels, whereas HMC has TSV channels distributed across separate vaults. Based on this observation, we introduce the Round-Robin Data Fetching and Groupwise Broadcast schemes, which exploit the centralized TSV channels to improve the data feeding rate to the processing elements. Using designs synthesized in a 28-nm CMOS technology, we evaluate the performance and energy consumption of the proposed architectures with various dataflow models. Experimental results show that the proposed schemes reduce runtime by 16.4–39.3% on average and energy consumption by 2.1–5.1% on average compared to conventional data fetching schemes.
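The two schemes named in the abstract can be illustrated at a high level. The following is a minimal sketch, not the paper's implementation: it assumes per-pseudo-channel request queues serialized over one shared (centralized) TSV channel in round-robin order, and a fixed item-to-group mapping for the broadcast; all function and PE names are hypothetical.

```python
# Hedged sketch of Round-Robin Data Fetching and Groupwise Broadcast.
# Not the paper's implementation; queue/PE names are illustrative only.
from collections import deque

def round_robin_fetch(channel_queues):
    """Interleave requests from per-pseudo-channel queues in round-robin
    order over one shared TSV channel; return the serialized fetch order."""
    queues = [deque(q) for q in channel_queues]
    order = []
    while any(queues):
        for q in queues:
            if q:
                order.append(q.popleft())
    return order

def groupwise_broadcast(data_items, pe_groups):
    """Deliver each fetched item to every PE in its target group
    (one transfer per group rather than one per PE). The item-to-group
    mapping here (round-robin by index) is an assumption of the sketch."""
    delivered = {pe: [] for group in pe_groups for pe in group}
    for i, item in enumerate(data_items):
        for pe in pe_groups[i % len(pe_groups)]:
            delivered[pe].append(item)
    return delivered

# Usage: two pseudo-channels feeding two PE groups of two PEs each.
order = round_robin_fetch([["a0", "a1"], ["b0", "b1"]])
print(order)  # -> ['a0', 'b0', 'a1', 'b1']
result = groupwise_broadcast(order, [["pe0", "pe1"], ["pe2", "pe3"]])
print(result["pe0"])  # -> ['a0', 'a1']
```

The round-robin order keeps the shared channel busy with data destined for different PE groups, which is the intuition behind improving the feeding rate when TSV bandwidth is centralized rather than partitioned per vault.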
Funding Information
- National Research Foundation of Korea (NRF-2019R1A5A1027055, NRF-2020R1A2C2004329)
- Institute of Information & Communications Technology Planning & Evaluation (2020-0-01309)
- Funded by the Korea government