High-throughput Near-Memory Processing on CNNs with 3D HBM-like Memory

28 June 2021

journal article
research article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Design Automation of Electronic Systems

Vol. 26 (6), 1-20
https://doi.org/10.1145/3460971

Abstract

This article discusses a high-performance near-memory neural network (NN) accelerator architecture that utilizes the logic die in three-dimensional (3D) High Bandwidth Memory (HBM)-like memory. As most previously reported 3D memory-based near-memory NN accelerator designs used Hybrid Memory Cube (HMC) memory, we first focus on identifying the key differences between HBM and HMC in terms of near-memory NN accelerator design. One of the major differences between the two 3D memories is that HBM has centralized through-silicon-via (TSV) channels, while HMC has distributed TSV channels for separate vaults. Based on this observation, we introduce the Round-Robin Data Fetching and Groupwise Broadcast schemes, which exploit the centralized TSV channels to improve the data feeding rate for the processing elements. Using synthesized designs in a 28-nm CMOS technology, we evaluate the performance and energy consumption of the proposed architectures with various dataflow models. Experimental results show that the proposed schemes reduce the runtime by 16.4–39.3% on average and the energy consumption by 2.1–5.1% on average compared to conventional data fetching schemes.
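
As a reading aid for the runtime figures in the abstract, the short sketch below converts a relative runtime reduction into the equivalent speedup factor. It is only illustrative arithmetic, not part of the article: the function name `speedup_from_runtime_reduction` is hypothetical, and it assumes each reduction is measured against the same conventional-fetching baseline runtime.

```python
def speedup_from_runtime_reduction(reduction_percent: float) -> float:
    """Convert a runtime reduction in percent into a speedup factor.

    If the new runtime is (1 - r) times the baseline runtime,
    the speedup is baseline / new = 1 / (1 - r).
    """
    if not 0.0 <= reduction_percent < 100.0:
        raise ValueError("reduction_percent must be in [0, 100)")
    return 1.0 / (1.0 - reduction_percent / 100.0)


if __name__ == "__main__":
    # Endpoints of the average runtime reduction reported in the abstract.
    for reduction in (16.4, 39.3):
        factor = speedup_from_runtime_reduction(reduction)
        print(f"{reduction}% less runtime is about a {factor:.2f}x speedup")
```

Under that assumption, the reported 16.4–39.3% average runtime reduction corresponds to a speedup of roughly 1.20x to 1.65x over conventional data fetching.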

Funding Information

National Research Foundation of Korea
Korea government (NRF-2019R1A5A1027055, NRF-2020R1A2C2004329, and 2020-0-01309)
Institute of Information & Communications Technology Planning & Evaluation

