An Overview of Efficient Interconnection Networks for Deep Neural Network Accelerators
Open Access
- 9 September 2020
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Journal on Emerging and Selected Topics in Circuits and Systems
- Vol. 10 (3), 268-282
- https://doi.org/10.1109/jetcas.2020.3022920
Abstract
Deep Neural Networks (DNNs) have shown significant advantages in many domains, such as pattern recognition, prediction, and control optimization. The demand for edge computing in the Internet-of-Things (IoT) era has motivated many kinds of computing platforms to accelerate DNN operations. However, because of the massive parallel processing involved, the performance of current large-scale artificial neural networks is often limited by huge communication overheads and storage requirements. As a result, efficient interconnection and data-movement mechanisms for future on-chip artificial intelligence (AI) accelerators are worthy of study. A large body of current research aims to find efficient on-chip interconnections that achieve low-power and high-bandwidth DNN computing. This paper provides a comprehensive investigation of recent advances in efficient on-chip interconnection and the design methodology of DNN accelerators. First, we provide an overview of the different interconnection methods used in DNN accelerators. Then, interconnection methods in non-ASIC DNN accelerators are discussed. Furthermore, with flexible interconnection, a DNN accelerator can support different computing flows, which increases computing flexibility. With this motivation, reconfigurable DNN computing with flexible on-chip interconnection is also investigated. Finally, we examine emerging interconnection technologies (e.g., in/near-memory processing) for DNN accelerator design. With this article, readers will be able to: 1) understand interconnection design for DNN accelerators; 2) evaluate DNNs with different on-chip interconnections; and 3) become familiar with the trade-offs among different interconnections.
Funding Information
- Ministry of Science and Technology Taiwan (MOST 109-2221-E-110-062)