Dataflow-Architecture Co-Design for 2.5D DNN Accelerators using Wireless Network-on-Package

Abstract

Deep neural network (DNN) models continue to grow in size and complexity, demanding higher computational power to enable real-time inference. To meet these computational demands efficiently, hardware accelerators are being developed and deployed across scales. This naturally requires an efficient scale-out mechanism for increasing compute density as required by the application. 2.5D integration over an interposer has emerged as a promising solution, but as we show in this work, the limited interposer bandwidth and multiple hops in the Network-on-Package (NoP) can diminish the benefits of the approach. To cope with this challenge, we propose WIENNA, a wireless NoP-based 2.5D DNN accelerator. In WIENNA, the wireless NoP connects an array of DNN accelerator chiplets to the global buffer chiplet, providing high-bandwidth multicasting capabilities. We also identify the dataflow style that most efficiently exploits the wireless NoP's high-bandwidth multicasting capability on each layer. With modest area and power overheads, WIENNA achieves 2.2X-5.1X higher throughput and 38.2% lower energy than an interposer-based NoP design.

