FINN-R
- Research article (journal article), published 30 September 2018
- Published by the Association for Computing Machinery (ACM) in ACM Transactions on Reconfigurable Technology and Systems
- Vol. 11, No. 3, pp. 1-23
- https://doi.org/10.1145/3242897
Abstract
Convolutional Neural Networks have rapidly become the most successful machine-learning algorithm, enabling ubiquitous machine vision and intelligent decisions on even embedded computing systems. While the underlying arithmetic is structurally simple, compute and memory requirements are challenging. One of the promising opportunities is leveraging reduced-precision representations for inputs, activations, and model parameters. The resulting scalability in performance, power efficiency, and storage footprint provides interesting design compromises in exchange for a small reduction in accuracy. FPGAs are ideal for exploiting low-precision inference engines leveraging custom precisions to achieve the required numerical accuracy for a given application. In this article, we describe the second generation of the FINN framework, an end-to-end tool that enables design-space exploration and automates the creation of fully customized inference engines on FPGAs. Given a neural network description, the tool optimizes for given platforms, design targets, and a specific precision. We introduce formalizations of resource cost functions and performance predictions and elaborate on the optimization algorithms. Finally, we evaluate a selection of reduced-precision neural networks ranging from CIFAR-10 classifiers to YOLO-based object detection on a range of platforms including PYNQ and AWS F1, demonstrating unprecedented measured throughput at 50 TOp/s on AWS F1 and 5 TOp/s on embedded devices.
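To make the abstract's notion of "reduced-precision representations" concrete, the sketch below shows one common form of uniform symmetric quantization, mapping floating-point weights to a small signed integer range. This is a minimal illustration of the general technique, not FINN-R's actual quantization scheme; the function name and the 2-bit example are assumptions for demonstration.

```python
import numpy as np

def quantize_uniform(x, bits):
    """Uniformly quantize x to signed integers of the given bit width.

    The largest magnitude in x is mapped to the top quantization level;
    dequantization is simply q * scale. (Illustrative sketch only.)
    """
    levels = 2 ** (bits - 1) - 1            # e.g. 1 level for 2-bit, 127 for 8-bit
    scale = np.max(np.abs(x)) / levels      # one scale factor for the whole tensor
    q = np.round(x / scale).astype(np.int8) # integer codes stored on-chip
    return q, scale

# Toy weight vector quantized to 2 bits: values collapse to {-1, 0, +1} codes.
weights = np.array([0.31, -0.74, 0.05, 0.52])
q, scale = quantize_uniform(weights, bits=2)
```

At very low bit widths the multiply-accumulate hardware shrinks dramatically (binary and ternary weights reduce multiplications to additions and sign flips), which is the source of the throughput and power-efficiency scaling the abstract describes.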
Funding Information
- National Science Foundation (1717213)