DyFiP

Abstract
Filter pruning is one of the most effective ways to accelerate Convolutional Neural Networks (CNNs). Most existing work focuses on static pruning of CNN filters. Existing approaches to dynamic pruning of CNN filters switch between different branches of a CNN or exit early depending on the hardness of a sample. These approaches can reduce the average inference latency, but they cannot reduce the longest-path inference latency. In contrast, we present a novel approach to dynamic filter pruning that combines explainable AI with early coarse prediction in the intermediate layers of a CNN. The coarse prediction is made by a simple branch trained for top-k classification. Either the branch predicts the output class with high confidence, in which case the remaining computations are skipped, or it narrows the output down to a subset of candidate classes. In the latter case, only the filters that are important for this subset of classes are subsequently evaluated; the importance of each filter for each output class is obtained using explainable AI. With this concept of dynamic pruning, we reduce not only the average inference latency but also the longest-path inference latency. Our proposed architecture for dynamic pruning can be deployed on different hardware platforms.
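To make the control flow concrete, below is a minimal PyTorch sketch of the idea described in the abstract, not the authors' implementation. An early branch produces a coarse top-k prediction; the network either exits immediately on high confidence or restricts the remaining layer to filters deemed important for the candidate classes by precomputed per-class importance scores (as the abstract states, these would come from explainable AI). All names (EarlyBranch, dynamic_forward, class_filter_importance) and the thresholds are illustrative assumptions, and masking filter outputs to zero stands in for actually skipping their computation on real hardware.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyBranch(nn.Module):
    """Lightweight intermediate branch trained for top-k classification
    (hypothetical design; the abstract only says it is a simple branch)."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, x):
        return self.fc(self.pool(x).flatten(1))

def dynamic_forward(features, branch, later_conv, head,
                    class_filter_importance,   # (num_classes, out_filters) XAI scores
                    conf_threshold=0.9, k=5, filters_to_keep=32):
    """Coarse prediction first; then evaluate only class-relevant filters."""
    probs = F.softmax(branch(features), dim=1)
    conf, candidates = probs.topk(k, dim=1)                       # each (B, k)

    # Case 1: confident coarse prediction -> skip all remaining computation.
    if conf[:, 0].min() >= conf_threshold:
        return candidates[:, 0]

    # Case 2: keep only filters important for the candidate class subset.
    scores = class_filter_importance[candidates].amax(dim=1)     # (B, out_filters)
    keep = scores.topk(filters_to_keep, dim=1).indices           # (B, m)

    out = later_conv(features)                                   # (B, out_filters, H, W)
    mask = torch.zeros_like(out)
    mask.scatter_(1, keep[:, :, None, None].expand(-1, -1, *out.shape[2:]), 1.0)
    return head(out * mask).argmax(dim=1)  # masking stands in for skipped filters
```

A hypothetical usage, with random tensors in place of a trained network and real XAI importance scores:

```python
branch = EarlyBranch(in_channels=128, num_classes=10)
later_conv = nn.Conv2d(128, 256, 3, padding=1)
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(256, 10))
importance = torch.rand(10, 256)  # placeholder for per-class filter importance
pred = dynamic_forward(torch.randn(4, 128, 8, 8), branch, later_conv, head, importance)
```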
