THE EFFECTS OF QUANTIZATION ON MULTI-LAYER FEEDFORWARD NEURAL NETWORKS

Abstract
In this paper we investigate the combined effect of quantization and clipping on multi-layer feedforward neural networks (MLFNN). Statistical models are used to analyze the effects of quantization in a digital implementation, and we characterize the performance degradation as a function of the number of fixed-point and floating-point quantization bits in the MLFNN. To analyze a true nonlinear neuron, we adopt uniform and normal probability distributions, compare training performance with and without weight clipping, and derive in detail the effect of quantization error on forward and backward propagation. Regardless of the distribution from which the initial weights are drawn, the weight distribution approaches a normal distribution during training with floating-point or high-precision fixed-point quantization; only when the number of quantization bits is very low does the weight distribution tend to cluster at ±1 during training with fixed-point quantization. Based on statistical models of on-chip and off-chip training, we establish and analyze, for a true nonlinear neuron, the relationships among input and output bit resolution, the number of network layers, and the resulting performance degradation. Experimental simulation results verify the presented theoretical analysis.
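To make the setting concrete, the following is a minimal illustrative sketch (not the authors' implementation) of uniform b-bit fixed-point weight quantization with clipping to a symmetric [-1, 1] range, the kind of operation whose statistical effect on training the abstract describes. The function name, the NumPy-based formulation, and the choice of a 1-sign-bit format are assumptions made here for illustration only.

```python
import numpy as np

def quantize_fixed_point(w, bits):
    """Illustrative sketch (assumed format): round weights to a signed
    fixed-point grid with `bits` total bits (1 sign bit, bits-1 fractional
    bits) and saturate (clip) to the symmetric range [-1, 1]."""
    step = 2.0 ** -(bits - 1)           # quantization step size
    w_clipped = np.clip(w, -1.0, 1.0)   # weight clipping to the representable range
    return np.round(w_clipped / step) * step

# Example: normally distributed weights quantized to 4 bits
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.5, size=5)
print(quantize_fixed_point(w, bits=4))
```

With very few bits, the coarse grid and the saturation at ±1 in this sketch mirror the clustering of weights toward ±1 that the abstract reports for low-precision fixed-point training.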