PolyNet: A Pursuit of Structural Diversity in Very Deep Networks
- 1 July 2017
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10636919, pp. 3900-3908
- https://doi.org/10.1109/cvpr.2017.415
Abstract
A number of studies have shown that increasing the depth or width of convolutional networks is a rewarding approach to improving the performance of image recognition. In our study, however, we observed difficulties along both directions. On one hand, the pursuit of very deep networks is met with diminishing returns and increased training difficulty; on the other hand, widening a network results in quadratic growth in both computational cost and memory demand. These difficulties motivate us to explore structural diversity in designing deep networks, a new dimension beyond just depth and width. Specifically, we present a new family of modules, namely the PolyInception, which can be flexibly inserted in isolation or in composition as replacements for different parts of a network. Choosing PolyInception modules with the guidance of architectural efficiency can improve the expressive power while preserving comparable computational cost. The Very Deep PolyNet, designed following this direction, demonstrates substantial improvements over the state of the art on the ILSVRC 2012 benchmark. Compared to Inception-ResNet-v2, it reduces the top-5 validation error on single crops from 4.9% to 4.25%, and that on multi-crops from 3.7% to 3.45%.
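The composition the abstract alludes to can be illustrated with a small sketch. A standard residual unit computes x + F(x); a second-order PolyInception instead computes x + F(x) + F(F(x)), reusing the same block F polynomially. The snippet below is a minimal, hedged illustration of that idea with a plain shape-preserving function standing in for an Inception block; it is not the authors' implementation.

```python
def residual(f, x):
    """Standard residual unit: x + F(x)."""
    return x + f(x)

def poly2(f, x):
    """Second-order polynomial composition, I + F + F^2, applied to x.
    `f` stands in for an Inception block (any shape-preserving map);
    this is an illustrative sketch, not the paper's actual code."""
    fx = f(x)               # first-order term F(x), computed once
    return x + fx + f(fx)   # identity + F(x) + F(F(x))

# Toy example: with F(x) = 2x, poly2 yields x + 2x + 4x = 7x.
print(residual(lambda v: 2 * v, 1))  # → 3
print(poly2(lambda v: 2 * v, 1))     # → 7
```

Note that poly2 adds a second-order term while reusing the same parameters of F, which is how such a module can raise expressive power without a proportional growth in parameter count.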