Look and Think Twice: Capturing Top-Down Visual Attention with Feedback Convolutional Neural Networks

1 December 2015

conference paper
conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 2956-2964
https://doi.org/10.1109/iccv.2015.338

Abstract

While feedforward deep convolutional neural networks (CNNs) have been a great success in computer vision, it is important to note that the human visual cortex generally contains more feedback than feedforward connections. In this paper, we will briefly introduce the background of feedbacks in the human visual cortex, which motivates us to develop a computational feedback mechanism in deep neural networks. In addition to the feedforward inference in traditional neural networks, a feedback loop is introduced to infer the activation status of hidden layer neurons according to the "goal" of the network, e.g., high-level semantic labels. We analogize this mechanism as "Look and Think Twice." The feedback networks help better visualize and understand how deep neural networks work, and capture visual attention on expected objects, even in images with cluttered background and multiple objects. Experiments on ImageNet dataset demonstrate its effectiveness in solving tasks such as image classification and object localization.

Keywords

This publication has 19 references indexed in Scilit:

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Published by Institute of Electrical and Electronics Engineers (IEEE) ,2015
Caffe
Published by Association for Computing Machinery (ACM) ,2014
Visualizing and Understanding Convolutional Networks
Lecture Notes in Computer Science, 2014
Building high-level features using large scale unsupervised learning
Published by Institute of Electrical and Electronics Engineers (IEEE) ,2013
Selective Search for Object Recognition
International Journal of Computer Vision, 2013
Adaptive deconvolutional networks for mid and high level feature learning
Published by Institute of Electrical and Electronics Engineers (IEEE) ,2011
ImageNet: A large-scale hierarchical image database
Published by Institute of Electrical and Electronics Engineers (IEEE) ,2009
Hierarchical Bayesian inference in the visual cortex
Journal of the Optical Society of America A, 2003
Gradient-based learning applied to document recognition
Proceedings of the IEEE, 1998
Long Short-Term Memory
Neural Computation, 1997

Cited by 284 articles