SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks

1 December 2015

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 262-270
https://doi.org/10.1109/iccv.2015.38

Abstract

Saliency in Context (SALICON) is an ongoing effort that aims at understanding and predicting visual attention. Conventional saliency models typically rely on low-level image statistics to predict human fixations. While these models perform significantly better than chance, there is still a large gap between model prediction and human behavior. This gap is largely due to the limited capability of models in predicting eye fixations with strong semantic content, the so-called semantic gap. This paper presents a focused study to narrow the semantic gap with an architecture based on deep neural networks (DNNs). It leverages the representational power of high-level semantics encoded in DNNs pretrained for object recognition. The two key components are fine-tuning the DNNs fully convolutionally with an objective function based on saliency evaluation metrics, and integrating information at different image scales. We compare our method with 14 saliency models on 6 public eye-tracking benchmark datasets. Results demonstrate that our DNNs can automatically learn features specifically for saliency prediction that surpass the state of the art by a large margin. In addition, our model ranks top to date under all seven metrics on the MIT300 challenge set.
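
The abstract mentions an objective function based on saliency evaluation metrics but does not say which metric is used. As a rough illustration only, the sketch below computes two metrics that are commonly used to evaluate saliency maps, KL divergence and normalized scanpath saliency (NSS), in plain NumPy. The function names, the `eps` smoothing constant, and the choice of these two metrics are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def kl_divergence(pred, target, eps=1e-8):
    """KL divergence D(target || pred), treating both maps as distributions.

    Hypothetical helper: `eps` is an assumed smoothing constant that
    avoids division by zero and log(0).
    """
    p = np.asarray(pred, dtype=np.float64)
    q = np.asarray(target, dtype=np.float64)
    # Normalize each map so that its values sum to 1.
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    return float(np.sum(q * np.log(eps + q / (p + eps))))


def nss(pred, fixations, eps=1e-8):
    """Normalized scanpath saliency: mean z-scored prediction at fixated pixels.

    `fixations` is a binary map with nonzero entries at fixation locations.
    """
    p = np.asarray(pred, dtype=np.float64)
    z = (p - p.mean()) / (p.std() + eps)
    return float(z[np.asarray(fixations, dtype=bool)].mean())


if __name__ == "__main__":
    # Toy 2x2 example: a predicted map and a ground-truth fixation density.
    pred = np.array([[0.1, 0.2], [0.3, 0.4]])
    target = np.array([[0.0, 0.1], [0.2, 0.7]])
    fix = np.array([[0, 0], [0, 1]])
    print("KL:", kl_divergence(pred, target))
    print("NSS:", nss(pred, fix))
```

Lower KL divergence and higher NSS both indicate a closer match to human fixations. Using such a quantity as a training loss would additionally require a differentiable implementation in a deep learning framework, which this sketch does not attempt.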

Cited by 404 articles