Fully Convolutional Networks for Semantic Segmentation

Open Access

24 May 2016

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Pattern Analysis and Machine Intelligence

Vol. 39 (4), 640-651
https://doi.org/10.1109/tpami.2016.2572683

Abstract

Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation. Our key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet, the VGG net, and GoogLeNet) into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. We then define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional networks achieve improved segmentation of PASCAL VOC (30% relative improvement to 67.2% mean IU on 2012), NYUDv2, SIFT Flow, and PASCAL-Context, while inference takes one tenth of a second for a typical image.
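
The "arbitrary size in, correspondingly-sized out" idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' code: it assumes a single hypothetical scoring unit with hand-picked 2 x 2 weights, and the names `dense_score` and `conv_score_map` are invented for illustration. It shows that a fixed-size fully connected unit, when slid over a larger input as a convolution, yields a map of scores instead of one score.

```python
# Toy sketch (pure Python): a fixed-size "fully connected" scorer reused as a
# convolution. All names and weights here are hypothetical illustrations.

def dense_score(patch, weights, bias):
    """Fully connected unit: dot product over a fixed k x k patch plus a bias."""
    return bias + sum(
        w * x
        for w_row, x_row in zip(weights, patch)
        for w, x in zip(w_row, x_row)
    )


def conv_score_map(image, weights, bias, stride=1):
    """Apply the same weights at every valid position of an input of any size.

    This is a 'valid' convolution in the cross-correlation sense commonly used
    by deep learning libraries (no kernel flipping, no padding).
    """
    k = len(weights)
    height, width = len(image), len(image[0])
    out = []
    for top in range(0, height - k + 1, stride):
        row = []
        for left in range(0, width - k + 1, stride):
            patch = [r[left:left + k] for r in image[top:top + k]]
            row.append(dense_score(patch, weights, bias))
        out.append(row)
    return out


weights = [[1.0, 0.0], [0.0, -1.0]]  # assumed toy weights
bias = 0.5

# Input exactly the size the dense unit expects: one score, as in classification.
small = [[1.0, 2.0], [3.0, 4.0]]
single = conv_score_map(small, weights, bias)

# A larger input: the same weights now produce a spatial map of scores.
large = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
]
score_map = conv_score_map(large, weights, bias)

print(single)                               # 1 x 1 output
print(len(score_map), len(score_map[0]))    # 3 x 2 output
```

In a real network the pooling and strided layers make the output coarser than the input, which is the gap the abstract's skip architecture and fine-tuning are meant to address; this toy omits all of that.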

Funding Information

DARPA's MSEE and SMISC
US National Science Foundation (IIS-1427425, IIS-1212798, IIS-1116411)
NSF GRFP
Toyota
Berkeley Vision and Learning Center
Nvidia


Cited by 5735 articles