Convolutional gated recurrent networks for video segmentation

1 September 2017

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 3090-3094
https://doi.org/10.1109/icip.2017.8296851

Abstract

Semantic segmentation has recently seen major progress, but most previous work has focused on improving single-image segmentation. In this paper, we introduce a novel approach that implicitly uses temporal data in videos for online segmentation. The design receives a sequence of consecutive video frames and outputs the segmentation of the last frame. Convolutional gated recurrent networks are used for the recurrent part to preserve spatial connectivity in the image. The architecture is tested on both binary and semantic video segmentation tasks. Experiments are conducted on the recent benchmarks SegTrack V2, DAVIS, CamVid, and SYNTHIA. Using recurrent fully convolutional networks (RFCNs) improved the baseline network performance in all of our experiments: F-measure improved by 5% and 3% on SegTrack V2 and DAVIS, respectively, and mean IoU improved by 5.7% and 1.6% on SYNTHIA and CamVid, respectively. Thus, RFCNs can be seen as a method to improve any baseline segmentation network by embedding it into a recurrent module that uses temporal data.
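
The recurrent idea in the abstract (a sequence of frames goes in, the segmentation of the last frame comes out, with gates computed by convolution so the spatial layout is kept) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-channel frames stored as nested lists, omits biases and the fully convolutional feature extractor that a real network would place before the recurrent unit, and uses one common GRU update convention. The names `conv2d_same`, `conv_gru_step`, `segment_last_frame`, and the kernel keys `Wz`, `Uz`, `Wr`, `Ur`, `Wh`, `Uh` are hypothetical.

```python
import math

def conv2d_same(x, k):
    """Single-channel cross-correlation with zero padding (output size = input size)."""
    h, w = len(x), len(x[0])
    r = len(k) // 2  # kernel radius; assumes a square, odd-sized kernel
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        s += x[ii][jj] * k[di + r][dj + r]
            out[i][j] = s
    return out

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def combine(f, *maps):
    """Apply f elementwise across equally sized 2D maps."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[f(*(m[i][j] for m in maps)) for j in range(w)] for i in range(h)]

def conv_gru_step(x, h, p):
    """One convolutional GRU update. p maps hypothetical names to 2D kernels."""
    gate = lambda a, b: sigmoid(a + b)
    z = combine(gate, conv2d_same(x, p["Wz"]), conv2d_same(h, p["Uz"]))  # update gate
    r = combine(gate, conv2d_same(x, p["Wr"]), conv2d_same(h, p["Ur"]))  # reset gate
    rh = combine(lambda a, b: a * b, r, h)
    cand = combine(lambda a, b: math.tanh(a + b),
                   conv2d_same(x, p["Wh"]), conv2d_same(rh, p["Uh"]))    # candidate state
    # Assumed convention: h_t = (1 - z) * h_{t-1} + z * candidate
    return combine(lambda zz, hh, cc: (1.0 - zz) * hh + zz * cc, z, h, cand)

def segment_last_frame(frames, p, threshold=0.0):
    """Run the cell over all frames, then threshold the final state into a binary mask."""
    h = [[0.0] * len(frames[0][0]) for _ in frames[0]]
    for x in frames:
        h = conv_gru_step(x, h, p)
    return [[1 if v > threshold else 0 for v in row] for row in h]

# Illustrative usage with hand-picked (untrained) kernels.
zero = [[0.0] * 3 for _ in range(3)]
ident = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
params = {"Wz": zero, "Uz": zero, "Wr": zero, "Ur": zero, "Wh": ident, "Uh": zero}
frame = [[1.0, -1.0], [-1.0, 1.0]]
mask = segment_last_frame([frame, frame], params)
```

Because every gate is a convolution rather than a dense matrix product, the hidden state stays a 2D map with the same height and width as the input, which is what lets the final state be read out per pixel. In practice the kernels would be learned and the state would have many channels.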

Cited by 38 articles