Convolutional gated recurrent networks for video segmentation
- 1 September 2017
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 3090-3094
- https://doi.org/10.1109/icip.2017.8296851
Abstract
Semantic segmentation has recently witnessed major progress, but most previous work has focused on improving single-image segmentation. In this paper, we introduce a novel approach that implicitly utilizes temporal data in videos for online segmentation. The design receives a sequence of consecutive video frames and outputs the segmentation of the last frame. Convolutional gated recurrent networks are used for the recurrent part to preserve spatial connectivities in the image. The architecture is tested on both binary and semantic video segmentation tasks. Experiments are conducted on the recent benchmarks SegTrack V2, DAVIS, CamVid, and SYNTHIA. Using recurrent fully convolutional networks (RFCNs) improved the baseline network performance in all of our experiments: F-measure improved by 5% on SegTrack V2 and 3% on DAVIS, and mean IoU improved by 5.7% on SYNTHIA and 1.6% on CamVid. Thus, RFCNs can be seen as a method to improve any baseline segmentation network by embedding it into a recurrent module that utilizes temporal data.
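The recurrent step described in the abstract can be illustrated with a minimal sketch of a convolutional GRU. It is not the authors' implementation. It assumes a single channel, no bias terms, random 3x3 kernels, and one common form of the GRU update; it feeds raw frames straight into the recurrent unit, whereas the paper embeds a fully convolutional baseline network. All function and parameter names (`conv2d_same`, `conv_gru_step`, `segment_last_frame`, `Wz`, `Uz`, and so on) are hypothetical. The point is only that the gates are computed by convolutions rather than dense matrix products, so the hidden state remains a 2D map aligned with the image.

```python
import math
import random


def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (output keeps input size)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def combine(fn, *maps):
    """Apply fn element-wise across 2D maps of identical size."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[fn(*(m[i][j] for m in maps)) for j in range(w)] for i in range(h)]


def conv_gru_step(x, h_prev, p):
    """One convolutional GRU update (assumed form, biases omitted)."""
    # Update gate z and reset gate r, both computed with convolutions.
    z = combine(lambda a, b: sigmoid(a + b),
                conv2d_same(x, p["Wz"]), conv2d_same(h_prev, p["Uz"]))
    r = combine(lambda a, b: sigmoid(a + b),
                conv2d_same(x, p["Wr"]), conv2d_same(h_prev, p["Ur"]))
    # Candidate state uses the reset-gated previous state.
    gated = combine(lambda a, b: a * b, r, h_prev)
    cand = combine(lambda a, b: math.tanh(a + b),
                   conv2d_same(x, p["Wh"]), conv2d_same(gated, p["Uh"]))
    # New state: per-pixel blend of the previous state and the candidate.
    return combine(lambda zz, hh, cc: (1.0 - zz) * hh + zz * cc, z, h_prev, cand)


def run_sequence(frames, p):
    """Feed frames in temporal order and return the final hidden-state map."""
    state = [[0.0] * len(frames[0][0]) for _ in frames[0]]
    for frame in frames:
        state = conv_gru_step(frame, state, p)
    return state


def segment_last_frame(frames, p, threshold=0.0):
    """Binary mask for the last frame, obtained by thresholding the final state."""
    state = run_sequence(frames, p)
    return [[1 if v > threshold else 0 for v in row] for row in state]


def random_kernel(rng, size=3, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(size)] for _ in range(size)]


rng = random.Random(0)
params = {name: random_kernel(rng) for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
# Three synthetic 4x5 "frames" standing in for consecutive video frames.
frames = [[[rng.random() for _ in range(5)] for _ in range(4)] for _ in range(3)]
mask = segment_last_frame(frames, params)
```

With untrained random kernels the resulting mask is meaningless as a segmentation. The sketch shows only the data flow: a window of frames goes in, and a single mask with the spatial size of the last frame comes out.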