The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes
- 1 June 2016
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 3234-3243
- https://doi.org/10.1109/cvpr.2016.352
Abstract
Vision-based semantic segmentation in urban scenarios is a key functionality for autonomous driving. Recent revolutionary results of deep convolutional neural networks (DCNNs) foreshadow the advent of reliable classifiers to perform such visual tasks. However, DCNNs require learning many parameters from raw images; thus, a sufficient number of diverse images with class annotations is needed. These annotations are obtained via cumbersome human labour, which is particularly challenging for semantic segmentation since pixel-level annotations are required. In this paper, we propose to use a virtual world to automatically generate realistic synthetic images with pixel-level annotations. Then, we address the question of how useful such data can be for semantic segmentation, in particular when using a DCNN paradigm. In order to answer this question, we have generated a synthetic collection of diverse urban images, named SYNTHIA, with automatically generated class annotations. We use SYNTHIA in combination with publicly available real-world urban images with manually provided annotations. Then, we conduct experiments with DCNNs that show how the inclusion of SYNTHIA in the training stage significantly improves performance on the semantic segmentation task.