The Cityscapes Dataset for Semantic Urban Scene Understanding

1 June 2016

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 3213-3223
https://doi.org/10.1109/cvpr.2016.350

Abstract

Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes is comprised of a large, diverse set of stereo video sequences recorded in streets from 50 different cities. 5000 of these images have high-quality pixel-level annotations; 20 000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.

Other Versions

Version 2, 2016-04-06, preprints

Cited by 6694 articles