The Cityscapes Dataset for Semantic Urban Scene Understanding
- 1 June 2016
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 3213-3223
- https://doi.org/10.1109/cvpr.2016.350
Abstract
Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes is comprised of a large, diverse set of stereo video sequences recorded in streets from 50 different cities. 5000 of these images have high quality pixel-level annotations; 20 000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.
This publication has 54 references indexed in Scilit:
- Guest Editorial: Scene Understanding. International Journal of Computer Vision, 2015
- Scene Parsing with Object Instance Inference Using Regions and Per-exemplar Detectors. International Journal of Computer Vision, 2014
- The Pascal Visual Object Classes Challenge: A Retrospective. International Journal of Computer Vision, 2014
- 3D Traffic Scene Understanding From Movable Platforms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013
- Vision meets robotics: The KITTI dataset. The International Journal of Robotics Research, 2013
- Selective Search for Object Recognition. International Journal of Computer Vision, 2013
- Superparsing. International Journal of Computer Vision, 2012
- Hough Regions for Joining Instance Localization and Segmentation. Lecture Notes in Computer Science, 2012
- Robust Object Detection with Interleaved Categorization and Segmentation. International Journal of Computer Vision, 2007
- LabelMe: A Database and Web-Based Tool for Image Annotation. International Journal of Computer Vision, 2007