Virtual Worlds as Proxy for Multi-object Tracking Analysis
- 1 June 2016
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 4340-4349
- https://doi.org/10.1109/cvpr.2016.470
Abstract
Modern computer vision algorithms typically require expensive data acquisition and accurate manual labeling. In this work, we instead leverage the recent progress in computer graphics to generate fully labeled, dynamic, and photo-realistic proxy virtual worlds. We propose an efficient real-to-virtual world cloning method, and validate our approach by building and publicly releasing a new video dataset, called "Virtual KITTI", automatically labeled with accurate ground truth for object detection, tracking, scene and instance segmentation, depth, and optical flow. We provide quantitative experimental evidence suggesting that (i) modern deep learning algorithms pre-trained on real data behave similarly in real and virtual worlds, and (ii) pre-training on virtual data improves performance. As the gap between real and virtual worlds is small, virtual worlds enable measuring the impact of various weather and imaging conditions on recognition performance, all other things being equal. We show these factors may affect drastically otherwise high-performing deep models for tracking.