Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views

1 December 2015

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 2686-2694
https://doi.org/10.1109/iccv.2015.308

Abstract

Object viewpoint estimation from 2D images is an essential task in computer vision. However, two issues hinder its progress: the scarcity of training data with viewpoint annotations and the lack of powerful features. Inspired by the growing availability of 3D models, we propose a framework that addresses both issues by combining render-based image synthesis with CNNs (Convolutional Neural Networks). We believe that 3D models have the potential to generate a large number of highly varied images, which a deep CNN with high learning capacity can exploit well. Toward this goal, we propose a scalable and overfit-resistant image synthesis pipeline, together with a novel CNN specifically tailored to the viewpoint estimation task. Experimentally, we show that viewpoint estimation with our pipeline significantly outperforms state-of-the-art methods on the PASCAL 3D+ benchmark.
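
As a rough illustration of why rendering sidesteps the annotation problem, the sketch below samples camera poses for a hypothetical renderer and turns each azimuth into a class label. Because the pose is chosen before rendering, every synthetic image comes with its viewpoint label for free. The angle ranges, the bin count, and all function names here are assumptions made for illustration; the abstract does not specify them, and this is not the authors' pipeline.

```python
import random


def sample_viewpoint(rng):
    """Draw one hypothetical camera pose, in degrees, for rendering a 3D model.

    The ranges below are illustrative assumptions, not values from the paper.
    """
    azimuth = rng.uniform(0.0, 360.0)
    elevation = rng.uniform(-10.0, 40.0)
    tilt = rng.uniform(-15.0, 15.0)  # in-plane rotation
    return azimuth, elevation, tilt


def azimuth_to_bin(azimuth, num_bins=360):
    """Map an azimuth angle to a class index so a classifier can predict it."""
    width = 360.0 / num_bins
    return int((azimuth % 360.0) // width) % num_bins


def angular_error(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        az, el, tilt = sample_viewpoint(rng)
        # A renderer (not shown) would produce an image for this pose;
        # the label is known without any manual annotation.
        print(f"azimuth bin {azimuth_to_bin(az)}, elevation {el:.1f}, tilt {tilt:.1f}")
```

The wrap-around in `angular_error` matters for evaluation: predictions of 350 and 10 degrees differ by 20 degrees, not 340.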

Cited by 455 articles