Learning descriptors for object recognition and 3D pose estimation

1 June 2015

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

No. 10636919, p. 3109-3118
https://doi.org/10.1109/cvpr.2015.7298930

Abstract

Detecting poorly textured objects and estimating their 3D pose reliably is still a very challenging problem. We introduce a simple but powerful approach to computing descriptors for object views that efficiently capture both the object identity and 3D pose. By contrast with previous manifold-based approaches, we can rely on the Euclidean distance to evaluate the similarity between descriptors, and therefore use scalable Nearest Neighbor search methods to efficiently handle a large number of objects under a large range of poses. To achieve this, we train a Convolutional Neural Network to compute these descriptors by enforcing simple similarity and dissimilarity constraints between the descriptors. We show that our constraints nicely untangle the images from different objects and different views into clusters that are not only well-separated but also structured as the corresponding sets of poses: The Euclidean distance between descriptors is large when the descriptors are from different objects, and directly related to the distance between the poses when the descriptors are from the same object. These important properties allow us to outperform state-of-the-art object views representations on challenging RGB and RGB-D data.
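The two ideas in the abstract, training with similarity and dissimilarity constraints and matching by Euclidean nearest neighbor, can be illustrated with a minimal sketch. The abstract does not give the exact loss, so the margin-based triplet hinge below is a generic stand-in and not necessarily the paper's formulation. The function names, the toy 2D descriptors, and the object and pose labels are all hypothetical; in the paper the descriptors come from a trained Convolutional Neural Network, and a scalable search method would replace the brute-force scan.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_hinge(anchor, similar, dissimilar, margin=1.0):
    """Generic triplet hinge loss (an assumption, not the paper's exact loss).

    The loss is zero once the dissimilar descriptor is farther from the
    anchor than the similar one by at least `margin`.
    """
    return max(0.0, margin + euclidean(anchor, similar) - euclidean(anchor, dissimilar))


def nearest_neighbor(query, database):
    """Brute-force nearest neighbor over (descriptor, object_id, pose) entries."""
    return min(database, key=lambda entry: euclidean(query, entry[0]))


# Toy database: hypothetical 2D descriptors with object and pose labels.
# Views of the same object sit close together, ordered by pose;
# a different object sits far away.
database = [
    ([0.0, 0.0], "cup", "0 deg"),
    ([0.5, 0.0], "cup", "30 deg"),
    ([5.0, 5.0], "duck", "0 deg"),
]

# A query descriptor returns both the identity and the pose of its neighbor.
query = [0.4, 0.1]
_, obj, pose = nearest_neighbor(query, database)
print(obj, pose)

# A triplet that already satisfies the constraint contributes no loss;
# swapping the similar and dissimilar roles produces a positive loss.
satisfied = triplet_hinge([0.0, 0.0], [0.5, 0.0], [5.0, 5.0])
violated = triplet_hinge([0.0, 0.0], [5.0, 5.0], [0.5, 0.0])
print(satisfied, violated)
```

Because similarity is measured with a plain Euclidean distance, a single lookup yields the object identity and an approximate pose at once, which is the property the abstract relies on to scale to many objects and poses.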

Other Versions

Version 2, 2015-02-20, preprints

This publication has 27 references indexed in Scilit.

Cited by 266 articles