A large-scale hierarchical multi-view RGB-D object dataset
- 1 May 2011
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 1817-1824
- https://doi.org/10.1109/icra.2011.5980382
Abstract
Over the last decade, the availability of public image repositories and recognition benchmarks has enabled rapid progress in visual object category and instance detection. Today we are witnessing the birth of a new generation of sensing technologies capable of providing high quality synchronized videos of both color and depth, the RGB-D (Kinect-style) camera. With its advanced sensing capabilities and the potential for mass adoption, this technology represents an opportunity to dramatically increase robotic object recognition, manipulation, navigation, and interaction capabilities. In this paper, we introduce a large-scale, hierarchical multi-view object dataset collected using an RGB-D camera. The dataset contains 300 objects organized into 51 categories and has been made publicly available to the research community so as to enable rapid progress based on this promising technology. This paper describes the dataset collection procedure and introduces techniques for RGB-D based object recognition and detection, demonstrating that combining color and depth information substantially improves quality of results.