USB: Ultrashort Binary Descriptor for Fast Visual Matching and Retrieval
- 12 June 2014
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Image Processing
- Vol. 23 (8), 3671-3683
- https://doi.org/10.1109/tip.2014.2330794
Abstract
Many local descriptors have been proposed to tackle a basic issue in computer vision: duplicate visual content matching. These descriptors are either high-dimensional vectors that are relatively expensive to extract and compare, or binary codes with limited robustness. The bag-of-visual-words (BoWs) model compresses local features into a compact representation that allows fast matching and scalable indexing. However, codebook training, high-dimensional feature extraction, and quantization significantly degrade the flexibility and efficiency of the BoWs model. In this paper, we study an alternative to current local descriptors and the BoWs model by extracting an ultrashort binary descriptor (USB) and a compact auxiliary spatial feature from each keypoint detected in images. A typical USB is a 24-bit binary descriptor; hence it directly quantizes the visual clues of image keypoints to about 16 million unique IDs. USB allows fast image matching and indexing and avoids the expensive codebook training and feature quantization of the BoWs model. The spatial feature complementarily captures the spatial configuration in the neighboring region of each keypoint, and hence is used to filter mismatched USBs in a cascade verification. In the image matching task, USB shows promising accuracy and is nearly one order of magnitude faster than SIFT. We also test USB in retrieval tasks on UKbench, Oxford5K, and 1.2 million distractor images. Comparisons with recent retrieval methods demonstrate the competitive accuracy and memory consumption, and the significantly better efficiency, of our approach.
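The abstract's arithmetic can be checked directly: a 24-bit code takes one of 2^24 = 16,777,216 values, which matches the "about 16 million unique IDs" figure. The sketch below is illustrative only and is not the authors' implementation. It assumes the 24 bits are already available (how USB computes them is not described in this record), and the helper names `bits_to_id`, `hamming`, `build_index`, and `query` are hypothetical. It shows how a short binary code could serve directly as an inverted-index key with no codebook training or quantization step.

```python
from collections import defaultdict

BITS = 24                 # descriptor length stated in the abstract
NUM_IDS = 1 << BITS       # 2**24 = 16,777,216 possible IDs


def bits_to_id(bits):
    """Pack a sequence of 0/1 values (most significant bit first) into an integer ID."""
    if len(bits) != BITS:
        raise ValueError(f"expected {BITS} bits, got {len(bits)}")
    code = 0
    for b in bits:
        code = (code << 1) | (b & 1)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def build_index(images):
    """Build an inverted index: descriptor ID -> set of image IDs containing it.

    `images` maps an image ID to a list of integer descriptor IDs.
    The ID itself is the index key (assumption: exact-ID lookup).
    """
    index = defaultdict(set)
    for image_id, codes in images.items():
        for code in codes:
            index[code].add(image_id)
    return index


def query(index, codes):
    """Rank indexed images by how many distinct query IDs they share."""
    votes = defaultdict(int)
    for code in set(codes):
        for image_id in index.get(code, ()):
            votes[image_id] += 1
    # Highest vote count first; ties broken by image ID for determinism.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch, matching is exact lookup on the integer ID. The spatial verification step mentioned in the abstract is omitted because the record gives no detail on how it works.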
Funding Information
- National Basic Research Program of China (973 Program) (2012CB316400)
- Army Research Office (W911NF-12-1-0057)
- Faculty Research Awards through NEC Laboratories of America
- 2012 UTSA START-R Research Award
- National Science Foundation of China (61128007, 61025011, 61332016)
This publication has 28 references indexed in Scilit:
- Embedding Multi-Order Spatial Clues for Scalable Visual Matching and Retrieval. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2014
- Semantic-Aware Co-indexing for Image Retrieval. Published by Institute of Electrical and Electronics Engineers (IEEE), 2013
- Edge-SIFT: Discriminative Binary Descriptor for Scalable Partial-Duplicate Mobile Search. IEEE Transactions on Image Processing, 2013
- BRISK: Binary Robust invariant scalable keypoints. Published by Institute of Electrical and Electronics Engineers (IEEE), 2011
- Building contextual visual vocabulary for large-scale image applications. Published by Association for Computing Machinery (ACM), 2010
- Descriptive visual words and visual phrases for image applications. Published by Association for Computing Machinery (ACM), 2009
- Integrated feature selection and higher-order spatial feature extraction for object categorization. Published by Institute of Electrical and Electronics Engineers (IEEE), 2008
- Rapid object detection using a boosted cascade of simple features. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 2004
- Robust Real-Time Face Detection. International Journal of Computer Vision, 2004