USB: Ultrashort Binary Descriptor for Fast Visual Matching and Retrieval

12 June 2014

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Image Processing

Vol. 23 (8), 3671-3683
https://doi.org/10.1109/tip.2014.2330794

Abstract

Many local descriptors have been proposed to tackle a basic problem in computer vision: matching duplicate visual content. These descriptors are either high-dimensional vectors that are relatively expensive to extract and compare, or binary codes with limited robustness. The bag-of-visual-words (BoWs) model compresses local features into a compact representation that allows fast matching and scalable indexing. However, codebook training, high-dimensional feature extraction, and quantization significantly degrade the flexibility and efficiency of the BoWs model. In this paper, we study an alternative to current local descriptors and the BoWs model: extracting an ultrashort binary descriptor (USB) and a compact auxiliary spatial feature from each keypoint detected in an image. A typical USB is a 24-bit binary descriptor, so it directly quantizes the visual clues of image keypoints into about 16 million (2^24) unique IDs. USB allows fast image matching and indexing and avoids the expensive codebook training and feature quantization of the BoWs model. The spatial feature complements USB by capturing the spatial configuration in the neighborhood of each keypoint, and is used to filter mismatched USBs in a cascade verification. In the image matching task, USB shows promising accuracy and is nearly an order of magnitude faster than SIFT. We also test USB in retrieval tasks on UKbench, Oxford5K, and 1.2 million distractor images. Comparisons with recent retrieval methods show that our approach achieves competitive accuracy and memory consumption and significantly better efficiency.
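
The idea that a 24-bit code can act directly as a quantized ID, with no trained codebook, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe how the bits are extracted, so the bit values here are simply given as input, and the helper names (`pack_bits`, `hamming`, `build_index`, `vote`) and the image dictionary layout are assumptions made for illustration only.

```python
from collections import defaultdict

USB_BITS = 24  # descriptor length stated in the abstract


def pack_bits(bits):
    """Pack a sequence of 0/1 values (most significant bit first) into an integer ID."""
    if len(bits) != USB_BITS:
        raise ValueError(f"expected {USB_BITS} bits, got {len(bits)}")
    code = 0
    for b in bits:
        code = (code << 1) | (b & 1)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def build_index(images):
    """Map each descriptor ID to the set of image names that contain it.

    `images` is a hypothetical dict: image name -> list of integer codes.
    """
    index = defaultdict(set)
    for name, codes in images.items():
        for code in codes:
            index[code].add(name)
    return index


def vote(index, query_codes):
    """Count exact-ID hits per indexed image for the distinct codes of a query."""
    votes = defaultdict(int)
    for code in set(query_codes):
        for name in index.get(code, ()):
            votes[name] += 1
    return dict(votes)
```

Because each code is already an integer in the range 0 to 2^24 - 1, it can serve as the key of an inverted index, which is the role a visual-word ID plays in a BoWs pipeline. The cascade spatial verification mentioned in the abstract is not modeled in this sketch.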

Funding Information

National Basic Research Program of China (973 Program) (2012CB316400)
Army Research Office (W911NF-12-1-0057)
Faculty Research Awards through NEC Laboratories of America
2012 UTSA START-R Research Award
National Science Foundation of China (61128007, 61025011, 61332016)
