A Sparse Embedding and Least Variance Encoding Approach to Hashing

Top Cited Papers

25 June 2014

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Image Processing

Vol. 23 (9), 3737-3750
https://doi.org/10.1109/tip.2014.2332764

Abstract

Hashing is becoming increasingly important in large-scale image retrieval for fast approximate similarity search and efficient data storage. Many popular hashing methods aim to preserve the kNN graph of high dimensional data points in the low dimensional manifold space, which is, however, difficult to achieve when the number of samples is big. In this paper, we propose an effective and efficient hashing approach by sparsely embedding a sample in the training sample space and encoding the sparse embedding vector over a learned dictionary. To this end, we partition the sample space into clusters via a linear spectral clustering method, and then represent each sample as a sparse vector of normalized probabilities that it falls into its several closest clusters. This actually embeds each sample sparsely in the sample space. The sparse embedding vector is employed as the feature of each sample for hashing. We then propose a least variance encoding model, which learns a dictionary to encode the sparse embedding feature, and consequently binarize the coding coefficients as the hash codes. The dictionary and the binarization threshold are jointly optimized in our model. Experimental results on benchmark data sets demonstrated the effectiveness of the proposed approach in comparison with state-of-the-art methods.

Keywords

Funding Information

Hong Kong Polytechnic University Research Fund, Hong Kong (G-YK79)
Australia Research Council (ARC-FT130101530)
National Natural Science Foundation of China (61263035)
Guangxi 100 Plan

This publication has 35 references indexed in Scilit:

Sparse hashing for fast multimedia search
ACM Transactions on Information Systems, 2013
Self-taught dimensionality reduction on the high-dimensional small-sized data
Pattern Recognition, 2013
Exponential locality preserving projections for small sample size problem
Neurocomputing, 2011
Semantic hashing
International Journal of Approximate Reasoning, 2009
NUS-WIDE
Published by Association for Computing Machinery (ACM) ,2009
Image retrieval
ACM Computing Surveys, 2008
Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions
Communications of the ACM, 2008
Latent Semantic Indexing: A Probabilistic Analysis
Journal of Computer and System Sciences, 2000
Satisfying general proximity / similarity queries with metric trees
Information Processing Letters, 1991
Indexing by latent semantic analysis
Journal of the American Society for Information Science, 1990

Cited by 198 articles