Sparse hashing for fast multimedia search
Top Cited Papers
- 17 May 2013
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Information Systems
- Vol. 31 (2), 1-24
- https://doi.org/10.1145/2457465.2457469
Abstract
Hash-based methods achieve fast similarity search by representing high-dimensional data with compact binary codes. However, both generating binary codes and encoding unseen data effectively and efficiently remain very challenging tasks. In this article, we focus on these tasks to implement approximate similarity search by proposing a novel hash based method named sparse hashing (SH for short). To generate interpretable (or semantically meaningful) binary codes, the proposed SH first converts original data into low-dimensional data through a novel nonnegative sparse coding method. SH then converts the low-dimensional data into Hamming space (i.e., binary encoding low-dimensional data) by a new binarization rule. After this, training data are represented by generated binary codes. To efficiently and effectively encode unseen data, SH learns hash functions by taking a-priori knowledge into account, such as implicit group effect of the features in training data, and the correlations between original space and the learned Hamming space. SH is able to perform fast approximate similarity search by efficient bit XOR operations in the memory of a modern PC with short binary code representations. Experimental results show that the proposed SH significantly outperforms state-of-the-art techniques.Keywords
This publication has 30 references indexed in Scilit:
- Near-duplicate video retrievalACM Computing Surveys, 2013
- Semantic hashingInternational Journal of Approximate Reasoning, 2009
- Bounded coordinate system indexing for real-time video clip searchACM Transactions on Information Systems, 2009
- Near-optimal hashing algorithms for approximate nearest neighbor in high dimensionsCommunications of the ACM, 2008
- Regularization and Variable Selection Via the Elastic NetJournal of the Royal Statistical Society Series B: Statistical Methodology, 2005
- Least angle regressionThe Annals of Statistics, 2004
- Nonlinear Dimensionality Reduction by Locally Linear EmbeddingScience, 2000
- Predicting Multivariate Responses in Multiple Linear RegressionJournal of the Royal Statistical Society Series B: Statistical Methodology, 1997
- Emergence of simple-cell receptive field properties by learning a sparse code for natural imagesNature, 1996
- RELATIONS BETWEEN TWO SETS OF VARIATESBiometrika, 1936