Scalable Similarity Search With Topology Preserving Hashing

21 May 2014

journal article
research article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Image Processing

Vol. 23 (7), 3025-3039
https://doi.org/10.1109/tip.2014.2326010

Abstract

Hashing-based similarity search techniques are becoming increasingly popular for large data sets. To capture meaningful neighbors, the topology of a data set should be exploited. This topology represents the neighborhood relationships between its subregions and the relative proximities between the neighbors of each subregion, e.g., the relative neighborhood ranking of each subregion. However, most existing hashing methods are developed to preserve neighborhood relationships while ignoring the relative neighborhood proximities. Moreover, most hashing methods fail to provide a good ranking of results, since many results often share the same Hamming distance to a query. In this paper, we propose a novel hashing method to solve these two issues jointly. The proposed method is referred to as topology preserving hashing (TPH). TPH differs from prior work in that it also preserves the neighborhood ranking. Based on this framework, we present three different TPH methods: linear unsupervised TPH, semisupervised TPH, and kernelized TPH. In particular, our unsupervised TPH is capable of mining semantic relationships between unlabeled data without supervised information. Extensive experiments on four large data sets demonstrate the superior performance of the proposed methods over several state-of-the-art unsupervised and semisupervised hashing techniques.
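The ranking problem mentioned in the abstract (many results sharing the same Hamming distance to a query) can be illustrated with a minimal sketch. It does not implement TPH. It uses plain random-hyperplane hashing as a stand-in, and all names and sizes (hash_code, dim, n_bits, n_points) are assumptions chosen only for illustration. With n_bits-bit codes there are at most n_bits + 1 distinct Hamming distances, so large result sets necessarily contain many ties.

```python
import random
from collections import Counter

def hash_code(x, planes):
    """One bit per random hyperplane (plain sign-of-projection hashing, not TPH)."""
    return tuple(1 if sum(w * v for w, v in zip(p, x)) >= 0 else 0 for p in planes)

def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(u != v for u, v in zip(a, b))

rng = random.Random(0)               # fixed seed so the run is repeatable
dim, n_bits, n_points = 16, 8, 1000  # hypothetical sizes for illustration

planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
points = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_points)]
query = [rng.gauss(0, 1) for _ in range(dim)]

codes = [hash_code(p, planes) for p in points]
q_code = hash_code(query, planes)

# Count how many database points fall at each Hamming distance from the query.
dist_counts = Counter(hamming(c, q_code) for c in codes)

for d in sorted(dist_counts):
    print(f"Hamming distance {d}: {dist_counts[d]} points")
```

Every point within one distance bucket is indistinguishable under Hamming ranking, which is the tie problem that a finer ranking over binary codes is meant to address.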

Funding Information

National High Technology Research and Development Program of China (2014AA015202)
National Natural Science Foundation of China (61303151, 61273247, 61271428)
National Key Technology Research and Development Program of China (2012BAH39B02)
National Science Foundation of China (61128007)
Army Research Office (W911NF-12-1-0057)
Faculty Research Awards through the NEC Laboratories America Inc., Princeton, NJ, USA
2012 UTSA START-R Research Award
National Nature Science Foundation of China.
