Single- and Cross-Modality Near Duplicate Image Pairs Detection via Spatial Transformer Comparing CNN
Open Access
- 2 January 2021
- Vol. 21 (1), 255
- https://doi.org/10.3390/s21010255
Abstract
Recently, both single-modality and cross-modality near-duplicate image detection tasks have received wide attention in the pattern recognition and computer vision communities. Existing deep neural network-based methods have achieved remarkable performance on these tasks. However, most of these methods focus on learning features of each image in the pair separately, and so make limited use of the information shared between the two near-duplicate images. In this paper, to make better use of the correlations between image pairs, we propose a spatial transformer comparing convolutional neural network (CNN) model to compare near-duplicate image pairs. Specifically, we first propose a comparing CNN framework, which is equipped with a cross-stream to fully learn the correlation information between image pairs, while still considering the features of each image. Furthermore, to deal with the local deformations caused by cropping, translation, scaling, and non-rigid transformations, we introduce a spatial transformer comparing CNN model by incorporating a spatial transformer module into the comparing CNN architecture. To demonstrate the effectiveness of the proposed method on both the single-modality and cross-modality (Optical-InfraRed) near-duplicate image pair detection tasks, we conduct extensive experiments on three popular benchmark datasets, namely CaliforniaND (ND means near duplicate), Mir-Flickr Near Duplicate, and TNO Multi-band Image Data Collection. The experimental results show that the proposed method achieves superior performance compared with many state-of-the-art methods on both tasks.
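The abstract names a spatial transformer module as the component that handles cropping, translation, and scaling. As a rough illustration of what such a module does to an image, the sketch below warps a small grayscale image with a 2x3 affine matrix and bilinear sampling, using only the Python standard library. It is not the authors' implementation: the function name `affine_grid_sample`, the list-of-rows image format, and the hand-supplied matrix `theta` are all assumptions made for this example. In a trained spatial transformer the matrix would typically be predicted by a small network rather than given by hand.

```python
import math


def affine_grid_sample(image, theta):
    """Warp a 2-D grayscale image (a list of rows) with a 2x3 affine matrix.

    Hypothetical helper for illustration. Coordinates are normalized to
    [-1, 1] on both axes; samples that fall outside the image read as 0.
    """
    h, w = len(image), len(image[0])

    def pixel(r, c):
        # Zero padding outside the image bounds.
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    out = []
    for i in range(h):
        # Normalized target coordinate of output row i.
        y_t = 2.0 * i / (h - 1) - 1.0 if h > 1 else 0.0
        row = []
        for j in range(w):
            x_t = 2.0 * j / (w - 1) - 1.0 if w > 1 else 0.0
            # Grid generator: map each target point to a source point.
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            # Convert the source point back to pixel coordinates.
            c = (x_s + 1.0) * (w - 1) / 2.0
            r = (y_s + 1.0) * (h - 1) / 2.0
            c0, r0 = math.floor(c), math.floor(r)
            dc, dr = c - c0, r - r0
            # Bilinear sampler: blend the four neighbouring pixels.
            value = (pixel(r0, c0) * (1 - dr) * (1 - dc)
                     + pixel(r0, c0 + 1) * (1 - dr) * dc
                     + pixel(r0 + 1, c0) * dr * (1 - dc)
                     + pixel(r0 + 1, c0 + 1) * dr * dc)
            row.append(value)
        out.append(row)
    return out


image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
shift_x = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]  # one pixel for a width of 3

unchanged = affine_grid_sample(image, identity)
shifted = affine_grid_sample(image, shift_x)
```

With the identity matrix the image comes back unchanged; with `shift_x` each output pixel reads from one column to its right, so a translated near-duplicate can be brought back into alignment before the two images are compared.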
Funding Information
- National Natural Science Foundation of China (U19B2037)