Image Retrieval Based on a Hybrid Model of Deep Convolutional Encoder

Abstract
To address the semantic gap in content-based image retrieval (CBIR), and inspired by the success of convolutional neural networks (CNNs) in image classification and detection, this paper proposes a simple and effective hybrid model combining a deep convolutional network with an autoencoder network. The model uses the CNN to extract high-level semantic features of the image, then uses a deep autoencoder to reduce the dimensionality of the extracted features, compressing them into a 128-bit vector representation. Approximate nearest neighbor (ANN) search is an effective strategy for large-scale image retrieval; this paper uses the Annoy algorithm to compute the similarity between the query image and the images in the index tree, and outputs results in descending order of similarity. Experimental results show that the proposed method outperforms several recent deep-network image retrieval algorithms on the CIFAR-10 and MNIST datasets. In TOP10 image retrieval, it achieves 100% precision on the MNIST dataset. On CIFAR, precision and recall on the CIFAR4 subset are as high as 99.9%, and on the CIFAR10 dataset they reach 97.2% and 98.1%, respectively. In addition, both the convolutional network's parameter count and the index size are reduced compared with previous models, so second-level real-time response can be achieved when searching collections of tens of thousands of images.
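The retrieval step described in the abstract (rank indexed images by similarity to the query's compressed feature vector, output in descending order) can be sketched as follows. This is an illustrative toy example, not the paper's actual code: the image identifiers, the 4-dimensional "codes" standing in for the 128-bit representations, and the use of brute-force cosine similarity (rather than the Annoy index tree) are all assumptions for demonstration.

```python
# Illustrative sketch of similarity-ranked retrieval (assumed toy data,
# brute-force cosine similarity in place of the paper's Annoy index).
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query, index, top_k=10):
    """Rank indexed feature vectors by similarity to the query,
    returning image ids in descending order of similarity."""
    scored = [(cosine_similarity(query, vec), img_id)
              for img_id, vec in index.items()]
    scored.sort(reverse=True)  # descending similarity
    return [img_id for _, img_id in scored[:top_k]]

# Toy 4-dimensional codes standing in for the compressed image features.
index = {
    "cat_01": [0.9, 0.1, 0.0, 0.2],
    "dog_07": [0.1, 0.8, 0.3, 0.0],
    "cat_02": [0.8, 0.2, 0.1, 0.1],
}
print(retrieve([1.0, 0.0, 0.0, 0.1], index, top_k=2))  # → ['cat_01', 'cat_02']
```

In a large-scale setting, the brute-force loop would be replaced by an Annoy index tree, which trades a small loss of exactness for sub-linear query time.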
