California-ND: An annotated dataset for near-duplicate detection in personal photo collections

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE) in 2013 Fifth International Workshop on Quality of Multimedia Experience (QoMEX)

p. 142-147
https://doi.org/10.1109/qomex.2013.6603227

Abstract

Managing photo collections involves a variety of image quality assessment tasks, e.g. the selection of the “best” photos. Detecting near-duplicate images is a prerequisite for automating these tasks. This paper presents a new dataset that may assist researchers in testing algorithms for the detection of near-duplicates in personal photo libraries. The proposed dataset is derived directly from an actual personal travel photo collection. It contains many difficult cases and types of near-duplicates. More importantly, in order to deal with the inevitable ambiguity that the near-duplicate cases exhibit, the dataset is annotated by 10 different subjects. These annotations are combined into a non-binary ground truth, which indicates the probability that a pair of images may be considered a near-duplicate by an observer.
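
As an illustration of the non-binary ground truth described above, the following is a minimal sketch, not the authors' procedure. It assumes each subject's annotation can be reduced to a list of image pairs that the subject marked as near-duplicates, and that the probability for a pair is simply the fraction of subjects who marked it. The function name `near_duplicate_probabilities`, the input layout, and the image names are hypothetical; the abstract does not specify how the annotations are actually combined.

```python
from collections import Counter


def near_duplicate_probabilities(votes_per_subject):
    """Return {pair: fraction of subjects who marked the pair as near-duplicate}.

    votes_per_subject: one entry per subject, each an iterable of
    (image_a, image_b) pairs. This layout is an assumption for illustration.
    """
    n_subjects = len(votes_per_subject)
    if n_subjects == 0:
        raise ValueError("need at least one subject")

    counts = Counter()
    for marked_pairs in votes_per_subject:
        # Sort each pair so (a, b) and (b, a) count as the same pair, and use
        # a set so one subject cannot vote twice for the same pair.
        unique_pairs = {tuple(sorted(pair)) for pair in marked_pairs}
        counts.update(unique_pairs)

    # Pairs that no subject marked are absent, i.e. implicitly probability 0.
    return {pair: count / n_subjects for pair, count in counts.items()}


# Hypothetical toy input: three subjects and made-up image names.
votes = [
    [("img1", "img2"), ("img2", "img3")],
    [("img2", "img1")],
    [("img1", "img2"), ("img3", "img2")],
]
probs = near_duplicate_probabilities(votes)
```

With the dataset's 10 subjects, this averaging would yield values in steps of 0.1 per pair. An algorithm under test could then be scored against these graded values rather than against a single yes/no label.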
