Query by image example: the comparison algorithm for navigating digital image databases (CANDID) approach

Abstract
CANDID (comparison algorithm for navigating digital image databases) was developed to enable content-based retrieval of digital imagery from large databases using a query-by- example methodology. A user provides an example image to the system, and images in the database that are similar to that example are retrieved. The development of CANDID was inspired by the N-gram approach to document fingerprinting, where a `global signature' is computed for every document in a database and these signatures are compared to one another to determine the similarity between any two documents. CANDID computes a global signature for every image in a database, where the signature is derived from various image features such as localized texture, shape, or color information. A distance between probability density functions of feature vectors is then used to compare signatures. In this paper, we present CANDID and highlight two results from our current research: subtracting a `background' signature from every signature in a database in an attempt to improve system performance when using inner-product similarity measures, and visualizing the contribution of individual pixels in the matching process. These ideas are applicable to any histogram-based comparison technique.
Keywords