Abstract
The general problem of object recognition is difficult and often requires a large amount of computing resources, even to locate an object within a single image. How, then, can one build a tool for indexing into a large database of, say, thousands of images that works effectively in "interactive time" on affordable hardware? One important optimization is to take advantage of interaction with the user to find out what types of variation are expected in the database, and to rely on the user to discriminate between similar-looking objects. Another is to create appropriate data structures off-line to speed up on-line searches. We are building a tool, called FINDIT, for locating the image of an object within a large number of images of scenes that may contain the object. The user outlines, in an image, the object to be found in the database and specifies constraints on the transformations of the object that are expected to occur. The program acts as a filter, quickly reducing the set of candidate images to a number small enough for the user to peruse. FINDIT chooses an appropriate search algorithm depending on the constraints the user selects.
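The last two ideas, choosing a search procedure from the user's constraints and filtering the database down to a short list, can be illustrated with a minimal sketch. It is not FINDIT's implementation: the transformation names, the strategy labels, and the functions choose_search and filter_candidates are all hypothetical, and the scoring function stands in for whatever matching a real system would perform.

```python
from typing import Callable, FrozenSet, List

# Hypothetical transformation names; the abstract does not list the
# actual constraint types offered to the user.
TRANSLATION, ROTATION, SCALE = "translation", "rotation", "scale"

# A scorer maps an image id to a match score (higher means a better match).
Scorer = Callable[[str], float]


def choose_search(allowed: FrozenSet[str]) -> str:
    """Pick a (hypothetical) search strategy label from the allowed transformations."""
    if not allowed:
        return "fixed_pose"        # object expected at the same pose everywhere
    if allowed <= {TRANSLATION}:
        return "translation_only"  # position is the only unknown
    return "general"               # fallback when rotation or scale may vary


def filter_candidates(image_ids: List[str], score: Scorer, keep: int) -> List[str]:
    """Return the `keep` best-scoring image ids for the user to inspect."""
    ranked = sorted(image_ids, key=score, reverse=True)
    return ranked[:keep]


if __name__ == "__main__":
    # Made-up scores for three images, purely for illustration.
    scores = {"a": 0.2, "b": 0.9, "c": 0.5}
    print(choose_search(frozenset({TRANSLATION})))
    print(filter_candidates(list(scores), lambda i: scores[i], keep=2))
```

The point of the split is that the user's constraints are consulted once, before any image is examined, so a cheaper procedure can be used whenever fewer transformations are allowed; the filter then only ranks and truncates, leaving the final discrimination to the user.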