Search engine for handwritten documents

Abstract
This paper describes the search aspects of a system for analyzing handwritten documents. Documents are indexed using global image features, e.g., stroke width and slant, as well as local features that describe the shapes of words and characters. Image indexing is done automatically using page analysis, page segmentation, line separation, word segmentation, and recognition of words and characters. Two types of search are permitted: search based on global features of the entire document and search using features at the local level. For the second type, local search, all the words in the document are characterized and indexed by various features, and this index forms the basis of the different search techniques. The paper focuses on local search and describes four tasks: word/phrase spotting, text to image, image to text, and plain text. Performance in terms of precision/recall and word ranking is reported on a database of handwriting samples from about 1,000 individuals.
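
As a rough illustration of local search over an index of word images, the sketch below ranks indexed words by their distance to a query and scores the ranking with precision and recall. It is a minimal sketch under stated assumptions, not the system's actual method: the two-dimensional feature vectors, the Euclidean distance, the record layout, and the names `rank_words` and `precision_recall_at_k` are all hypothetical and chosen only for illustration.

```python
import math

def rank_words(index, query_features):
    """Sort indexed word images by Euclidean distance to the query features.

    Euclidean distance is an assumption made for this sketch; the abstract
    does not state which similarity measure the system uses.
    """
    return sorted(index, key=lambda e: math.dist(e["features"], query_features))

def precision_recall_at_k(ranked, target_label, k):
    """Precision and recall of the top-k ranked entries for one target word."""
    hits = sum(1 for e in ranked[:k] if e["label"] == target_label)
    relevant = sum(1 for e in ranked if e["label"] == target_label)
    precision = hits / k if k else 0.0
    recall = hits / relevant if relevant else 0.0
    return precision, recall

# Hypothetical toy index: one record per segmented word image.
index = [
    {"doc": "d1", "label": "been", "features": (0.9, 0.1)},
    {"doc": "d2", "label": "been", "features": (0.8, 0.2)},
    {"doc": "d1", "label": "from", "features": (0.1, 0.9)},
    {"doc": "d3", "label": "been", "features": (0.2, 0.8)},
]

# Spot the word "been" using a query feature vector, then score the top 2.
ranked = rank_words(index, (1.0, 0.0))
p, r = precision_recall_at_k(ranked, "been", 2)
print([e["doc"] for e in ranked[:2]], p, r)
```

In this toy index the two closest entries are both instances of the target word, so precision at 2 is perfect while recall is lower because a third instance, written with a different shape, ranks further down.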