Handwritten document retrieval
- 1 January 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
This paper investigates the use of both typed and handwritten queries to retrieve handwritten documents. The recognition-based approach reported here is novel in that it expands documents in a fashion analogous to query expansion: individual documents are expanded using N-best lists, which embody additional statistical information from a hidden Markov model (HMM) based handwriting recognizer used to transcribe each of the handwritten documents. This additional information makes the retrieval methods robust to machine transcription errors, so that documents which would otherwise be unretrievable can be retrieved. Cross-writer experiments on a database of 10985 words in 108 documents from 108 writers, and within-writer experiments in a probabilistic framework on a database of 537724 words in 3342 documents from 43 writers, indicate that significant improvements in retrieval performance can be achieved. The second database is the largest database of on-line handwritten documents known to us.
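The document-expansion idea in the abstract can be illustrated with a minimal sketch. It assumes each handwritten word comes with an N-best list of (candidate, score) pairs, and it uses a simple score-weighted, idf-style ranking chosen only for illustration; the function names `expand_document` and `rank`, the scores, and the weighting scheme are all hypothetical and are not the paper's probabilistic framework.

```python
import math
from collections import defaultdict


def expand_document(nbest_lists):
    """Turn per-word N-best lists into soft term weights.

    nbest_lists: one list per handwritten word, each holding
    (candidate, score) pairs with non-negative scores (assumed format).
    """
    weights = defaultdict(float)
    for candidates in nbest_lists:
        total = sum(score for _, score in candidates)
        if total <= 0:
            continue  # skip words with no usable recognizer scores
        for word, score in candidates:
            # Each candidate contributes its share of the word's score mass.
            weights[word.lower()] += score / total
    return dict(weights)


def rank(query_terms, expanded_docs):
    """Rank documents by the sum of (soft term weight * idf-style factor)."""
    n_docs = len(expanded_docs)
    doc_freq = defaultdict(int)
    for weights in expanded_docs.values():
        for term in weights:
            doc_freq[term] += 1

    scores = {}
    for doc_id, weights in expanded_docs.items():
        score = 0.0
        for term in query_terms:
            term = term.lower()
            if term in weights:
                score += weights[term] * math.log(1 + n_docs / doc_freq[term])
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical recognizer output: the top choice for the first word of
# document "a" is the misrecognition "clay"; the intended word "day" is second.
docs = {
    "a": expand_document([[("clay", 0.6), ("day", 0.4)], [("trip", 1.0)]]),
    "b": expand_document([[("night", 0.9), ("might", 0.1)]]),
}
ranking = rank(["day"], docs)
```

With only the top-1 transcription, document "a" would contain "clay" and a query for "day" would miss it; keeping the second candidate lets the query still match, which is the kind of robustness to transcription errors the abstract describes.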