Automatic script identification from document images using cluster-based templates
- 1 February 1997
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Pattern Analysis and Machine Intelligence
- Vol. 19 (2), 176-181
- https://doi.org/10.1109/34.574802
Abstract
We describe an automated script identification system for typeset document images. Templates for each script are created by clustering textual symbols from a training set. Symbols from new images are compared to the templates to find the best script. Our current system processes thirteen scripts with minimal preprocessing and high accuracy.
This publication has 7 references indexed in Scilit:
- Stress assignment in letter to sound rules for speech synthesis. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Automatic script identification from images using cluster-based templates. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Determination of the script and language content of document images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997
- Gauging Similarity with n-Grams: Language-Independent Categorization of Text. Science, 1995
- Text characterization by connected component transformations. Published by SPIE-Intl Soc Optical Eng, 1994
- Language determination. Published by Association for Computational Linguistics (ACL), 1994
- An integrated data flow visual language and software development environment. Journal of Visual Languages & Computing, 1991
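Illustration
The abstract outlines two steps: cluster training symbols into per-script templates, then compare symbols from a new image against those templates and pick the best-matching script. The sketch below is one possible reading of that outline, not the authors' implementation. It assumes each symbol has already been reduced to a fixed-length feature vector, and it swaps in a simple greedy distance-threshold clustering and a majority vote; the names build_templates, train, identify_script, and the radius parameter are all hypothetical, as is the toy data.

```python
from collections import Counter
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_templates(symbols, radius):
    """Greedy clustering (an assumption, not the paper's method).

    Each symbol joins the nearest existing cluster whose centroid lies
    within `radius`; otherwise it starts a new cluster. The cluster
    centroids are returned as the script's templates.
    """
    clusters = []  # each entry is [running_sum_vector, member_count]
    for vec in symbols:
        best, best_d = None, radius
        for cluster in clusters:
            centroid = [s / cluster[1] for s in cluster[0]]
            d = euclidean(vec, centroid)
            if d <= best_d:
                best, best_d = cluster, d
        if best is None:
            clusters.append([list(vec), 1])
        else:
            best[0] = [s + v for s, v in zip(best[0], vec)]
            best[1] += 1
    return [tuple(s / n for s in total) for total, n in clusters]


def train(training_set, radius=1.0):
    """Build one template list per script from labeled training symbols."""
    return {
        script: build_templates(symbols, radius)
        for script, symbols in training_set.items()
    }


def identify_script(symbols, templates):
    """Each symbol votes for the script owning its nearest template."""
    votes = Counter()
    for vec in symbols:
        nearest = min(
            templates,
            key=lambda s: min(euclidean(vec, t) for t in templates[s]),
        )
        votes[nearest] += 1
    return votes.most_common(1)[0][0]


# Toy 2-D feature vectors, invented for illustration only.
training = {
    "Latin": [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0)],
    "Cyrillic": [(10.0, 0.0), (10.2, 0.3), (0.0, 10.0)],
}
templates = train(training, radius=1.0)
print(identify_script([(0.1, 0.1), (4.8, 5.1), (9.0, 0.5)], templates))
```

A real system would extract features from connected components of the page image and would need a tie-breaking and confidence rule; both are omitted here because the abstract does not describe them.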