A New Arabic Printed Text Image Database and Evaluation Protocols

1 January 2009

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 946-950
https://doi.org/10.1109/icdar.2009.155

Abstract

We report on the creation of a database composed of images of printed Arabic words. The purpose of this database is the large-scale benchmarking of open-vocabulary, multi-font, multi-size and multi-style text recognition systems in Arabic. The challenges addressed by the database lie in the variability of the sizes, fonts and styles used to generate the images. Particular attention is given to low-resolution images, where anti-aliasing introduces noise on the characters to be recognized. The database is synthetically generated using a lexicon of 113,284 words, 10 Arabic fonts, 10 font sizes and 4 font styles. The database contains 45,313,600 single-word images totaling more than 250 million characters. Ground truth annotation is provided for each image. The database is called APTI, for Arabic Printed Text Images.
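
The image count in the abstract can be cross-checked against the generation parameters. The short sketch below assumes that every lexicon word is rendered exactly once for each combination of font, size and style; the abstract does not state this explicitly, but the product agrees with the reported total. The function and constant names are illustrative only and are not taken from the paper.

```python
# Generation parameters as quoted in the abstract.
LEXICON_WORDS = 113_284
NUM_FONTS = 10
NUM_SIZES = 10
NUM_STYLES = 4


def total_images(words: int, fonts: int, sizes: int, styles: int) -> int:
    """Image count, assuming one image per (word, font, size, style) combination."""
    return words * fonts * sizes * styles


images = total_images(LEXICON_WORDS, NUM_FONTS, NUM_SIZES, NUM_STYLES)
print(images)  # 45313600
```

Under that assumption, each word appears in 400 rendered variants (10 fonts, 10 sizes, 4 styles).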
