Influence of text line segmentation in Handwritten Text Recognition
- 1 August 2015
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2015 13th International Conference on Document Analysis and Recognition (ICDAR)
Abstract
Text line segmentation is the process by which text lines in a document image are localized and extracted. It is an important step in off-line Handwritten Text Recognition (HTR), given that the input of these systems is the line image of the text to be transcribed. A myriad of solutions to the text line segmentation problem have been proposed in the literature. Although these solutions may differ greatly in what is actually applied to perform the segmentation, they can be classified by the level of precision and detail in the final extracted lines. In this paper we study the influence and real needs of different levels of precision and detail in the segmentation solutions in a real HTR task. We test three techniques of text line segmentation whose outputs range from a simple rectangle for each line to a perfectly fitted polygon surrounding the detected lines. Experiments have been carried out with a historical collection, and results show that good HTR accuracy can be obtained with simple extraction algorithms.