On Combining Multiple Segmentations in Scene Text Recognition

Abstract
An end-to-end, real-time method for scene text localization and recognition is presented. Its three main novel features are: (i) multiple segmentations of each character are kept until the final processing stage, when the context of each character within its text line is known; (ii) an efficient algorithm selects the character segmentations that minimize a global criterion; and (iii) it is shown that, although the underlying methods are theoretically scale-invariant, operating on a coarse Gaussian scale-space pyramid improves results because many typographical artifacts are eliminated. The method runs in real time and achieves state-of-the-art text localization results on the ICDAR 2011 Robust Reading dataset. Results are also reported for end-to-end text recognition on the ICDAR 2011 dataset.
