Development of Vertical Text Interpreter for Natural Scene Images

Abstract
Automatic text recognition in natural scene images is essential for accessing information and understanding our surroundings. Scene text appears in several orientations: horizontal, arbitrarily oriented, curved, and vertically oriented. While horizontal, arbitrarily oriented, and curved text have received attention, little research has addressed vertically oriented scene text recognition. To this end, we propose Vertical Text Interpreter, an autonomous model for recognizing vertically oriented scene text. Vertical Text Interpreter detects and recognizes vertically oriented text in natural scenes, including vertically stacked text, bottom-to-top vertical text, and top-to-bottom vertical text. It consists of a shared convolutional neural network, a Vertical Text Spotter, and a Vertical Text Reader. As part of this research, we also developed the Vertically Oriented Scene Text 1250 (VOST-1250) dataset to address the need for a dataset covering this category of scene text. We evaluate Vertical Text Interpreter on benchmark datasets and on the VOST-1250 dataset. Results show that Vertical Text Interpreter can detect and recognize different types of vertically oriented scene text simultaneously. In future work, Vertical Text Interpreter could be explored in contexts such as reading assistance and visual navigation systems.
Funding Information
  • Swinburne Melbourne Sarawak Research Collaboration Scheme
