Ridgelet-DTW-based word spotting for Arabic historical document

Abstract
In this paper, we propose a word-spotting system for Arabic historical documents based on the Ridgelet transform and Dynamic Time Warping (DTW). First, preprocessing and segmentation are applied to all document pages to create a word-image dataset. Each word image is kept at its original size, and its Ridgelet descriptor is generated without applying the normalization criteria of the Radon transform by which rotation, translation, and scaling invariance is achieved. The DTW algorithm is then used to match corresponding projection-angle pairs from the Ridgelet descriptors, which avoids reducing the descriptor set to a single vector, a dimensionality reduction that causes a loss of useful information. Experiments were conducted on historical Arabic documents from the National Library. The results show the effectiveness of the proposed method.
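
The matching step can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes each word descriptor is a list of 1-D coefficient sequences, one per projection angle, and compares sequences at the same angle index with classic DTW. The absolute-difference local cost, the summation over angles, and the names `dtw_distance` and `descriptor_distance` are assumptions made for illustration; the abstract does not specify them.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences.

    Assumption: the local cost is the absolute difference between samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


def descriptor_distance(query_desc, candidate_desc):
    """Sum of per-angle DTW costs between two descriptors.

    Each descriptor is a list of sequences, one per projection angle.
    Sequences at the same angle may differ in length, so word images
    need not be resized. Summing over angles is an assumption.
    """
    if len(query_desc) != len(candidate_desc):
        raise ValueError("descriptors must use the same set of projection angles")
    return sum(dtw_distance(q, c) for q, c in zip(query_desc, candidate_desc))
```

Because every angle is compared separately, the descriptor is never flattened into a single vector. A query word would be ranked against the dataset by sorting candidates on `descriptor_distance`.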
