Suggestion Generation for Erroneous Words in Tamil Documents Using N-gram Technique
Published: 5 May 2022
Asian Journal of Research in Computer Science pp 45-54; https://doi.org/10.9734/ajrcos/2022/v13i330317
Abstract: A spell checker is a tool that finds and corrects erroneous words and grammatical mistakes in a text document. Spelling error detection and correction techniques are widely used by text editing systems, search engines, text-to-speech and speech-to-text conversion systems, machine translation systems and optical character recognition systems. Spell checkers for European languages and some Indic languages are well developed. For Tamil, however, the task has remained challenging, perhaps because Tamil is a morphologically rich and agglutinative language. Erroneous words and grammatical mistakes can occur in sentences for various reasons. Erroneous words can be classified into two categories, namely non-word errors and real-word errors. This work aims to correct non-word errors in Tamil documents by suggesting alternatives. The proposed approach uses letter-level and word-level n-grams, stemming and hash table techniques. Test results show that the system generates suggestions with 95% accuracy.
Keywords: text / Tamil Documents / correct / speech / languages / Erroneous words
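As an illustration of the letter-level n-gram and hash table techniques named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hypothetical lexicon, bigrams taken over Unicode code points (a real Tamil system would more likely segment by grapheme cluster), and ranking by the Dice coefficient; the function names `char_ngrams`, `build_index` and `suggest` are invented for this example. Stemming and word-level n-grams, which the paper also uses, are left out.

```python
from collections import defaultdict


def char_ngrams(word, n=2):
    """Return the set of letter-level n-grams of a word, with boundary markers."""
    padded = f"^{word}$"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_index(lexicon, n=2):
    """Hash table mapping each n-gram to the lexicon words that contain it."""
    index = defaultdict(set)
    for word in lexicon:
        for gram in char_ngrams(word, n):
            index[gram].add(word)
    return index


def suggest(word, lexicon, index, n=2, top_k=3):
    """Suggest lexicon words for a non-word error, ranked by n-gram overlap."""
    if word in lexicon:
        return []  # a known word is not a non-word error
    grams = char_ngrams(word, n)
    # Candidate words share at least one n-gram with the erroneous word.
    candidates = set()
    for gram in grams:
        candidates |= index.get(gram, set())

    def dice(candidate):
        cand_grams = char_ngrams(candidate, n)
        return 2 * len(grams & cand_grams) / (len(grams) + len(cand_grams))

    # Highest similarity first; ties are broken alphabetically.
    return sorted(candidates, key=lambda c: (-dice(c), c))[:top_k]


# Hypothetical four-word lexicon, for illustration only.
lexicon = {"தமிழ்", "அம்மா", "அப்பா", "பள்ளி"}
index = build_index(lexicon)
print(suggest("தமழ்", lexicon, index))  # erroneous form missing a vowel sign
```

The hash table keeps lookup cost proportional to the number of n-grams in the erroneous word rather than to the size of the lexicon, which is one plausible reason to pair the two techniques.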