Rule based contextual post-processing for Devanagari text recognition

Abstract

The spatial relationships among the constituent symbols of Devanagari script play an important role in the interpretation of Devanagari words. A number of constraints on these spatial relationships characterise the composition syntax of the script. When a word's composition is found to be syntactically incorrect, its symbols are substituted with counterparts that resemble them. The symbol substitution rules are mostly heuristic in nature. Human interpretation normally applies the script composition syntax rules and the symbol substitution rules in an interleaved fashion. Based on this observation, this paper presents the design of a post-processor that corrects the Devanagari symbol string. The composition syntax checker is represented as a finite state machine. The substitution rules take the form of condition-action pairs, which makes the system easy to alter. Each substitution rule has an associated penalty, and the accumulated penalty for a word gives a measure of its confidence level.
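
The interleaving of syntax checking and penalised substitution described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the symbol classes, the transition table, the two substitution rules, and their penalty values are all hypothetical, and symbols are represented by Latin token names rather than recognised glyphs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical symbol classes: C consonant, V independent vowel,
# M vowel sign (matra), H halant, A modifier such as anusvara.
SYMBOL_CLASS = {
    "ka": "C", "ma": "C",
    "aa": "V", "i": "V",
    "aa_sign": "M", "i_sign": "M",
    "halant": "H", "anusvara": "A",
}

# Toy composition syntax as a finite state machine (not the paper's grammar).
TRANSITIONS = {
    ("start", "C"): "cons", ("start", "V"): "vowel",
    ("cons", "C"): "cons", ("cons", "V"): "vowel", ("cons", "M"): "vowel",
    ("cons", "H"): "halant", ("cons", "A"): "mod",
    ("halant", "C"): "cons",
    ("vowel", "C"): "cons", ("vowel", "V"): "vowel", ("vowel", "A"): "mod",
    ("mod", "C"): "cons", ("mod", "V"): "vowel",
}
ACCEPTING = {"cons", "vowel", "mod", "halant"}


def first_error(symbols: List[str]) -> Optional[int]:
    """Return the index of the first symbol the FSM rejects, or None if valid."""
    state = "start"
    for i, sym in enumerate(symbols):
        state = TRANSITIONS.get((state, SYMBOL_CLASS.get(sym)))
        if state is None:
            return i
    return None if state in ACCEPTING else len(symbols)


@dataclass
class Rule:
    condition: Callable[[List[str], int], bool]    # when the rule may fire
    action: Callable[[List[str], int], List[str]]  # returns the corrected string
    penalty: int                                   # cost of applying the rule


MATRA_TO_VOWEL = {"aa_sign": "aa", "i_sign": "i"}


def substitute(symbols: List[str], i: int, new: str) -> List[str]:
    return symbols[:i] + [new] + symbols[i + 1:]


# Hypothetical condition-action pairs, tried in order at the error position.
RULES = [
    # A vowel sign at the start of a word becomes the independent vowel.
    Rule(lambda s, i: i == 0 and s[i] in MATRA_TO_VOWEL,
         lambda s, i: substitute(s, i, MATRA_TO_VOWEL[s[i]]), 2),
    # An anusvara directly after a halant becomes the consonant "ma".
    Rule(lambda s, i: i > 0 and s[i] == "anusvara" and s[i - 1] == "halant",
         lambda s, i: substitute(s, i, "ma"), 3),
]


def correct(symbols: List[str], max_steps: int = 10) -> Tuple[Optional[List[str]], int]:
    """Alternate syntax checking and substitution, accumulating penalties."""
    symbols, penalty = list(symbols), 0
    for _ in range(max_steps):
        i = first_error(symbols)
        if i is None:
            return symbols, penalty  # syntactically valid word
        rule = next((r for r in RULES
                     if i < len(symbols) and r.condition(symbols, i)), None)
        if rule is None:
            return None, penalty     # no rule applies: reject the word
        symbols = rule.action(symbols, i)
        penalty += rule.penalty
    return None, penalty
```

In this sketch, `correct(["aa_sign", "ma"])` returns `(["aa", "ma"], 2)`: the checker rejects the leading vowel sign, the first rule substitutes the independent vowel, and the accumulated penalty of 2 serves as an inverse confidence measure. Keeping the rules in a plain list mirrors the flexibility the abstract attributes to condition-action pairs, since a rule can be added or removed without touching the checker.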

Keywords: Character recognition, Devanagari script, Script composition syntax, Substitution rules, Contextual post-processing

Article history: Received 11 April 1986, Revised 20 October 1986, Available online 19 May 2003.

Official paper URL: https://doi.org/10.1016/0031-3203(87)90075-6