Arabic character recognition system: A statistical approach for recognizing cursive typewritten text

Abstract

Character recognition systems can contribute tremendously to the advancement of the automation process and can improve the interface between man and machine (computers) in many applications, including office automation and data entry. In this report we present a recognition system for typed Arabic text that takes a statistical approach to character recognition. The approach uses “Accumulative Invariant Moments” as an identifier, which helped in the segmentation of connected and overlapping Arabic characters. However, Invariant Moments proved to be very sensitive to slight changes in a character's shape. These changes are normally due to the typing and scanning processes and cannot be avoided. The recognition zone was defined based on the mean and standard deviation of the moments of a large sample of each character. This zone was then enlarged, using an empirical multiplier, to improve the recognition rate. The system was implemented on a mainframe in the APL programming language for ease of experimentation, and then ported to a PC environment in BASIC for better portability. The recognition rate achieved was 94%, with a recognition speed of 10.6 characters/minute, running on a PC/AT with a math co-processor.
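The recognition-zone idea described in the abstract can be illustrated with a minimal sketch. It assumes each character is represented by a short vector of moment values and that the zone for each moment is the interval mean ± multiplier × standard deviation. The function names (`build_zone`, `in_zone`, `classify`), the sample values, and the multiplier values are hypothetical and are not taken from the paper, which was implemented in APL and BASIC rather than Python.

```python
from statistics import mean, stdev


def build_zone(samples, multiplier=1.0):
    """Build a recognition zone from training samples of one character.

    samples: list of feature vectors (assumed here to be moment values).
    multiplier: empirical factor that widens the zone (assumed value).
    Returns one (low, high) interval per feature.
    """
    zone = []
    for column in zip(*samples):  # iterate feature by feature
        m = mean(column)
        s = stdev(column)  # sample standard deviation
        zone.append((m - multiplier * s, m + multiplier * s))
    return zone


def in_zone(features, zone):
    """Return True if every feature falls inside its interval."""
    return all(low <= f <= high for f, (low, high) in zip(features, zone))


def classify(features, zones):
    """Return the labels of all characters whose zone contains the features."""
    return [label for label, zone in zones.items() if in_zone(features, zone)]


# Hypothetical training data: three samples of one character, two moments each.
training = [[1.0, 2.0], [1.2, 2.2], [0.8, 1.8]]
narrow = build_zone(training, multiplier=1.0)
wide = build_zone(training, multiplier=2.0)

# A slightly distorted sample falls outside the narrow zone
# but inside the widened one.
distorted = [1.3, 2.0]
print(in_zone(distorted, narrow), in_zone(distorted, wide))
```

Widening the zone accepts more distorted samples of the same character, at the cost of a greater chance that zones of different characters overlap; `classify` returns every matching label so that such overlaps remain visible.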

Keywords: Pattern recognition, Character recognition, Cursive Arabic text recognition, Statistical techniques, Segmentation, Invariant moments, Cutting models

Review history: Received 28 March 1989, Revised 6 April 1989, Available online 19 May 2003.

Official paper URL: https://doi.org/10.1016/0031-3203(90)90069-W