Arabic character recognition using Fourier descriptors and character contour encoding

Authors:

Highlights:

Abstract

Normalized Fourier descriptors are known to be invariant to scale, translation, and rotation. Researchers in Latin OCR have used this technique with acceptable results, and contour analysis has been used successfully in object recognition. Both techniques are adopted here, because the combination is needed to recognize Arabic characters at acceptable rates: several Arabic characters are very similar to one another. The character images are first smoothed by a statistically based algorithm to eliminate noise. Then the contours of the image (namely the primary part of the character, the dots, and the holes) are extracted. Fourier descriptors and curvature features are computed for the primary part of the character. The features computed from the training set serve as the model features. The features of an input character are compared to the model features using a distance measure, and the model with the minimum distance is taken as the class representing the character. The features of the dots and holes are then used to identify the particular character within that class. Experimental results show that combining the Fourier descriptors, the curvature features, and the features of the dots and holes is powerful in classifying Arabic characters. A recognition rate of 100% was achieved for the model classes. This rate dropped to 98% in the post-recognition phase, in which the specific character is identified. Most of these errors come from corrupted data.
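
The invariance property and the minimum-distance rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the normalization or the distance measure, so the sketch assumes a common magnitude-based normalization and Euclidean distance, and it omits the curvature features and the dot and hole features. The names `fourier_descriptors`, `classify`, and `ellipse`, and the parameter `m`, are hypothetical.

```python
import cmath
import math


def fourier_descriptors(contour, m=4):
    """Normalized Fourier descriptors of a closed contour.

    contour: ordered list of (x, y) boundary points, assumed to be traced
    counter-clockwise so that the first harmonic is non-zero.
    Assumed normalization (not taken from the paper): drop coefficient 0
    (translation), divide by |coefficient 1| (scale), and keep magnitudes
    only (rotation and starting point).
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)

    def coeff(k):
        # k-th coefficient of the discrete Fourier transform of the contour
        return sum(p * cmath.exp(-2j * math.pi * k * t / n)
                   for t, p in enumerate(z)) / n

    scale = abs(coeff(1))
    if scale == 0:
        raise ValueError("degenerate contour: first harmonic is zero")
    # Harmonics -m..m, skipping 0 and 1, which normalization makes constant
    return [abs(coeff(k)) / scale for k in range(-m, m + 1) if k not in (0, 1)]


def classify(features, models):
    """Return the model name with the minimum Euclidean distance (assumed measure)."""
    return min(models, key=lambda name: math.dist(features, models[name]))


def ellipse(a, b, n=64):
    """Hypothetical test shape: n points on an axis-aligned ellipse."""
    return [(a * math.cos(2 * math.pi * t / n),
             b * math.sin(2 * math.pi * t / n)) for t in range(n)]


# Toy "training set": one model feature vector per class
models = {
    "ellipse": fourier_descriptors(ellipse(4, 1)),
    "circle": fourier_descriptors(ellipse(2, 2)),
}
```

Under these assumptions, a scaled, rotated, and shifted copy of a shape yields the same descriptor vector as the original, so `classify` assigns it to the same model class. The pure-Python transform costs O(n²) per coefficient set; an FFT routine would be the usual choice for long contours.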

Keywords: Arabic character recognition, OCR, Fourier descriptors, Contour analysis, Curvature features, Direction features

Review history: Received 25 January 1993, Revised 24 November 1993, Accepted 2 December 1993, Available online 19 May 2003.

Official paper URL: https://doi.org/10.1016/0031-3203(94)90166-X