SPEEDING UP CHINESE CHARACTER RECOGNITION IN AN AUTOMATIC DOCUMENT READING SYSTEM

Authors:

Abstract

In this paper, we present two techniques for speeding up character recognition. Our character recognition system, which consists of a candidate-cluster selection module and a modified branch-and-bound detail-matching module, is implemented using two statistical features: crossing counts and contour-direction counts. In the training stage, we divide characters into clusters by using reference characters. To maintain a very high recognition rate, the candidate-cluster selection module selects the 60 clusters with the smallest distances from among 300 predefined clusters. To speed up recognition further, we use a modified branch-and-bound algorithm in the detail-matching module. In the automatic document reading system, characters and punctuation marks are first extracted from printed document images and sorted according to their positions and the document orientation. The system then recognizes all printed Chinese characters between pairs of punctuation marks, and the results are spoken aloud by a speech-synthesis system. The character recognition system and the text-to-speech synthesis system are integrated into a Windows-based document reading system, which provides a user-friendly environment.
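The two-stage search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the branch-and-bound algorithm is modified, so the detail-matching step here substitutes a simple partial-distance bound: a template is abandoned as soon as its accumulated distance reaches the best distance found so far. The function names, the squared Euclidean distance, and the data layout (one reference vector per cluster, plus a list of labeled templates per cluster) are all assumptions made for illustration.

```python
import heapq


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_candidate_clusters(feature, cluster_refs, k=60):
    """Return the indices of the k clusters whose reference vectors are nearest.

    cluster_refs is a hypothetical list with one reference vector per cluster
    (300 clusters in the paper; k=60 follows the abstract).
    """
    return heapq.nsmallest(
        k, range(len(cluster_refs)),
        key=lambda i: sq_dist(feature, cluster_refs[i]),
    )


def detail_match(feature, clusters, candidate_ids):
    """Find the nearest template among the candidate clusters.

    clusters[i] is a hypothetical list of (label, template_vector) pairs.
    The running best distance acts as a bound: accumulation for a template
    stops as soon as its partial distance reaches that bound.
    """
    best_label, best = None, float("inf")
    for cid in candidate_ids:
        for label, template in clusters[cid]:
            partial = 0.0
            for x, y in zip(feature, template):
                partial += (x - y) ** 2
                if partial >= best:
                    break  # cannot beat the current best; prune this template
            else:
                # Loop finished without pruning, so this template is the new best.
                best_label, best = label, partial
    return best_label, best
```

Visiting the candidate clusters in order of increasing distance, as `select_candidate_clusters` returns them, tends to establish a tight bound early, so most later templates are pruned after only a few feature components.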

Keywords: Crossing-count features, Contour-direction features, Candidate-cluster selection, Branch-and-bound method, Text-to-speech technique, Automatic document reading system

Review history: Received 16 April 1997; available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/S0031-3203(98)00043-0