Increasing character recognition accuracy by detection and correction of erroneously identified characters

Authors:

Abstract

This paper presents a general approach to increasing character recognition accuracy by detecting and correcting erroneously identified characters. A high-accuracy two-stage recognition system for handwritten Chinese text with 5401 categories is demonstrated. In the first stage, two matching modules recognize each input character in parallel: the first uses a direction feature and the second uses a generalized feature. A character is rejected at the first stage when the two modules' matching results differ. Because the first stage recognizes most input characters correctly and outputs only a small number of candidates for each rejected character, a bigram Markov language model in the second stage can use contextual information to choose a candidate for each rejected character with a high recognition rate. Experiments are performed on sentences composed of characters extracted from the CCL/HCCR1 handwritten Chinese character database. In the first stage, the rejection rate for the input characters is 13% and the recognition rate for the accepted characters is 95.9%. In the second stage, the recognition rate for the characters rejected by the first stage is 91.2%. Thus, the overall recognition rate for the input handwritten text is 95.9% × 0.87 + 91.2% × 0.13 (= 95.2%).
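The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the way the two modules' candidates are pooled on rejection, the log-probability floor for unseen bigrams, and the toy bigram table are all assumptions made for illustration. The direction and generalized feature matchers are not reproduced; each module is represented only by its ranked candidate list.

```python
def first_stage(cands_a, cands_b):
    """Accept a character when both modules agree on the top candidate.

    On disagreement the character is rejected and the pooled candidates of
    both modules are returned (the pooling rule is an assumption).
    """
    if cands_a[0] == cands_b[0]:
        return [cands_a[0]]
    merged = []
    for c in cands_a + cands_b:
        if c not in merged:
            merged.append(c)
    return merged


def second_stage(lattice, bigram_logp, floor=-20.0):
    """Viterbi search over the candidate lattice under a bigram model.

    lattice: one candidate list per character position.
    bigram_logp: hypothetical table mapping (prev, cur) to a log-probability.
    floor: assumed log-probability for bigrams missing from the table.
    """
    # best[c] = (score of the best path ending in c, that path)
    best = {c: (0.0, [c]) for c in lattice[0]}
    for cands in lattice[1:]:
        nxt = {}
        for c in cands:
            score, path = max(
                ((s + bigram_logp.get((p, c), floor), pth)
                 for p, (s, pth) in best.items()),
                key=lambda t: t[0],
            )
            nxt[c] = (score, path + [c])
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]


def overall_rate(reject_rate, acc_accepted, acc_rejected):
    """Weighted overall recognition rate, as in the abstract's final sum."""
    return acc_accepted * (1 - reject_rate) + acc_rejected * reject_rate


# Toy example: the two modules disagree only on the middle character.
lattice = [
    first_stage(["我"], ["我"]),
    first_stage(["门", "们"], ["们", "门"]),
    first_stage(["好"], ["好"]),
]
toy_bigrams = {("我", "们"): -1.0, ("们", "好"): -2.0,
               ("我", "门"): -6.0, ("门", "好"): -5.0}
decoded = second_stage(lattice, toy_bigrams)
rate = overall_rate(0.13, 0.959, 0.912)
```

With the figures from the abstract, `overall_rate(0.13, 0.959, 0.912)` evaluates to roughly 0.9529, close to the 95.2% overall rate the abstract reports. Accepted characters enter the lattice as single-candidate positions, so the language model only has to choose among the few candidates at rejected positions.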

Keywords: Character recognition, Handwritten Chinese characters, Generalized feature, Hill-climbing method, Markov language model, Part-of-speech

Review history: Received 1 September 1993, Revised 3 March 1994, Accepted 5 April 1994, Available online 19 May 2003.

Official URL: https://doi.org/10.1016/0031-3203(94)90009-4