Machine printed character segmentation: An overview

Authors:

Abstract

This paper is Part I of a two-part review series. It presents an overview of character segmentation techniques for machine-printed documents. To date, in most Optical Character Recognition (OCR) systems, whether commercial products or systems described in the published literature, recognition algorithms have been developed on isolated characters. Character segmentation is all too often ignored in the research community, yet broken and touching characters are responsible for the majority of errors in the automatic reading of both machine-printed and handwritten text. The review covers techniques for segmenting fixed-pitch (uniform) or proportional fonts and for handling broken and touching characters, as well as techniques based on text image features and techniques based on recognition results.

Keywords: Character recognition, Character segmentation, OCR, Touching characters, Broken characters

Article history: Received 2 September 1993, Revised 9 May 1994, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/0031-3203(94)00068-W