A generic method for determining up/down orientation of text in roman and non-roman scripts

Abstract

This paper presents a method for determining the up/down orientation of text in a scanned document of unknown orientation, so that it can be appropriately rotated and processed by an optical character recognition (OCR) engine. The method analyzes the “open” portions of text blobs to determine the direction in which the open portions face. By determining the respective densities of blobs opening in a pair of opposite directions (e.g., right or left), the method can establish the direction in which the text as a whole is oriented. We first describe a method for determining the up/down orientation of roman text based on the asymmetry in the openness of most roman letters in the horizontal direction. For non-roman text such as Pashto and Hebrew, we provide a method that determines a direction that is the most asymmetric, and therefore the most useful for the determination of text orientation, given a training data set of documents of known orientation. This work can be adapted for use in automated mail processing or to determine the orientation of checks in automated teller machine envelopes, scanned or copied documents, documents sent via facsimile, and digital photographs that include text (e.g., road signs, business cards, driver's licenses), among other applications.
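The counting idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It rests on several assumptions: each blob is given as a small bounding-box grid of `#` (ink) and `.` (background); "openness" toward a side is approximated by the number of background cells seen from that side before the first ink cell on each scan line; and only two direction pairs (left/right, up/down) are considered. All function names (`open_area`, `opening_counts`, `most_asymmetric_pair`, `is_upside_down`) are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

Blob = Sequence[str]  # bounding-box rows; '#' is ink, '.' is background
AXES = (("left", "right"), ("up", "down"))  # assumed candidate direction pairs


def rotate_180(blob: Blob) -> List[str]:
    """Rotate a blob by 180 degrees (what an upside-down scan does)."""
    return [row[::-1] for row in reversed(blob)]


def open_area(blob: Blob, side: str) -> int:
    """Crude openness proxy: background cells seen from `side`
    before the first ink cell on each scan line."""
    if side in ("up", "down"):
        lines = ["".join(col) for col in zip(*blob)]  # scan columns
    else:
        lines = list(blob)  # scan rows
    if side in ("right", "down"):
        lines = [line[::-1] for line in lines]
    return sum(line.index("#") for line in lines if "#" in line)


def opening_counts(blobs: Iterable[Blob]) -> Counter:
    """For each direction pair, tally which side each blob opens toward.
    Blobs that are equally open on both sides of a pair are skipped."""
    counts: Counter = Counter()
    for blob in blobs:
        for a, b in AXES:
            area_a, area_b = open_area(blob, a), open_area(blob, b)
            if area_a != area_b:
                counts[a if area_a > area_b else b] += 1
    return counts


def most_asymmetric_pair(upright_blobs: Iterable[Blob]) -> Tuple[str, str]:
    """From upright training blobs, pick the direction pair with the
    largest count imbalance; return (dominant, opposite)."""
    counts = opening_counts(upright_blobs)
    a, b = max(AXES, key=lambda pair: abs(counts[pair[0]] - counts[pair[1]]))
    return (a, b) if counts[a] >= counts[b] else (b, a)


def is_upside_down(blobs: Iterable[Blob], dominant: str, opposite: str) -> bool:
    """A 180-degree rotation swaps the two sides of a pair, so text is
    judged upside down when the normally rarer side wins the vote."""
    counts = opening_counts(blobs)
    return counts[opposite] > counts[dominant]


# Toy example: a "C"-like blob, which is open toward the right.
C = [".###",
     "#...",
     "#...",
     ".###"]
print(most_asymmetric_pair([C, C]))                     # ('right', 'left')
print(is_upside_down([rotate_180(C)], "right", "left"))  # True
```

In this sketch the dominant direction is learned from upright samples rather than hard-coded, which mirrors the training-based selection the abstract describes for non-roman scripts. A real system would first extract blobs from the scanned page (for example with connected-component labeling) and would need a more robust openness measure than this row/column scan.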

Keywords: Document layout analysis, OCR, Page orientation, Multilingual analysis

Article history: Received 17 February 2004, Revised 13 December 2004, Accepted 13 December 2004, Available online 9 April 2005.

Official paper URL: https://doi.org/10.1016/j.patcog.2004.12.011