Skew detection and text line position determination in digitized documents

Authors:

Abstract

This paper proposes a computationally efficient procedure for skew detection and text line position determination in digitized documents, based on the cross-correlation between the pixels of vertical lines in a document. Determining the skew angle of a document is essential in optical character recognition systems. Because of the skew, each nominally horizontal text line intersects a predefined set of vertical lines at different vertical positions. Using only the pixels on these vertical lines, we construct a correlation matrix and evaluate the skew angle of the document with high accuracy. Using the same matrix, we also compute the positions of the text lines in the document. The proposed method is tested on a variety of mixed-type documents; it gives accurate results and requires only a short computation time. We illustrate the effectiveness of the algorithm with four characteristic examples.
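The idea of correlating pixels along vertical lines can be illustrated with a minimal sketch. This is not the paper's correlation-matrix procedure: it assumes a binary image stored as a list of rows (1 = ink), uses only two vertical lines, and takes the row shift that maximizes their cross-correlation as the skew estimate. All function names, the synthetic page, and the parameters `x1`, `x2`, and `max_shift` are hypothetical choices made for illustration.

```python
import math


def column_profile(image, x):
    """Return the pixels of the vertical line at column x, top to bottom."""
    return [row[x] for row in image]


def cross_correlation(a, b, shift):
    """Sum of a[y] * b[y + shift] over all rows where both indices are valid."""
    n = len(a)
    total = 0
    for y in range(n):
        j = y + shift
        if 0 <= j < n:
            total += a[y] * b[j]
    return total


def estimate_skew(image, x1, x2, max_shift):
    """Estimate skew from two vertical lines at columns x1 < x2.

    Returns (best_shift, angle_in_degrees). A positive shift means the text
    moves down (larger row index) as x increases.
    """
    a = column_profile(image, x1)
    b = column_profile(image, x2)
    best_shift = max(
        range(-max_shift, max_shift + 1),
        key=lambda s: cross_correlation(a, b, s),
    )
    angle = math.degrees(math.atan2(best_shift, x2 - x1))
    return best_shift, angle


def ink_runs(profile):
    """Return (first_row, last_row) for each run of ink pixels in a profile."""
    runs, start = [], None
    for y, value in enumerate(profile):
        if value and start is None:
            start = y
        elif not value and start is not None:
            runs.append((start, y - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


def make_skewed_page(height=120, width=200, period=15, thickness=4):
    """Synthetic page: solid 'text lines' that drop one row every 20 columns."""
    return [
        [1 if (y - x // 20) % period < thickness else 0 for x in range(width)]
        for y in range(height)
    ]


# Demo on the synthetic page: the two lines are 100 columns apart.
page = make_skewed_page()
shift, angle = estimate_skew(page, x1=40, x2=140, max_shift=7)
lines_at_x1 = ink_runs(column_profile(page, 40))
print(shift, round(angle, 2), lines_at_x1[:2])
```

On this synthetic page the text drops 5 rows over 100 columns, so the estimate corresponds to a slope of 0.05 (about 2.86 degrees). Because the synthetic lines repeat every 15 rows, `max_shift` is kept below half that period to avoid matching a line against its neighbour; a real document would need more vertical lines and a more robust peak search, which is what the correlation matrix described in the abstract is meant to provide.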

Keywords: Skew detection, Hough transform, Character recognition, Segmentation

Article history: Received 8 June 1995; revised 30 July 1996; available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/S0031-3203(96)00157-4