AN EFFICIENT ALGORITHM FOR FORM STRUCTURE EXTRACTION USING STRIP PROJECTION

Authors:

Abstract

This paper presents an efficient strip-projection method for extracting form structures from form documents in office automation. Locating the data in a form requires first extracting and interpreting the form structure. We segment the input form image into uniform vertical and horizontal strips. Since most form lines are vertical or horizontal, we project the image of each vertical strip horizontally and that of each horizontal strip vertically. The peak positions in these projection profiles indicate possible locations of lines in the form image. We then extract the lines from the source image, starting at these candidate positions. After all lines have been extracted, redundant lines are removed by a line-verification algorithm and broken lines are linked by a line-merging algorithm. Experimental results show that the proposed method extracts form structures from A4-sized documents in about 3 seconds, which is efficient compared with methods based on the Hough transform and run-based line-detection algorithms.
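The strip-projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the binary list-of-lists image format (1 = black pixel), and the `min_fill` threshold used to decide what counts as a peak are all assumptions made for illustration. The abstract does not say how peaks are selected, and the later line-extraction, verification, and merging stages are omitted.

```python
def vertical_strip_peaks(image, strip_width, min_fill=0.8):
    """Candidate horizontal-line rows, found per vertical strip.

    image is a list of rows of 0/1 pixels (assumed format).
    Returns (strip_index, row) pairs. min_fill is a hypothetical
    threshold: the fraction of the strip width that must be black.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    candidates = []
    for strip_index, x0 in enumerate(range(0, width, strip_width)):
        x1 = min(x0 + strip_width, width)
        for y in range(height):
            # Horizontal projection: black-pixel count of this row
            # restricted to the current vertical strip.
            count = sum(image[y][x0:x1])
            if count >= min_fill * (x1 - x0):
                candidates.append((strip_index, y))
    return candidates


def horizontal_strip_peaks(image, strip_height, min_fill=0.8):
    """Candidate vertical-line columns, found per horizontal strip.

    Returns (strip_index, column) pairs.
    """
    # Transposing swaps rows and columns, so the routine above
    # performs the vertical projection of each horizontal strip.
    transposed = [list(column) for column in zip(*image)]
    return vertical_strip_peaks(transposed, strip_height, min_fill)


# Tiny synthetic form: one horizontal line (row 2), one vertical line (column 5).
image = [[0] * 8 for _ in range(6)]
for x in range(8):
    image[2][x] = 1
for y in range(6):
    image[y][5] = 1

horizontal_candidates = vertical_strip_peaks(image, strip_width=4)
vertical_candidates = horizontal_strip_peaks(image, strip_height=3)
```

Projecting within narrow strips rather than across the whole page keeps the peak of a line distinct even when the line is slightly skewed or spans only part of the page, which is presumably why the candidates are reported per strip before lines are traced in the source image.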

Keywords: Form-document processing, Strip projection, Line-detection and verification, Field extraction

Review history: Received 13 November 1997, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/S0031-3203(97)00156-8