Form recognition using linear structure

Authors:

Abstract

This paper proposes a novel approach to form recognition based on a description of linear structure. The geometric layout of objects on a form, such as lines, text, and spacing, is converted into a linear string representation. A new, generic, quantised string format is proposed and tested on the classification of business forms. The results are very encouraging, and the technique can be used for a wide range of applications and extended to handle documents and shapes without obvious linear features. Representing documents as strings allows quick and robust measurement of similarity between two documents, and it makes a quantifiable tolerance of segmentation inconsistencies possible, which is an advantage of the system over others.
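
The idea of encoding a layout as a quantised string and comparing strings can be illustrated with a minimal sketch. The abstract does not specify the actual string format or similarity measure, so everything here is an assumption made for illustration: the symbol alphabet (`L`, `T`, `G`), the quantisation unit of 20 pixels, the helper names `quantise` and `similarity`, and the use of Levenshtein edit distance as the comparison.

```python
from typing import List, Tuple

# Hypothetical layout element: (kind, extent in pixels) along one scan direction.
Element = Tuple[str, int]

# Assumed alphabet, not taken from the paper: line, text, gap (spacing).
SYMBOLS = {"line": "L", "text": "T", "gap": "G"}


def quantise(elements: List[Element], unit: int = 20) -> str:
    """Encode elements as a string, repeating each symbol once per `unit` pixels."""
    parts = []
    for kind, extent in elements:
        count = max(1, round(extent / unit))  # every element emits at least one symbol
        parts.append(SYMBOLS[kind] * count)
    return "".join(parts)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalised similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# Two scans of the same hypothetical form with slightly different segmentation.
form_a = quantise([("line", 100), ("gap", 40), ("text", 60)])
form_b = quantise([("line", 95), ("gap", 45), ("text", 60)])
```

In this sketch the quantisation step absorbs small segmentation differences (both scans map to the same string), while larger differences show up as a bounded number of edit operations, which is one way a tolerance could be made quantifiable.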

Keywords: Document, Form, Linear, Quantization, Similarity, String

Review history: Received 5 March 1997, Revised 2 July 1998, Available online 7 June 2001.

Paper URL: https://doi.org/10.1016/S0031-3203(98)00106-X