A FEATURE POINT CLUSTERING APPROACH TO THE RECOGNITION OF FORM DOCUMENTS

Authors:

Abstract

Forms are among the most important types of documents, and their automatic processing is essential to the advancement of office automation. In this paper, we present a clustering-based approach to recognizing form documents. In our approach, the characters embedded in a form document are first extracted by separating the characters and the structured line patterns into two distinct groups. Next, a clustering process is applied to the corner points of the remaining structured line patterns. Each form document is then represented as a weighted graph according to the clustering result, so that form recognition is formulated as a graph matching problem. Experiments on various kinds of forms demonstrate the feasibility of the proposed method.
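
The keywords name a maximin clustering algorithm, but the abstract does not give its details. As a rough illustration of how corner points might be grouped, the following is a minimal sketch of the textbook maximin-distance procedure, not the authors' implementation. The function name `maximin_cluster`, the stopping fraction `theta`, and the sample corner coordinates are all assumptions made for this example.

```python
import math


def maximin_cluster(points, theta=0.5):
    """Textbook maximin-distance clustering of 2-D points (illustrative sketch).

    theta is an assumed stopping fraction: a point becomes a new cluster
    center only if its distance to the nearest existing center exceeds
    theta times the mean distance between the existing centers.
    """
    if not points:
        return [], []
    centers = [points[0]]  # the first point seeds the first cluster
    while True:
        # Distance from every point to its nearest current center.
        nearest = [min(math.dist(p, c) for c in centers) for p in points]
        idx = max(range(len(points)), key=nearest.__getitem__)
        if len(centers) == 1:
            # With a single center, accept the farthest distinct point.
            threshold = 0.0
        else:
            pairs = [math.dist(a, b)
                     for i, a in enumerate(centers)
                     for b in centers[i + 1:]]
            threshold = theta * sum(pairs) / len(pairs)
        if nearest[idx] <= threshold:
            break  # no remaining point is far enough to start a new cluster
        centers.append(points[idx])
    # Assign each point to the index of its nearest center.
    labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
              for p in points]
    return centers, labels


# Hypothetical corner points forming three well-separated groups.
corners = [(0, 0), (1, 0), (0, 1), (100, 0), (101, 1), (100, 100), (99, 101)]
centers, labels = maximin_cluster(corners)
print(centers)
print(labels)
```

In a pipeline like the one the abstract outlines, the resulting cluster centers could serve as the nodes of the weighted graph; how the edge weights are defined is not stated in the abstract and is left open here.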

Keywords: Document analysis, Feature point clustering, Maximin clustering algorithm, Weighted graph matching

Article history: Received 14 October 1996, Revised 24 October 1997, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/S0031-3203(97)00162-3