Page segmentation of Chinese newspapers

Abstract

This paper describes a new bottom-up method for page segmentation of Chinese document images. Because of certain special characteristics of Chinese newspaper documents, many traditional methods developed for English documents fail to segment them correctly. Based on the run-length smoothing algorithm and minimal spanning tree clustering, the proposed method resolves the segmentation problems that arise in Chinese documents but not in English ones.
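The abstract names two building blocks without giving details, so the following is only a minimal sketch of the generic techniques, not the authors' procedure. It shows horizontal run-length smoothing on a single binary pixel row, and clustering of points (for example, component centroids) by cutting long edges of a minimal spanning tree. The function names `rlsa_row`, `mst_edges`, and `mst_clusters`, the `threshold` and `cut` parameters, and the sample data are all hypothetical choices made for illustration.

```python
from math import hypot


def rlsa_row(row, threshold):
    """Horizontal run-length smoothing on one binary row (1 = ink, 0 = background).

    Background runs of length <= threshold that lie between two ink pixels
    are filled with ink. The threshold value is an assumption of this sketch.
    """
    out = list(row)
    last_fg = None  # index of the most recent ink pixel seen so far
    for i, px in enumerate(row):
        if px == 1:
            if last_fg is not None and 0 < i - last_fg - 1 <= threshold:
                for j in range(last_fg + 1, i):
                    out[j] = 1
            last_fg = i
    return out


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns the spanning-tree edges as (distance, i, j) tuples.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [(float("inf"), -1)] * n  # (cheapest distance to tree, tree neighbour)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k][0])
        in_tree[u] = True
        if best[u][1] != -1:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = hypot(points[u][0] - points[v][0], points[u][1] - points[v][1])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges


def mst_clusters(points, cut):
    """Group point indices by dropping spanning-tree edges longer than `cut`."""
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for d, i, j in mst_edges(points):
        if d <= cut:
            parent[find(i)] = find(j)

    groups = {}
    for k in range(len(points)):
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())
```

In this sketch, `rlsa_row([1, 0, 0, 1, 0, 0, 0, 0, 1], 2)` fills the two-pixel gap but leaves the four-pixel gap open, and `mst_clusters` separates two well-spaced groups of points when `cut` lies between the within-group and between-group distances. How the paper chooses its thresholds and what features it clusters are not stated in the abstract.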

Keywords: Document layout analysis, Page segmentation, Run-length smoothing, Minimal spanning tree

Review history: Received 31 October 2001, Accepted 31 October 2001, Available online 10 January 2002.

Official paper URL: https://doi.org/10.1016/S0031-3203(01)00248-5