A robust system for document layout analysis using multilevel homogeneity structure

Authors:

Highlights:

• This paper presents a robust system for document layout analysis.

• The proposed system is based on multilevel homogeneity structure (MHS).

• The proposed system is designed to work with many different document languages.

• Our system is tested on four published datasets with different document languages.

• The proposed system (MHS) won the RDCL-2015 competition (ICDAR2015).

Keywords: Document layout analysis, Multilevel homogeneity structure (MHS), Page segmentation, Document image processing, OCR

Review history: Received 9 January 2017, Revised 10 May 2017, Accepted 11 May 2017, Available online 12 May 2017, Version of Record 18 May 2017.

Paper URL: https://doi.org/10.1016/j.eswa.2017.05.030