PINK PANTHER: A COMPLETE ENVIRONMENT FOR GROUND-TRUTHING AND BENCHMARKING DOCUMENT PAGE SEGMENTATION

Authors:

Abstract

We describe a new approach for the automatic evaluation of document page segmentation algorithms. Unlike techniques that rely on OCR output, our method is region-based: segmentation quality is assessed by comparing the segmentation output, described as a set of regions, to the corresponding ground-truth. Error maps are used to keep track of all the errors associated with each pixel, regardless of the document complexity. Misclassifications, splitting, and merging of regions are among the errors detected by the system. Each error can be weighted individually and the system can be customized to benchmark virtually any type of segmentation task.
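The region-based comparison described above can be illustrated with a minimal sketch. It is not the paper's implementation: the rectangle format `(x0, y0, x1, y1, label)`, the flag names, the `MISSED` category, and the functions `rasterize`, `error_map`, and `weighted_score` are all assumptions made for illustration. The idea is to keep a per-pixel bitmask of errors, so several errors can be recorded on the same pixel, and to apply user-chosen weights afterwards.

```python
from collections import defaultdict

# Per-pixel error flags, combined as a bitmask (illustrative names, not the paper's).
MISCLASSIFIED, SPLIT, MERGED, MISSED = 1, 2, 4, 8


def rasterize(regions, width, height):
    """Paint rectangles (x0, y0, x1, y1, label) onto a grid of region indices.

    Upper bounds are exclusive and -1 marks background. Regions within one
    set are assumed not to overlap each other.
    """
    grid = [[-1] * width for _ in range(height)]
    for idx, (x0, y0, x1, y1, _label) in enumerate(regions):
        for y in range(y0, y1):
            for x in range(x0, x1):
                grid[y][x] = idx
    return grid


def error_map(truth, output, width, height):
    """Return a grid holding, for every pixel, the bitmask of its errors."""
    gt = rasterize(truth, width, height)
    seg = rasterize(output, width, height)

    # First pass: record which regions overlap which.
    gt_to_seg = defaultdict(set)
    seg_to_gt = defaultdict(set)
    for y in range(height):
        for x in range(width):
            g, s = gt[y][x], seg[y][x]
            if g >= 0 and s >= 0:
                gt_to_seg[g].add(s)
                seg_to_gt[s].add(g)

    # Second pass: flag each pixel.
    errors = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            g, s = gt[y][x], seg[y][x]
            flags = 0
            if g >= 0 and s < 0:
                flags |= MISSED  # ground-truth pixel covered by no output region
            if g >= 0 and s >= 0 and truth[g][4] != output[s][4]:
                flags |= MISCLASSIFIED  # region labels disagree
            if g >= 0 and len(gt_to_seg[g]) > 1:
                flags |= SPLIT  # one true region covered by several output regions
            if s >= 0 and len(seg_to_gt[s]) > 1:
                flags |= MERGED  # one output region spans several true regions
            errors[y][x] = flags
    return errors


def weighted_score(errors, weights):
    """Sum the weight of every error flag set on every pixel."""
    total = 0.0
    for row in errors:
        for flags in row:
            for flag, weight in weights.items():
                if flags & flag:
                    total += weight
    return total
```

Calling `error_map(truth, output, width, height)` and then `weighted_score(errors, {MISCLASSIFIED: 1.0, SPLIT: 0.5})` yields a single penalty in which each error type counts as much as the chosen weight says. Changing the weights, or adding flags, is how such a sketch would be adapted to a different segmentation task.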

Keywords: Document, Page, Segmentation, Benchmarking, Ground-truth, OCR, Recognition

Article history: Received 17 October 1997, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/S0031-3203(97)00137-4