Symbolic Compression and Processing of Document Images

Authors:

Abstract

In this paper, we describe a compression and representation scheme that exploits the component-level redundancy found within a document image. The approach identifies patterns that appear repeatedly, represents similar patterns with a single prototype, stores the locations of the pattern instances, and codes the residuals between the prototypes and the pattern instances. Using a novel encoding scheme, we provide a representation that facilitates scalable lossy compression and progressive transmission, and that supports document image analysis in the compressed domain. We motivate the approach, provide details of the encoding procedures, report compression results, and describe a class of document image understanding tasks that operate on the compressed representation.
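
The pipeline summarized in the abstract (group similar patterns under one prototype, record where each instance occurs, and code the residual against the prototype) can be illustrated with a minimal sketch. This is not the paper's encoding scheme: the function names `encode` and `decode`, the `max_mismatch` threshold, the first-fit matching on equal-size bitmaps, and the XOR residual are all assumptions chosen for illustration, and component extraction is taken as already done.

```python
import numpy as np

def encode(components, max_mismatch=1):
    """Hypothetical symbolic encoder.

    components: list of (row, col, bitmap), where bitmap is a 2-D uint8
    array of 0/1 pixels for one connected component (assumed given).
    Returns (prototypes, instances); each instance is
    (prototype_index, row, col, residual).
    """
    prototypes = []
    instances = []
    for row, col, bitmap in components:
        match = None
        # First-fit matching: same shape and few differing pixels (assumption).
        for idx, proto in enumerate(prototypes):
            if proto.shape == bitmap.shape and \
                    np.count_nonzero(proto != bitmap) <= max_mismatch:
                match = idx
                break
        if match is None:
            # No similar prototype yet: this pattern becomes a new one.
            prototypes.append(bitmap.copy())
            match = len(prototypes) - 1
        # Residual: the pixels where the instance differs from its prototype.
        residual = np.bitwise_xor(prototypes[match], bitmap)
        instances.append((match, row, col, residual))
    return prototypes, instances

def decode(prototypes, instances, shape, lossless=True):
    """Rebuild the page; skipping residuals gives a lossy approximation."""
    page = np.zeros(shape, dtype=np.uint8)
    for idx, row, col, residual in instances:
        glyph = prototypes[idx]
        if lossless:
            glyph = np.bitwise_xor(glyph, residual)
        h, w = glyph.shape
        page[row:row + h, col:col + w] |= glyph
    return page
```

In this sketch, decoding with `lossless=False` draws only the prototypes at the stored locations, which loosely mirrors the lossy and progressive modes the abstract mentions: residuals can be sent later or dropped. Because each instance carries a prototype index, simple queries such as counting how often a pattern occurs can be answered from `instances` without reconstructing the image.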

Keywords:

Review history: Received 3 January 1997, Accepted 15 December 1997, Available online 10 April 2002.

Official paper URL: https://doi.org/10.1006/cviu.1998.0682