A hierarchical classification strategy for digital documents

Authors:

Abstract

The effective classification of image contents allows us to adopt strategies that can meet the increasing demand for quality, speed, and ease of use in imaging applications. We report here on our experience with CART classifiers for the classification of images indexed by low-level perceptual features such as color, texture, and shape. The problem addressed is the complex one of distinguishing among photographs, graphics, texts, and compound documents. To cope with the great variety of compound documents, we have designed a hierarchical classification strategy that first classifies images as compound or non-compound by verifying the homogeneity of their sub-images in terms of low-level features. Non-compound images are then classified as photographs, graphics, or texts. The results are reported and discussed.
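The two-stage strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the homogeneity test here is a simple variance threshold over sub-image feature vectors, and `toy_leaf_classifier` is a hand-written stand-in for a trained CART tree. The names `homogeneity_spread`, `classify_hierarchical`, `spread_threshold`, and the two toy features (colorfulness, edge density) are all assumptions made for illustration.

```python
from statistics import pvariance
from typing import Callable, List, Sequence

FeatureVector = Sequence[float]


def homogeneity_spread(sub_features: Sequence[FeatureVector]) -> float:
    """Mean per-feature population variance across the sub-images.

    A low value means the sub-images look alike in feature space.
    """
    n_features = len(sub_features[0])
    per_feature = [
        pvariance([vec[i] for vec in sub_features]) for i in range(n_features)
    ]
    return sum(per_feature) / n_features


def classify_hierarchical(
    sub_features: Sequence[FeatureVector],
    leaf_classifier: Callable[[List[float]], str],
    spread_threshold: float = 0.05,  # hypothetical value, not from the paper
) -> str:
    # Stage 1: heterogeneous sub-images suggest a compound document.
    if homogeneity_spread(sub_features) > spread_threshold:
        return "compound"
    # Stage 2: describe the whole image by the mean of its sub-image
    # features and hand it to the non-compound classifier.
    n = len(sub_features)
    n_features = len(sub_features[0])
    global_features = [
        sum(vec[i] for vec in sub_features) / n for i in range(n_features)
    ]
    return leaf_classifier(global_features)


def toy_leaf_classifier(features: List[float]) -> str:
    """Hypothetical stand-in for a trained CART tree (hand-written rules)."""
    colorfulness, edge_density = features
    if edge_density > 0.6:
        return "text"
    return "photograph" if colorfulness > 0.5 else "graphic"


if __name__ == "__main__":
    uniform_photo = [[0.8, 0.2]] * 4
    mixed_page = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1], [0.1, 0.9]]
    print(classify_hierarchical(uniform_photo, toy_leaf_classifier))
    print(classify_hierarchical(mixed_page, toy_leaf_classifier))
```

In a real system, both stages would presumably be learned from labeled data rather than hard-coded; the sketch only shows how the compound/non-compound decision gates the second classifier.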

Keywords: CART methodology, Compound documents, Graphics, Image classification, Low-level features, Photographs, Texts

Review history: Received 5 July 2001, Available online 12 April 2002.

Paper URL: https://doi.org/10.1016/S0031-3203(01)00168-6