Dynamic category profiling for text filtering and classification

Abstract

Information is often represented in text form and classified into categories. Unfortunately, automatic classifiers often misclassify documents. One reason is that the documents used to train the classifiers come mainly from the categories themselves, which leads the classifiers to derive category profiles that distinguish each category from the others, rather than measuring the extent to which a document’s content overlaps that of a category. To tackle this problem, we present a technique, DP4FC, that selects suitable features to construct category profiles for distinguishing relevant documents from irrelevant ones. More specifically, DP4FC works in conjunction with various classifiers. Upon receiving a document, it helps the classifiers create dynamic category profiles with respect to that document and make proper filtering and classification decisions accordingly. Theoretical analysis and empirical results show that DP4FC can significantly improve the performance of different classifiers in various environments.

Keywords: Text filtering, Text classification, Dynamic profiling

Review history: Received 22 November 2005, Revised 24 February 2006, Accepted 28 February 2006, Available online 18 April 2006.

Paper URL: https://doi.org/10.1016/j.ipm.2006.02.008