Classifying text streams by keywords using classifier ensemble
Authors:
Abstract
Traditional approaches to text stream classification usually require a number of documents to be labeled manually, which is an expensive and time-consuming process. To overcome this limitation, we propose classifying text streams by keywords, without labeled documents, so as to reduce the burden of manual labeling. We build our base text classifiers from keywords and unlabeled documents, and use classifier ensemble algorithms to cope with concept drift in text streams. Experimental results demonstrate that the proposed method can build good classifiers from keywords without manual labeling, and that the ensemble-based algorithm detects and adapts to concept drift in the streams well, performing better than the single-window algorithm.
Keywords: Text stream classification, Concept drift, Classifier ensemble, Knowledge acquisition
Review history: Received 5 December 2009, Revised 10 May 2011, Accepted 10 May 2011, Available online 1 June 2011.
Official paper URL: https://doi.org/10.1016/j.datak.2011.05.002
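
Illustrative sketch (not the authors' implementation): the abstract describes building base classifiers from keywords and unlabeled documents, then combining them in an ensemble to follow concept drift. The Python below shows one way such a pipeline might look. Everything in it is an assumption made for illustration: the names `pseudo_label`, `NaiveBayes`, and `KeywordEnsemble`, the choice of naive Bayes as the base learner, the chunk-by-chunk updating, and the rule of weighting older members by their agreement with keyword-derived labels on the newest chunk. The paper itself may differ on each of these points.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def pseudo_label(doc, keywords):
    """Return the class whose keywords occur most often in doc, or None if no keyword occurs."""
    tokens = Counter(tokenize(doc))
    hits = {c: sum(tokens[w] for w in words) for c, words in keywords.items()}
    best = max(hits, key=hits.get)  # ties go to the first class in dict order
    return best if hits[best] > 0 else None


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (assumed base learner)."""

    def __init__(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}

    def predict(self, doc):
        total = sum(self.class_counts.values())

        def score(c):
            counts = self.word_counts[c]
            denom = sum(counts.values()) + len(self.vocab)
            s = math.log(self.class_counts[c] / total)
            for w in tokenize(doc):
                if w in self.vocab:  # ignore words never seen in training
                    s += math.log((counts[w] + 1) / denom)
            return s

        return max(self.class_counts, key=score)


def accuracy(model, docs, labels):
    return sum(model.predict(d) == y for d, y in zip(docs, labels)) / len(docs)


class KeywordEnsemble:
    """Keeps the best `max_members` chunk classifiers and predicts by weighted vote."""

    def __init__(self, keywords, max_members=3):
        self.keywords = keywords
        self.max_members = max_members
        self.members = []  # list of (classifier, weight) pairs

    def update(self, chunk):
        # Label the new chunk with keywords only; documents with no keyword hit are skipped.
        labeled = [(d, pseudo_label(d, self.keywords)) for d in chunk]
        labeled = [(d, y) for d, y in labeled if y is not None]
        if not labeled:
            return
        docs, labels = zip(*labeled)
        # Re-weight older members by agreement with the newest pseudo-labels,
        # so classifiers trained on an outdated concept lose influence.
        rescored = [(m, accuracy(m, docs, labels)) for m, _ in self.members]
        # Simplification: the newest classifier gets weight 1.0 instead of a held-out estimate.
        rescored.append((NaiveBayes(docs, labels), 1.0))
        rescored.sort(key=lambda pair: pair[1], reverse=True)
        self.members = rescored[: self.max_members]

    def predict(self, doc):
        if not self.members:
            raise ValueError("ensemble has no members; call update() first")
        votes = Counter()
        for model, weight in self.members:
            votes[model.predict(doc)] += weight
        return votes.most_common(1)[0][0]
```

In this sketch, each incoming chunk of the stream is passed to `update`, and `predict` can then classify documents that contain none of the keywords, because the base classifiers generalize from the words that co-occur with the keywords. A single-window baseline, by contrast, would correspond to `max_members=1`.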