Knowledge-based system for text classification using ID6NB algorithm

Authors:

Abstract

This paper presents a novel algorithm, ID6NB, that extends the decision tree induced by Quinlan’s non-incremental ID3 algorithm. The approach proposes solutions for a few exceptions that decision tree induction algorithms leave unhandled: (i) the situation in which majority voting makes an incorrect decision (generating two different types of rules for the same data), and (ii) during dimensionality reduction by decision tree induction, the choice of the appropriate attribute at a node where two or more attributes share the highest information gain. The majority-voting exception is handled with the help of the Naive Bayes algorithm, and novel solutions are given for dimensionality reduction. As a result, classification accuracy improves drastically. An extensive experimental evaluation on a number of real and synthetic databases shows that ID6NB is a state-of-the-art classification algorithm that outperforms other methods of decision tree learning.
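
The two situations named in the abstract can be illustrated with a small sketch. It is not the ID6NB algorithm: the abstract does not say how ID6NB breaks ties or how it combines Naive Bayes with the tree. The code below only shows, under assumptions, (a) how ID3-style information gain can tie across attributes and (b) how a categorical Naive Bayes classifier could replace a majority vote. All function names, the Laplace smoothing, and the toy data are hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """ID3-style information gain from splitting on one categorical attribute."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def tied_best_attributes(rows, labels, attrs, tol=1e-12):
    """Return every attribute whose gain equals the maximum (within tol).

    More than one returned attribute is the tie situation from the abstract;
    how ID6NB resolves it is not stated there, so no tie-break is applied.
    """
    gains = {a: information_gain(rows, labels, a) for a in attrs}
    best = max(gains.values())
    return [a for a in attrs if abs(gains[a] - best) <= tol]


def naive_bayes_predict(rows, labels, query, attrs):
    """Categorical Naive Bayes with Laplace smoothing (an assumed choice)."""
    class_counts = Counter(labels)
    scores = {}
    for cls, n_cls in class_counts.items():
        log_p = math.log(n_cls / len(labels))  # log prior
        for a in attrs:
            n_values = len({r[a] for r in rows})
            match = sum(1 for r, l in zip(rows, labels)
                        if l == cls and r[a] == query[a])
            log_p += math.log((match + 1) / (n_cls + n_values))
        scores[cls] = log_p
    return max(scores, key=scores.get)


# Toy data (hypothetical): attributes "a" and "b" are identical copies,
# so their information gains are exactly equal.
rows = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 0},
    {"a": 1, "b": 1, "c": 1},
]
labels = ["no", "no", "yes", "yes"]
attrs = ["a", "b", "c"]

print(tied_best_attributes(rows, labels, attrs))
print(naive_bayes_predict(rows, labels, {"a": 1, "b": 1, "c": 0}, attrs))
```

In this toy data, a plain ID3 implementation would pick whichever of `a` or `b` it happens to evaluate first; returning the full tied set makes the ambiguity explicit so that a separate rule can decide.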

Keywords: Data mining, Dimensionality reduction, Classification, Decision tree, Majority voting, Naive Bayes

Review history: Received 17 November 2007, Revised 12 April 2008, Accepted 21 April 2008, Available online 1 May 2008.

Official paper URL: https://doi.org/10.1016/j.knosys.2008.04.006