A novel information theoretic-interact algorithm (IT-IN) for feature selection using three machine learning algorithms

Abstract

Including irrelevant, redundant, and inconsistent features in a data-mining model leads to poor predictions and high computational overhead. This paper proposes a novel information theoretic-based interact (IT-IN) algorithm that accounts for the relevance, redundancy, and consistency of features. IT-IN is compared with four existing feature selection algorithms: Interact, FCBF, Relief, and CFS. To evaluate classification accuracy for IT-IN and the four other algorithms, Naïve Bayes, SVM, and ELM classifiers are applied to ten datasets from the UCI repository. IT-IN outperforms the four existing algorithms in terms of the number of selected features. A specially designed hash function speeds up IT-IN, so that it requires less computation time than the Interact algorithm. The results show that the proposed feature selection algorithm improves classification accuracy for the ELM, Naïve Bayes, and SVM classifiers, and that IT-IN performs best when paired with the ELM classifier.
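The abstract does not spell out how relevance and consistency are measured. As a rough illustration only, the sketch below uses two measures commonly associated with Interact-style feature selection: symmetrical uncertainty for relevance, and an inconsistency rate computed with a hash table (a Python dict keyed on feature-value tuples). The function names, the choice of measures, and the dict-based hashing are assumptions made for this example; they are not taken from the paper and do not reproduce its specially designed hash function.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def inconsistency_rate(rows, labels, feature_idx):
    """Fraction of instances that disagree with the majority class among
    instances sharing the same values on the features in feature_idx.

    Assumption: a plain dict stands in for the hash table; each distinct
    tuple of feature values is one bucket holding per-class counts.
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in feature_idx)
        groups[key][label] += 1
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


# Toy XOR data: the class depends on both features jointly.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]
su_first = symmetrical_uncertainty([r[0] for r in rows], labels)
rate_first_only = inconsistency_rate(rows, labels, [0])
rate_both = inconsistency_rate(rows, labels, [0, 1])
```

On this toy XOR data, the first feature alone looks irrelevant under symmetrical uncertainty, yet the inconsistency rate falls once both features are considered together. This is the kind of feature interaction that a relevance-only ranking can miss and that a consistency measure is meant to catch.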

Keywords: Feature selection, Correlation, Relevance, Redundancy, Consistency

Publication history: Available online 6 May 2010.

Official paper URL: https://doi.org/10.1016/j.eswa.2010.04.084