Feature subset selection combining maximal information entropy and maximal information coefficient

作者:Kangfeng Zheng, Xiujuan Wang, Bin Wu, Tong Wu

摘要

Feature subset selection is an efficient step to reduce the dimension of data, which remains an active research field in decades. In order to develop highly accurate and fast searching feature subset selection algorithms, a filter feature subset selection method combining maximal information entropy (MIE) and the maximal information coefficient (MIC) is proposed in this paper. First, a new metric mMIE-mMIC is defined to minimize the MIE among features while maximizing the MIC between the features and the class label. The mMIE-mMIC algorithm is designed to evaluate whether a candidate subset is valid for classification. Second, two searching strategies are adopted to identify a suitable solution in the candidate subset space, including the binary particle swarm optimization algorithm (BPSO) and sequential forward selection (SFS). Finally, classification is performed on UCI datasets to validate the performance of our work compared to 9 existing methods. Experimental results show that in most cases, the proposed method behaves equally or better than the other 9 methods in terms of classification accuracy and F1-score.

论文关键词:Feature subset selection, BPSO, MIC, SFS

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10489-019-01537-x