Relevance assignation feature selection method based on mutual information for machine learning

Abstract

As the subjects and environments of machine learning grow more complex, feature selection methods are used increasingly often as an effective means of dimensionality reduction. However, existing feature selection methods struggle to balance the accuracy of relevance evaluation against search efficiency. To address this, the characteristics of the relevance between a feature set and the classification result are analyzed. We then propose the Relevance Assignation Feature Selection (RAFS) method, based on mutual information theory, which assigns the relevance evaluation according to redundancy. With this method, we can estimate the contribution of each feature in a feature set; this contribution is regarded as the value of the feature and is used as the heuristic index when searching for relevant features. A special dataset (“Grid World”) with strongly interacting features is designed. Using Grid World and six natural datasets, the proposed method is compared with six other feature selection methods. Results show that on the Grid World dataset, RAFS finds the correct relevant features with a probability above 90%, much higher than the other methods. On the six other datasets, RAFS also achieves the best classification accuracy.
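
To illustrate why strongly interacting features make per-feature relevance scores unreliable, the following minimal sketch estimates mutual information on a toy XOR dataset. It is not the RAFS method and not the paper's Grid World dataset; the function `mutual_information`, the toy data, and all variable names are assumptions introduced here for illustration only.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    # I(X;Y) = sum over (x, y) of p(x,y) * log2(p(x,y) / (p(x) * p(y)))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Hypothetical toy data: three binary features, every combination once.
# The label depends only on the interaction of f1 and f2 (XOR); f3 is irrelevant.
rows = list(product([0, 1], repeat=3))
f1 = [r[0] for r in rows]
f2 = [r[1] for r in rows]
f3 = [r[2] for r in rows]
label = [a ^ b for a, b in zip(f1, f2)]

# Scored one at a time, no feature carries information about the label.
mi_f1 = mutual_information(f1, label)
mi_f2 = mutual_information(f2, label)
mi_f3 = mutual_information(f3, label)

# Scored as a set, f1 and f2 together determine the label completely.
mi_pair = mutual_information(list(zip(f1, f2)), label)

print(f"I(f1; y) = {mi_f1:.3f} bits")
print(f"I(f2; y) = {mi_f2:.3f} bits")
print(f"I(f3; y) = {mi_f3:.3f} bits")
print(f"I(f1, f2; y) = {mi_pair:.3f} bits")
```

In this toy case each individual score is zero while the pair carries one full bit, so a ranking based on single-feature mutual information cannot separate the two relevant features from the irrelevant one. That gap is the kind of situation in which a set-level relevance measure, with a rule for assigning that relevance to individual features, becomes useful.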

Keywords: Feature selection, Kernel function, Mutual information, Redundancy evaluation, Relevance assignation

Review history: Received 1 November 2019, Revised 17 August 2020, Accepted 19 August 2020, Available online 21 September 2020, Version of Record 24 September 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.106439