Effective fuzzy joint mutual information feature selection based on uncertainty region for classification problem

Authors:

Highlights:

Abstract

Classification problems are common in real-world applications. Data quality, however, is a major challenge for classification models, especially when the data include irrelevant and redundant features. Feature selection (FS) is an effective preprocessing technique for improving data quality. Integrating information theory with fuzzy sets has produced powerful measures, such as fuzzy information measures, on which many FS methods have been built. However, estimating fuzzy information measures is costly in both space and runtime, and it may also be affected by bias between the certainty and uncertainty regions. This paper proposes a novel instance selection method based on the uncertainty region (ISUR) to overcome these limitations. A state-of-the-art FS method, fuzzy joint mutual information (FJMI), is then adapted to design an effective FS method called fuzzy joint mutual information feature selection based on uncertainty region (FJMIUR). The proposed method consists of two processes: instance selection and feature selection. The former selects the uncertainty region, which improves the estimation of fuzzy information measures and reduces the cost incurred, while the latter selects the most significant features. Comparative experiments on 20 real-world classification datasets, against well-known and state-of-the-art FS methods, were conducted to evaluate the effectiveness of FJMIUR. The results show that FJMIUR outperforms the compared methods in most cases in terms of six classification measures, average percentage of selected features, space, and runtime.
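
The two-process structure described in the abstract (instance selection, then feature selection) can be illustrated with a rough sketch. This is not the authors' ISUR or FJMI: the abstract does not specify either rule. As stated assumptions, the sketch substitutes a simple nearest-neighbour disagreement test for the uncertainty region and the classical crisp (non-fuzzy) joint mutual information criterion on discrete features for FJMI. The function names `boundary_mask`, `mutual_information`, `joint_code`, and `jmi_select` are hypothetical.

```python
import numpy as np


def boundary_mask(X, y, n_neighbors=3):
    """Stand-in for instance selection (assumption, NOT the paper's ISUR):
    keep instances with at least one differently labelled nearest neighbour."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances; an instance is never its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :n_neighbors]
    return (y[neighbours] != y[:, None]).any(axis=1)


def mutual_information(x, y):
    """Crisp mutual information I(X;Y) in bits for two discrete 1-D arrays."""
    _, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)  # contingency table of co-occurrences
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))


def joint_code(a, b):
    """Encode the pair (a, b) as a single discrete variable."""
    _, ai = np.unique(np.asarray(a).ravel(), return_inverse=True)
    _, bi = np.unique(np.asarray(b).ravel(), return_inverse=True)
    return ai * (bi.max() + 1) + bi


def jmi_select(X, y, k):
    """Greedy selection with the crisp JMI criterion (not the fuzzy variant):
    start from the most relevant feature, then repeatedly add the candidate f
    that maximises the sum over selected s of I((f, s); y)."""
    X = np.asarray(X)
    n_features = X.shape[1]
    relevance = [mutual_information(X[:, j], y) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        best, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            score = sum(
                mutual_information(joint_code(X[:, j], X[:, s]), y)
                for s in selected
            )
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

Under these assumptions, a two-stage run would first compute `mask = boundary_mask(X, y)` and then call `jmi_select(X[mask], y[mask], k)`, so that the information measures are estimated only on the retained instances. The pairwise distance matrix makes the stand-in instance selection quadratic in the number of instances, which is acceptable for a sketch but not for large datasets.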

Keywords: Feature selection, Instance selection, Fuzzy mutual information, Uncertainty region, Classification problem

Review history: Received 9 January 2022, Revised 22 August 2022, Accepted 11 September 2022, Available online 16 September 2022, Version of Record 30 September 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.109885