Incremental feature selection for dynamic hybrid data using neighborhood rough set

Authors:

Highlights:

Abstract

Feature selection with rough sets aims to delete redundant conditional features, typically from static data with a single feature type. However, traditional feature selection methods generally ignore two real-world scenarios: the conditional feature set may be hybrid, with missing, categorical, and numerical features coexisting in the data, and the object set may change dynamically, one object after another, over time. To deal with dynamic hybrid data with mixed-type features, we propose a neighborhood entropy-based incremental feature selection framework built on the neighborhood rough set model. In this paper, the dynamics of an object set covers both the change of a single object and the change of multiple objects. Accordingly, two incremental feature selection algorithms are developed for hybrid data, one for the dynamic change of a single object and one for the dynamic change of multiple objects. First, the neighborhood entropy, which serves as the feature evaluation criterion, is computed incrementally. On this basis, feature significance is computed incrementally and used to select candidate features in descending order. Meanwhile, a deletion strategy filters out redundant features from the selection results. Finally, experimental results on different real-life data sets demonstrate that the proposed incremental algorithms select features faster than the non-incremental algorithm while achieving comparable classification accuracy. In particular, when multiple objects are added to or deleted from the hybrid data, the incremental algorithm for multiple objects selects a feature subset more efficiently than the algorithm that handles the dynamic change of a single object.
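The idea of computing a neighborhood entropy incrementally can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract gives no formulas, so the hybrid neighborhood relation (numerical values within a radius `delta`, categorical values equal, missing values encoded as `None` and assumed to match anything) and the entropy form, -(1/n) Σ log(|δ(x_i) ∩ [x_i]_D| / |δ(x_i)|), are assumptions taken from common neighborhood rough set conventions. All names (`is_neighbor`, `NeighborhoodEntropy`, `batch_entropy`) are hypothetical.

```python
import math

def is_neighbor(x, y, features, delta, numeric):
    """True if x and y are neighbors on every feature in `features` (assumed relation)."""
    for f in features:
        a, b = x[f], y[f]
        if a is None or b is None:   # missing value: assumed to match anything
            continue
        if f in numeric:
            if abs(a - b) > delta:   # numerical: must lie within radius delta
                return False
        elif a != b:                 # categorical: must be equal
            return False
    return True

class NeighborhoodEntropy:
    """Caches per-object neighbor counts so that adding one object costs O(n)."""

    def __init__(self, features, delta, numeric):
        self.features, self.delta, self.numeric = features, delta, set(numeric)
        self.X, self.y = [], []
        self.size = []   # size of each object's neighborhood
        self.same = []   # neighbors of each object that share its decision label

    def add(self, x, label):
        size, same = 1, 1            # every object is its own neighbor
        for i, (xi, yi) in enumerate(zip(self.X, self.y)):
            # The relation is symmetric, so one check updates both objects.
            if is_neighbor(x, xi, self.features, self.delta, self.numeric):
                size += 1
                self.size[i] += 1
                if yi == label:
                    same += 1
                    self.same[i] += 1
        self.X.append(x)
        self.y.append(label)
        self.size.append(size)
        self.same.append(same)

    def entropy(self):
        n = len(self.X)
        if n == 0:
            return 0.0
        return -sum(math.log(s / m) for s, m in zip(self.same, self.size)) / n

def batch_entropy(X, y, features, delta, numeric):
    """Non-incremental reference: recomputes every neighborhood, O(n^2)."""
    n, total = len(X), 0.0
    for i in range(n):
        nbrs = [j for j in range(n)
                if is_neighbor(X[i], X[j], features, delta, numeric)]
        same = sum(1 for j in nbrs if y[j] == y[i])
        total += math.log(same / len(nbrs))
    return -total / n
```

Under these assumptions, feeding objects to `NeighborhoodEntropy.add` one at a time yields the same value as `batch_entropy` on the full data, while each arrival touches only the cached counts instead of recomputing all pairwise neighborhoods. Deleting an object would decrement the same counts symmetrically.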

Keywords: Feature selection, Rough sets, Attribute reduction, Incremental algorithm, Dynamic data

Review history: Received 21 August 2019, Revised 9 January 2020, Accepted 12 January 2020, Available online 25 January 2020, Version of Record 18 May 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.105516