Neighborhood rough sets with distance metric learning for feature selection

Authors:

Abstract

The neighborhood rough set is a useful mathematical tool for describing uncertainty in mixed data, and feature selection based on it has been widely studied. However, most existing methods use a single predefined distance function to construct neighborhood granules. Because datasets are not all generated in the same way and are often corrupted by noise, the same distance function may not be optimal for every dataset. To address this issue, this paper aims to improve the discriminative ability of the neighborhood rough set representation and to reduce its uncertainty. Distance metric learning is first introduced into the neighborhood rough set to optimize the structure of information granules. A novel model is then proposed, called the Neighborhood rough set Model based on Distance metric learning (NMD). NMD exploits distance metric learning so that samples with the same decision are closer to each other than samples with different decisions, which improves the consistency of neighborhood granules. The paper also presents the properties of NMD and formulates the importance of a feature. In addition, two feature selection algorithms are built upon the proposed NMD. Experimental results on real-world datasets demonstrate the effectiveness of the proposed feature selection algorithms and their superiority over the comparison baselines.
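To make the idea of "consistency of neighborhood granules" concrete, the following is a minimal sketch and not the paper's NMD algorithm. It assumes a Mahalanobis-style distance d(x, z) = sqrt((x - z)^T M (x - z)), a fixed radius `delta`, and the usual neighborhood dependency (the fraction of samples whose neighborhood contains only samples with the same decision). The function name `neighborhood_dependency`, the toy data, and the hand-picked matrix `M` are all hypothetical; in practice `M` would come from a metric learning step.

```python
import numpy as np

def neighborhood_dependency(X, y, delta, M=None):
    """Fraction of samples whose delta-neighborhood is pure in the decision label.

    M is a positive semidefinite matrix defining the distance
    sqrt((x - z)^T M (x - z)); M=None falls back to the Euclidean distance.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = X.shape[1]
    if M is None:
        M = np.eye(d)
    diff = X[:, None, :] - X[None, :, :]              # pairwise differences, shape (n, n, d)
    sq = np.einsum("ijk,kl,ijl->ij", diff, M, diff)   # squared distances under M
    dist = np.sqrt(np.maximum(sq, 0.0))
    neighbors = dist <= delta                         # neighborhood granule of each sample
    same_decision = y[:, None] == y[None, :]
    # A sample is consistent if every neighbor shares its decision.
    consistent = np.all(~neighbors | same_decision, axis=1)
    return float(consistent.mean())

# Toy data (assumption): feature 0 separates the decisions, feature 1 does not.
X = [[0.0, 0.0], [0.4, 0.0], [0.0, 1.0], [0.4, 1.0]]
y = [0, 1, 0, 1]

dep_euclid = neighborhood_dependency(X, y, delta=0.5)
# Hand-picked stand-in for a learned metric: stretch feature 0, ignore feature 1.
dep_metric = neighborhood_dependency(X, y, delta=0.5, M=np.diag([4.0, 0.0]))
print(dep_euclid, dep_metric)
```

With the Euclidean distance, every neighborhood in this toy example mixes both decisions. Under the reweighted metric, each neighborhood contains only same-decision samples, which is the kind of effect the abstract attributes to metric learning.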

Keywords: Neighborhood rough sets, Neighborhood relation, Distance metric learning, Feature selection

Review history: Received 27 August 2020, Revised 15 April 2021, Accepted 20 April 2021, Available online 21 April 2021, Version of Record 27 April 2021.

Official paper URL: https://doi.org/10.1016/j.knosys.2021.107076