Multi-label feature selection based on label correlations and feature redundancy

Authors:

Highlights:

Abstract

The task of multi-label feature selection (MLFS) is to reduce redundant information and generate an optimal feature subset from the original multi-label data. A variety of MLFS methods use a pseudo-label matrix to explore label correlations and identify the most informative features. Other methods account for feature redundancy by means of information-theoretic techniques, but no prior work unifies both in a single framework for feature selection. To remedy this deficiency, we propose a novel MLFS method based on label correlations and feature redundancy, named LFFS. Specifically, we first use ridge regression to create a feature selection matrix and a low-dimensional embedding, and impose the ℓ2,1-norm on the feature selection matrix. Then, the low-dimensional embedding is used to mine label correlations while preserving the global and local structure of the original label space. Finally, cosine similarity is employed to analyze feature redundancy, so as to generate a low-redundancy feature subset. Based on this process, we design an objective function together with an optimization solution. Comprehensive experimental results demonstrate the effectiveness and superiority of the proposed method LFFS over ten competing methods.
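
The ingredients named in the abstract (ridge regression, a row-sparsity norm on the feature selection matrix, and cosine similarity between features) can be illustrated with a small NumPy sketch. This is not the LFFS objective or its optimization procedure, which the abstract does not spell out. It is a simplified greedy heuristic written only for illustration: the function names, the parameters `alpha` and `beta`, and the scoring rule (row norm of the ridge weights minus mean absolute cosine similarity to the already selected features) are all assumptions, and the low-dimensional label embedding is omitted entirely.

```python
import numpy as np


def l21_norm(W):
    """l2,1-norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def ridge_weights(X, Y, alpha=1.0):
    """Closed-form ridge regression, W = (X^T X + alpha I)^{-1} X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)


def cosine_similarity_matrix(X, eps=1e-12):
    """Pairwise cosine similarity between the feature columns of X."""
    norms = np.linalg.norm(X, axis=0)
    Xn = X / np.maximum(norms, eps)  # guard against all-zero columns
    return Xn.T @ Xn


def greedy_select(X, Y, k, alpha=1.0, beta=1.0):
    """Illustrative heuristic (an assumption, not the paper's solver).

    Relevance of a feature is the norm of its row in the ridge weight
    matrix; redundancy is its mean absolute cosine similarity to the
    features selected so far. Features are added one at a time.
    """
    W = ridge_weights(X, Y, alpha)
    relevance = np.linalg.norm(W, axis=1)
    S = np.abs(cosine_similarity_matrix(X))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        redundancy = S[:, selected].mean(axis=1)
        score = relevance - beta * redundancy
        score[selected] = -np.inf  # never pick a feature twice
        selected.append(int(np.argmax(score)))
    return selected


# Toy data: feature 1 duplicates feature 0; feature 2 is independent.
rng = np.random.default_rng(0)
f0 = rng.normal(size=(50, 1))
f2 = rng.normal(size=(50, 1))
X_demo = np.hstack([f0, f0, f2])
Y_demo = np.hstack([f0 + f2, f2])  # two synthetic label columns
chosen = greedy_select(X_demo, Y_demo, k=2, alpha=1.0, beta=10.0)
```

On this toy data the duplicated pair contributes at most one feature to `chosen`, because the second copy has cosine similarity close to 1 with the first and is penalized accordingly. A joint objective, as described in the abstract, would instead learn the feature selection matrix with the sparsity and redundancy terms optimized together.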

Keywords: Multi-label learning, Feature selection, Label correlations, Feature redundancy, Optimization framework

Article history: Received 24 October 2021, Revised 17 January 2022, Accepted 19 January 2022, Available online 25 January 2022, Version of Record 3 February 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.108256