Unsupervised feature selection for image classification: A bipartite matching-based principal component analysis approach

Authors:

Highlights:

Abstract

In this paper, we propose a general unsupervised feature selection method named unsupervised feature selection using principal component analysis (UFSPCA). Repeated information makes a feature set redundant, and when features are correlated it is hard to tell which information is repeated. Accordingly, we first use PCA to create uncorrelated, orthogonal features and then calculate the similarities between the original features and these uncorrelated features. Next, we model the two feature sets (original and orthogonal) and the similarities between them as a weighted bipartite graph. Finally, we obtain a maximum-weight matching using the Hungarian algorithm. The original-feature vertices that appear in this matching are the selected features. To illustrate the optimality and efficiency of the proposed method, we evaluated its performance on five datasets using the KNN classifier and compared it with seven well-known unsupervised feature selection algorithms. The evaluation results show that UFSPCA is superior to the other seven algorithms.
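
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function name `ufspca_sketch`, the parameter `k` (number of principal components kept, which here equals the number of features selected), and the use of the absolute Pearson correlation as the edge weight are all assumptions made for the example. The maximum-weight matching is delegated to SciPy's `linear_sum_assignment`, which solves the same assignment problem that the Hungarian algorithm addresses.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def ufspca_sketch(X, k):
    """Return sorted indices of k selected columns of X (n_samples x n_features).

    Hypothetical helper: name, signature, and weighting are assumptions.
    """
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    Xc = X - X.mean(axis=0)

    # PCA via SVD: rows of Vt are principal directions, ordered by variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T  # scores on the first k components (uncorrelated features)

    # Edge weights of the bipartite graph: |Pearson correlation| between each
    # original feature (rows) and each principal component (columns).
    corr = np.corrcoef(Xc, Z, rowvar=False)[:d, d:]
    weights = np.abs(np.nan_to_num(corr))  # constant features get weight 0

    # Maximum-weight matching; each component is paired with a distinct feature.
    rows, _ = linear_sum_assignment(weights, maximize=True)
    return np.sort(rows)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 6))
    print(ufspca_sketch(X, 3))
```

Because the matching is one-to-one, no original feature can be chosen twice, which is what distinguishes this step from simply taking the best-correlated feature for each component independently.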

Keywords: PCA, Pearson correlation coefficient, Bipartite graph matching, Augmenting path, Hungarian algorithm

Review history: Received 28 October 2021, Revised 2 May 2022, Accepted 17 May 2022, Available online 25 May 2022, Version of Record 3 June 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.109085