Joint feature and instance selection using manifold data criteria: application to image classification

作者:Fadi Dornaika

摘要

In many pattern recognition applications feature selection and instance selection can be used as two data preprocessing methods that aim at reducing the computational cost of the learning process. Moreover, in some cases, feature subset selection can improve the classification performance. Feature selection and instance selection can be interesting since the choice of features and instances greatly influence the performance of the learnt models as well as their training costs. In the past, unifying both problems was carried out by solving a global optimization problem using meta-heuristics. This paradigm not only does not exploit the manifold structure of data but can be computationally expensive. To the best of our knowledge, the joint use of sparse modeling representative and feature subset relevance have not been exploited by the joint feature and selection methods. In this paper, we target the joint feature and instance selection by adopting feature subset relevance and sparse modeling representative selection. More precisely, we propose three schemes for the joint feature and instance selection. The first is a wrapper technique while the two remaining ones are filter approaches. In the filter approaches, the search process adopts a genetic algorithm in which the evaluation is mainly given by a score that quantify the goodness of the features and instances. An efficient instance selection technique is used and integrated in the search process in order to adapt the instances to the candidate feature subset. We evaluate the performance of the proposed schemes using image classification where classifiers are the nearest neighbor classifier and support vector machine classifier. The study is conducted on five public image datasets. These experiments show the superiority of the proposed schemes over various baselines. The results confirm that the filter approaches leads to promising improvement on classification accuracy when both feature selection and instance selection are adopted.

论文关键词:Feature selection, Instance selection, Feature and instance selection, Data reduction, Linear discriminant analysis (LDA), Local discriminant embedding (LDE), Classification

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10462-020-09889-4