MABUSE: A margin optimization based feature subset selection algorithm using boosting principles

Abstract

Feature subset selection is one of the most common procedures in machine learning tasks. In a broad sense, feature selection methods can be classified into three major groups: embedded, filter, and wrapper methods. Although wrappers may attain superior classification performance, they suffer from scalability issues, as they are more computationally expensive than the other methods. Filters are typically faster, and sometimes they are the only applicable methods when datasets are large. In the field of classification, margin optimization has proven to be an efficient approach for improving the generalization performance of many classification models. Although margins have been used as criteria for feature selection, the most advanced margin-based methods are, in most cases, wrappers, which suffer from high computational costs without outperforming the faster algorithms. In this paper, we propose MABUSE, a feature selection method that optimizes margins from a filter perspective. We consider a nearest-neighbor margin definition and, borrowing from the strategy of classifier ensemble construction using boosting, we develop a new method that uses a simple heuristic search. Extensive experimental validation demonstrates that our proposed approach outperforms state-of-the-art algorithms in both classification and reduction, at a computational cost similar to that of previous algorithms.
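The nearest-neighbor margin the abstract refers to is commonly defined, in Relief-style methods, as the difference between a sample's distance to its nearest example of a different class (near-miss) and its nearest example of the same class (near-hit); features that enlarge this margin are preferred. A minimal sketch of that margin computation under this common definition (function and variable names are illustrative, not taken from the paper):

```python
import numpy as np

def nn_margins(X, y):
    """Per-sample nearest-neighbor (hypothesis) margin:
    distance to the nearest opposite-class point (near-miss)
    minus distance to the nearest same-class point (near-hit).
    Positive margins indicate well-separated samples."""
    n = len(X)
    margins = np.empty(n)
    for i in range(n):
        # Euclidean distances from sample i to all samples.
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf  # exclude the sample itself
        same = (y == y[i])
        near_hit = d[same].min()
        near_miss = d[~same].min()
        margins[i] = near_miss - near_hit
    return margins

# Two well-separated clusters: every margin should be positive.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 1, 1])
print(nn_margins(X, y))
```

A margin-based filter can then score a candidate feature subset by the margins it induces on the projected data, avoiding the repeated classifier training that makes wrappers expensive.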

Keywords: Feature subset selection, Filter and hybrid methods, Margin optimization

Article history: Received 1 March 2022, Revised 22 June 2022, Accepted 22 July 2022, Available online 30 July 2022, Version of Record 12 August 2022.

DOI: https://doi.org/10.1016/j.knosys.2022.109529