A framework for feature selection through boosting

Authors:

Highlights:

Abstract

As the dimensionality of datasets used in predictive modelling continues to grow, feature selection becomes increasingly important. Datasets with complex feature interactions and high levels of redundancy still pose a challenge to existing feature selection methods. We propose a novel feature selection framework that relies on boosting, i.e. sample re-weighting, to select sets of informative features in classification problems. The method builds on feature rankings derived from fast and scalable tree-boosting models, such as XGBoost. We compare the proposed method to standard feature selection algorithms on 9 benchmark datasets, and show that it reaches higher accuracy with fewer features on most of the tested datasets, and that the selected features have lower redundancy.
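The abstract only outlines the idea of combining sample re-weighting with tree-based feature rankings, not the algorithm itself. As a rough illustration of one plausible reading, the sketch below repeatedly fits a tree-boosting model, takes its top-ranked unseen features, and up-weights misclassified samples before the next round. It uses scikit-learn's `GradientBoostingClassifier` in place of XGBoost, and the function name, round count, and re-weighting rule are all assumptions, not the authors' method.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def boosted_feature_selection(X, y, n_rounds=3, k_per_round=2, seed=0):
    """Hypothetical sketch of boosting-style feature selection:
    each round fits a tree-boosting model under the current sample
    weights, keeps that round's top-ranked new features, and
    up-weights samples the model still misclassifies."""
    n_samples, _ = X.shape
    weights = np.full(n_samples, 1.0 / n_samples)
    selected = []
    for _ in range(n_rounds):
        model = GradientBoostingClassifier(random_state=seed)
        model.fit(X, y, sample_weight=weights)
        # Rank features by importance, highest first.
        ranking = np.argsort(model.feature_importances_)[::-1]
        new = [f for f in ranking if f not in selected][:k_per_round]
        selected.extend(new)
        # Assumed re-weighting rule: double the weight of
        # misclassified samples, then renormalise.
        wrong = model.predict(X) != y
        weights = np.where(wrong, weights * 2.0, weights)
        weights /= weights.sum()
    return selected
```

With `n_rounds=3` and `k_per_round=2` this returns 6 distinct feature indices; the real framework's ranking source, stopping rule, and weight update may differ.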

Keywords: Feature selection, Boosting, Ensemble learning, XGBoost

Article history: Received 19 February 2021, Revised 20 June 2021, Accepted 7 September 2021, Available online 16 September 2021, Version of Record 23 September 2021.

DOI: https://doi.org/10.1016/j.eswa.2021.115895