Stability of feature selection algorithms: a study on high-dimensional spaces

Authors: Alexandros Kalousis, Julien Prados, Melanie Hilario

Abstract

With the proliferation of extremely high-dimensional data, feature selection algorithms have become indispensable components of the learning process. Strangely, despite extensive work on the stability of learning algorithms, the stability of feature selection algorithms has been relatively neglected. This study is an attempt to fill that gap by quantifying the sensitivity of feature selection algorithms to variations in the training set. We assess the stability of feature selection algorithms based on the stability of the feature preferences that they express in the form of weights or scores, ranks, or a selected feature subset. We examine a number of measures to quantify the stability of feature preferences and propose an empirical way to estimate them. We perform a series of experiments with several feature selection algorithms on a set of proteomics datasets. The experiments allow us to explore the merits of each stability measure and create stability profiles of the feature selection algorithms. Finally, we show how stability profiles can support the choice of a feature selection algorithm.
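The idea of estimating stability empirically can be illustrated with a minimal sketch for the subset form of feature preferences. It is not the authors' procedure: the mean-difference scorer, the 90% subsampling scheme, the Jaccard index as the similarity measure, and all function names (`score_features`, `top_k`, `jaccard`, `subset_stability`) are assumptions made for illustration only.

```python
import random
from itertools import combinations


def score_features(rows, labels):
    """Toy scorer (an assumption): absolute difference of the two class means.

    Assumes binary labels 0/1 and that both classes appear in `rows`.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return scores


def top_k(scores, k):
    """Return the indices of the k highest-scoring features as a set."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Overlap of two feature subsets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def subset_stability(rows, labels, k, n_resamples=10, fraction=0.9, seed=0):
    """Average pairwise Jaccard similarity of top-k subsets over subsamples.

    Each subsample is drawn without replacement; it must still contain
    both classes, which holds when each class exceeds (1 - fraction) of
    the data.
    """
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    size = int(fraction * len(rows))
    subsets = []
    for _ in range(n_resamples):
        sample = rng.sample(indices, size)
        sub_rows = [rows[i] for i in sample]
        sub_labels = [labels[i] for i in sample]
        subsets.append(top_k(score_features(sub_rows, sub_labels), k))
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A value near 1 means the selector picks almost the same features regardless of which training examples are left out; a value near 0 means the selected subset is highly sensitive to the training sample. The same resampling loop could compare weight vectors or rankings instead, with a correlation coefficient in place of the Jaccard index.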

Keywords: Feature selection, High dimensionality, Feature stability

Paper URL: https://doi.org/10.1007/s10115-006-0040-8