Dependence-biased clustering for variable selection with random forests

Authors:

Highlights:

• We introduce a novel conditional permutation measure for variable importance.

• This measure leverages inter-variable dependencies via biased K-means clustering.

• This measure makes it possible to select a small number of relevant, non-redundant variables.

• Extensive experimental results show that our variable selection approach is effective in practice.

Keywords: Variable selection, Random forest, Permutation importance, Regression, Classification, Clustering

Review history: Received 13 October 2018, Revised 29 May 2019, Accepted 21 July 2019, Available online 24 July 2019, Version of Record 5 August 2019.

Official paper URL: https://doi.org/10.1016/j.patcog.2019.106980