A tree-based algorithm for attribute selection

Authors: José Augusto Baranauskas, Oscar Picchi Netto, Sérgio Ricardo Nozawa, Alessandra Alaniz Macedo

Abstract

This paper presents an improved version of a decision tree-based filter algorithm for attribute selection. The algorithm can be used as a pre-processing step for induction algorithms in machine learning and data mining tasks. The filter was evaluated on thirty medical datasets with respect to its execution time, data compression ability, and AUC (Area Under the ROC Curve) performance. On average, our filter was faster than Relief-F but slower than both CFS and Gain Ratio. However, for low-density (high-dimensional) datasets, our approach selected fewer than 2% of all attributes without degrading performance when the selected attributes were subsequently evaluated with five different machine learning algorithms.
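The abstract does not describe the filter's steps, so the following is only a minimal sketch of the general idea behind a tree-based filter, not the authors' algorithm: grow a decision tree on the training data and keep only the attributes the tree actually splits on. The function names (`entropy`, `best_split`, `attributes_used_by_tree`), the information-gain criterion, the depth limit, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_split(rows, labels):
    """Return (gain, attribute index, threshold) of the best binary split, or None."""
    base = entropy(labels)
    best = None
    for attr in range(len(rows[0])):
        values = sorted(set(row[attr] for row in rows))
        # Candidate thresholds are midpoints between consecutive distinct values,
        # so both sides of every candidate split are non-empty.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[attr] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[attr] > threshold]
            remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            gain = base - remainder
            if gain > 1e-12 and (best is None or gain > best[0]):
                best = (gain, attr, threshold)
    return best


def attributes_used_by_tree(rows, labels, max_depth=5):
    """Grow a tree greedily and collect the indices of the attributes it splits on."""
    used = set()
    if max_depth == 0 or len(set(labels)) <= 1:
        return used  # depth limit reached or node is pure: stop growing
    split = best_split(rows, labels)
    if split is None:
        return used  # no split improves purity
    _, attr, threshold = split
    used.add(attr)
    for keep_left in (True, False):
        subset = [(r, y) for r, y in zip(rows, labels) if (r[attr] <= threshold) == keep_left]
        sub_rows = [r for r, _ in subset]
        sub_labels = [y for _, y in subset]
        used |= attributes_used_by_tree(sub_rows, sub_labels, max_depth - 1)
    return used


# Toy data (hypothetical): only attribute 2 separates the two classes cleanly.
rows = [
    [1.0, 5.0, 0.1, 7.0],
    [2.0, 6.0, 0.2, 3.0],
    [1.0, 5.0, 0.9, 7.0],
    [2.0, 6.0, 0.8, 3.0],
    [1.0, 6.0, 0.3, 3.0],
    [2.0, 5.0, 0.7, 7.0],
]
labels = [0, 0, 1, 1, 0, 1]

selected = sorted(attributes_used_by_tree(rows, labels))
fraction_kept = len(selected) / len(rows[0])
print(selected, fraction_kept)  # [2] 0.25
```

In this sketch the fraction of attributes kept plays the role of the data compression measure mentioned in the abstract. A downstream learner would then be trained on the selected columns only.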

Keywords: Attribute selection, Filter, Decision tree, High dimensional data, Data pre-processing

Paper URL: https://doi.org/10.1007/s10489-017-1008-y