DF-SVM: a decision forest constructed on artificially enlarged feature space by support vector machine

Authors: M. Faisal Zaman, Hideo Hirose

Abstract

Enlarging the feature space of the base tree classifiers in a decision forest by means of informative features extracted from an additional predictive model is advantageous for classification tasks. In this paper, we have empirically examined the performance of this type of decision forest with three different base tree classifier models: (1) the full decision tree, (2) the eight-node decision tree and (3) the two-node decision tree (or decision stump). The hybrid decision forests with these base classifiers are trained on resampled training sets of nine different sizes. We have examined the performance of all these ensembles from different points of view: we have studied the bias-variance decomposition of the misclassification error of the ensembles, and we have then investigated the amount of dependence and the degree of uncertainty among the base classifiers of these ensembles using information-theoretic measures. The experiment was designed to find out (1) the optimal training set size for each base classifier and (2) which base classifier is optimal for this kind of decision forest. In the final comparison, we have checked whether the subsampled version of the decision forest outperforms the bootstrapped version. All the experiments have been conducted with 20 benchmark datasets from the UCI Machine Learning Repository. The overall results clearly indicate that, with careful selection of the base classifier and training sample size, the hybrid decision forest can be an efficient tool for real-world classification tasks.
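The difference between the bootstrapped and subsampled versions compared in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions: a bootstrap sample draws as many indices as there are training points, with replacement, while a subsample draws a fraction of the points without replacement. The function names and the ratio of 0.5 are hypothetical choices for illustration; the abstract does not state which nine sizes were used.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n indices with replacement (a bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]


def subsample_indices(n, ratio, rng):
    """Draw round(ratio * n) distinct indices without replacement.

    `ratio` is a hypothetical subsample ratio in (0, 1].
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must lie in (0, 1]")
    size = max(1, round(ratio * n))
    return rng.sample(range(n), size)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 100                 # assumed training set size

boot = bootstrap_indices(n, rng)
sub = subsample_indices(n, 0.5, rng)

# A bootstrap sample usually contains repeated indices; a subsample never does.
print(len(boot), len(set(boot)))
print(len(sub), len(set(sub)))
```

Each base tree in the forest would then be fitted on the rows selected by one such index list. With subsampling, the ratio controls how much training data each tree sees.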

Keywords: Decision forest, Tree node size, Subsample ratio, Bias-variance decomposition, Empirical analysis

Paper URL: https://doi.org/10.1007/s10462-011-9291-1