Supervised heterogeneous feature transfer via random forests

Authors:

Abstract

Transfer learning across heterogeneous feature spaces is difficult in practice because the features differ between domains and data points in different domains lack correspondence. In this paper, we present a novel supervised domain adaptation algorithm (SHDA-RF) that transfers knowledge from a data-rich source domain to a target domain with only a few training instances. The proposed method uses random forests to identify pivot features that bridge the two domains. The key idea is that every path in a decision tree that leads to a partition of the data is associated with a label distribution, and the label distributions that appear in both the source and target random forest models can serve as pivots for bridging the two domains. This information is used to generate a sparse feature transformation matrix, which maps patterns from the source feature space to the target feature space. The target model is then retrained on the target data together with the projected source data. We conduct extensive experiments on diverse datasets of varying dimensionality and sparsity to verify that the proposed approach outperforms baseline and state-of-the-art transfer approaches.
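The pivot idea can be illustrated with a minimal sketch. It is not the SHDA-RF procedure itself, which the abstract does not specify in detail. The sketch assumes each tree path has already been summarized as a pair (feature indices tested on the path, label distribution at the leaf), treats two paths as sharing a pivot when their label distributions differ by at most a hypothetical L1 tolerance `tol`, and builds a row-normalized sparse source-to-target mapping from the features that co-occur on matched paths. All function names, the matching rule, and the normalization are assumptions made for illustration.

```python
from collections import defaultdict


def l1_distance(p, q):
    """L1 distance between two label distributions of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


def build_sparse_mapping(source_paths, target_paths, tol=0.2):
    """Build a sparse source-to-target feature mapping (illustrative only).

    Each path is a tuple (feature_indices, label_distribution).
    Paths whose label distributions are within `tol` (an assumed
    threshold) are treated as pivots linking the two domains.
    Returns a dict {(source_feature, target_feature): weight}.
    """
    counts = defaultdict(float)
    for s_feats, s_dist in source_paths:
        for t_feats, t_dist in target_paths:
            if l1_distance(s_dist, t_dist) <= tol:
                # Every feature pair on the matched paths gets one vote.
                for s in s_feats:
                    for t in t_feats:
                        counts[(s, t)] += 1.0

    # Normalize so the weights leaving each source feature sum to 1.
    totals = defaultdict(float)
    for (s, _), c in counts.items():
        totals[s] += c
    return {(s, t): c / totals[s] for (s, t), c in counts.items()}


def project(x_source, mapping, n_target):
    """Map one source-space vector into the target feature space."""
    x_target = [0.0] * n_target
    for (s, t), w in mapping.items():
        x_target[t] += w * x_source[s]
    return x_target


# Toy example: two source paths and two target paths, binary labels.
source_paths = [((0, 1), (0.9, 0.1)), ((2,), (0.2, 0.8))]
target_paths = [((0,), (0.85, 0.15)), ((1,), (0.25, 0.75))]

mapping = build_sparse_mapping(source_paths, target_paths)
projected = project([1.0, 2.0, 3.0], mapping, n_target=2)
```

In this toy setup, the projected source vectors would be appended to the few target training instances before retraining the target model, mirroring the final step described in the abstract.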

Keywords: Feature transfer learning, Heterogeneous domain adaptation, Random forests

Review history: Received 18 January 2017, Revised 17 August 2018, Accepted 20 November 2018, Available online 22 November 2018, Version of Record 29 November 2018.

Official paper URL: https://doi.org/10.1016/j.artint.2018.11.004