Linearly Combining Density Estimators via Stacking

Authors: Padhraic Smyth, David Wolpert

Abstract

This paper presents experimental results, on both real and artificial data, for combining unsupervised learning algorithms using stacking. Specifically, stacking is used to form a linear combination of finite mixture models and kernel density estimators for non-parametric multivariate density estimation. The method outperforms other strategies such as choosing the single best model based on cross-validation, combining with uniform weights, and even choosing the single best model by “cheating” and examining the test set. We also investigate (1) how the utility of stacking changes when one of the models being combined is the model that generated the data, (2) how the stacking coefficients of the models compare to the relative frequencies with which cross-validation chooses among the models, (3) visualization of combined “effective” kernels, and (4) the sensitivity of stacking to overfitting as model complexity increases.
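
The idea of linearly combining density estimators by stacking can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the abstract: the data are one-dimensional, the candidate models are Gaussian kernel density estimators that differ only in bandwidth (no mixture models), and the combination weights are fitted by a simple EM iteration on out-of-fold densities. The function names `kde_pdf`, `stack_weights`, and `stacked_pdf` are hypothetical.

```python
import numpy as np


def kde_pdf(train, query, bandwidth):
    """Gaussian kernel density estimate fit on `train`, evaluated at `query` (1-D)."""
    diffs = (query[:, None] - train[None, :]) / bandwidth
    kernels = np.exp(-0.5 * diffs**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)


def stack_weights(data, bandwidths, n_folds=5, n_iter=200, seed=0):
    """Estimate non-negative combination weights that sum to one.

    Assumption: weights maximize the cross-validated log-likelihood of the
    linear combination, found here by EM with the component densities fixed.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    folds = np.array_split(rng.permutation(n), n_folds)

    # oof[i, k] = density of point i under model k, trained without i's fold.
    oof = np.empty((n, len(bandwidths)))
    for held_out in folds:
        in_train = np.ones(n, dtype=bool)
        in_train[held_out] = False
        for k, h in enumerate(bandwidths):
            oof[held_out, k] = kde_pdf(data[in_train], data[held_out], h)
    oof = np.maximum(oof, 1e-300)  # guard against division by zero below

    weights = np.full(len(bandwidths), 1.0 / len(bandwidths))
    for _ in range(n_iter):
        # E-step: share of each point's combined density due to each model.
        resp = weights * oof
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: new weight of a model is its average share.
        weights = resp.mean(axis=0)
    return weights


def stacked_pdf(train, query, bandwidths, weights):
    """Weighted sum of the component densities, each refit on all of `train`."""
    dens = np.column_stack([kde_pdf(train, query, h) for h in bandwidths])
    return dens @ weights
```

For example, `w = stack_weights(data, [0.05, 0.3, 2.0])` followed by `stacked_pdf(data, grid, [0.05, 0.3, 2.0], w)` evaluates the combined estimate on `grid`. Because the weights are non-negative and sum to one, the combination is itself a valid density.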

Keywords: density estimation, stacking, kernel densities, cross-validation, mixture models

Paper URL: https://doi.org/10.1023/A:1007511322260