A comparative study of nonlinear manifold learning methods for cancer microarray data classification

作者:

Highlights:

摘要

The paper presents an empirical comparison of the most prominent nonlinear manifold learning techniques for dimensionality reduction in the context of high-dimensional microarray data classification. In particular, we assessed the performance of six methods: isometric feature mapping, locally linear embedding, Laplacian eigenmaps, Hessian eigenmaps, local tangent space alignment and maximum variance unfolding. Unlike previous studies on the subject, the experimental framework adopted in this work properly extends to dimensionality reduction the supervised learning paradigm, by regarding the test set as an out-of-sample set of new points which are excluded from the manifold learning process. This in order to avoid a possible overestimate of the classification accuracy which may yield misleading comparative results. The different empirical approach requires the use of a fast and effective out-of-sample embedding method for mapping new high-dimensional data points into an existing reduced space. To this aim we propose to apply multi-output kernel ridge regression, an extension of linear ridge regression based on kernel functions which has been recently presented as a powerful method for out-of-sample projection when combined with a variant of isometric feature mapping. Computational experiments on a wide collection of cancer microarray data sets show that classifiers based on Isomap, LLE and LE were consistently more accurate than those relying on HE, LTSA and MVU. In particular, under different experimental conditions LLE-based classifier emerged as the most effective method whereas Isomap algorithm turned out to be the second best alternative for dimensionality reduction.

论文关键词:Nonlinear dimensionality reduction,Manifold learning,Classification,Cancer microarray data

论文评审过程:Available online 30 October 2012.

论文官网地址:https://doi.org/10.1016/j.eswa.2012.10.044