A survey of feature selection methods for Gaussian mixture models and hidden Markov models

Authors: Stephen Adams, Peter A. Beling

Abstract

Feature selection is the process of reducing the number of collected features to a relevant subset and is often used to combat the curse of dimensionality. This paper provides a review of the literature on feature selection techniques specifically designed for Gaussian mixture models (GMMs) and hidden Markov models (HMMs), two common parametric latent variable models. The primary contribution of this work is the collection and grouping of feature selection methods designed for GMMs and for HMMs. An additional contribution lies in outlining the connections between these two groups of methods. Feature selection methods for GMMs and HMMs are often treated as separate topics; in this survey, we propose that methods developed for one model can be adapted to the other. Further, we find that the number of feature selection methods for GMMs exceeds the number of methods for HMMs, and that a larger proportion of the HMM methods than of the GMM methods requires supervised data. We conclude that further research into unsupervised feature selection methods for HMMs is needed and that established methods for GMMs could be adapted to HMMs.

It should be noted that feature selection is also referred to as dimensionality reduction, variable selection, attribute selection, and variable subset reduction. In this paper, we make a distinction between dimensionality reduction and feature selection. Dimensionality reduction, which we do not consider, is any process that reduces the number of features used in a model and can include methods that transform features in order to reduce the dimensionality. Feature selection, by contrast, is a specific form of dimensionality reduction that eliminates features as inputs to the model. The primary difference is that dimensionality reduction can still require the collection of all the data sources in order to transform and reduce the feature set, whereas feature selection eliminates the need to collect the irrelevant data sources.
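To make this distinction concrete, the following is a minimal sketch (not from the paper) contrasting the two approaches when fitting a GMM with scikit-learn: PCA-based dimensionality reduction transforms all collected features, so every data source must still be gathered, whereas feature selection keeps only a subset of the original columns. The feature indices, component counts, and synthetic data are illustrative assumptions.

```python
# Sketch: dimensionality reduction vs. feature selection before fitting a GMM.
# Assumes scikit-learn; the selected column indices are hypothetical.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))  # 500 samples, 10 collected features

# Dimensionality reduction: PCA maps all 10 features into 3 components,
# so every original data source is still required at prediction time.
X_pca = PCA(n_components=3).fit_transform(X)
gmm_dr = GaussianMixture(n_components=2, random_state=0).fit(X_pca)

# Feature selection: retain only a subset of the original columns
# (here, hypothetically, features 0, 3, and 7); the discarded data
# sources never need to be collected.
selected = [0, 3, 7]
gmm_fs = GaussianMixture(n_components=2, random_state=0).fit(X[:, selected])

print("BIC (dimensionality reduction):", gmm_dr.bic(X_pca))
print("BIC (feature selection):      ", gmm_fs.bic(X[:, selected]))
```

In deployment, the second model requires only the three retained data sources, which is the practical advantage of feature selection emphasized in the abstract.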

Keywords: Feature selection, Gaussian mixture model, Hidden Markov model

Paper link: https://doi.org/10.1007/s10462-017-9581-3