On the optimal number of features in the classification of multivariate Gaussian data

Authors:

Highlights:

Abstract:

We consider the classification problem in the context of two equiprobable multivariate Gaussian densities with a common covariance matrix. The use of Anderson's W statistic for classification results in the existence of an optimum number of features, popt, such that, for a given sample size, the average probability of misclassification decreases at first as the number of features is increased, attains a minimum at popt, and then starts increasing. We have examined this peaking phenomenon for several cases and provide expressions which relate popt to the number of available training samples and the Mahalanobis distance between the two populations. We also show that to prevent peaking, each additional feature's contribution to the Mahalanobis distance must be a certain proportion of the accumulated Mahalanobis distance.
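The peaking phenomenon described above can be reproduced empirically. The sketch below is a minimal Monte Carlo illustration, not the paper's analytical derivation: it assumes equiprobable classes with identity common covariance, and a hypothetical mean-difference sequence in which each added feature contributes a diminishing share of the total Mahalanobis distance, so that extra features eventually add more estimation noise than discriminating power. The plug-in linear discriminant built from sample means and the pooled covariance plays the role of Anderson's W statistic.

```python
import numpy as np

rng = np.random.default_rng(0)

def error_rate(p, n_train=10, n_test=2000, trials=30):
    """Average test misclassification of a plug-in linear discriminant
    (sample means + pooled covariance) with p features and n_train
    training samples per class."""
    # Hypothetical per-feature mean differences: delta_j = 1/sqrt(j),
    # so feature j adds 1/j to the squared Mahalanobis distance --
    # a diminishing contribution, which invites peaking.
    delta = 1.0 / np.sqrt(np.arange(1, p + 1))
    mu = delta / 2.0  # class 1 centered at +mu, class 2 at -mu

    errs = []
    for _ in range(trials):
        # Training samples from each class (identity covariance).
        x1 = rng.normal(mu, 1.0, size=(n_train, p))
        x2 = rng.normal(-mu, 1.0, size=(n_train, p))
        m1, m2 = x1.mean(axis=0), x2.mean(axis=0)

        # Pooled covariance estimate (common-covariance assumption);
        # pinv guards against near-singularity when p approaches 2n-2.
        s = ((x1 - m1).T @ (x1 - m1) + (x2 - m2).T @ (x2 - m2)) / (2 * n_train - 2)
        w = np.linalg.pinv(s) @ (m1 - m2)   # discriminant direction
        thresh = w @ (m1 + m2) / 2.0        # midpoint decision threshold

        # Independent test samples; average the two class error rates.
        t1 = rng.normal(mu, 1.0, size=(n_test, p))
        t2 = rng.normal(-mu, 1.0, size=(n_test, p))
        e = ((t1 @ w < thresh).mean() + (t2 @ w > thresh).mean()) / 2.0
        errs.append(e)
    return float(np.mean(errs))

# Sweep the feature count: the error typically falls, bottoms out near
# some p_opt, then rises again as estimation error dominates.
rates = {p: error_rate(p) for p in (1, 2, 5, 10, 15)}
```

Printing `rates` for a fixed sample size traces the U-shaped error curve whose minimum defines popt; increasing `n_train` pushes the minimum to larger p, consistent with the dependence of popt on the number of training samples.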

Keywords: Multivariate Gaussian densities, Average probability of misclassification, Mahalanobis distance, W statistic, Training samples, Peaking phenomenon, Optimum number of features

Article history: Received 11 October 1977, Revised 30 March 1978, Available online 19 May 2003.

Article URL: https://doi.org/10.1016/0031-3203(78)90008-0