Toward a tight upper bound for the error probability of the binary Gaussian classification problem

作者:

Highlights:

摘要

It is well known that the error probability, of the binary Gaussian classification problem with different class covariance matrices, cannot be generally evaluated exactly because of the lack of closed-form expression. This fact pointed out the need to find a tight upper bound for the error probability. This issue has been for more than 50 years ago and is still of interest. All derived upper-bounds are not free of flaws. They might be loose, computationally inefficient particularly in highly dimensional situations, or excessively time consuming if high degree of accuracy is desired. In this paper, a new technique is developed to estimate a tight upper bound for the error probability of the well-known binary Gaussian classification problem with different covariance matrices. The basic idea of the proposed technique is to replace the optimal Bayes decision boundary with suboptimal boundaries which provide an easy-to-calculate upper bound for the error probability. In particular, three types of decision boundaries are investigated: planes, elliptic cylinders, and cones. The new decision boundaries are selected in such a way as to provide the tightest possible upper bound. The proposed technique is found to provide an upper bound, tighter than many of the often used bounds such as the Chernoff bound and the Bayesian-distance bound. In addition, the computation time of the proposed bound is much less than that required by the Monte-Carlo simulation technique. When applied to real world classification problems, obtained from the UCI repository [H. Chernoff, A measure for asymptotic efficiency of a hypothesis based on a sum of observations, Ann. Math. Statist. 23 (1952) 493–507.], the proposed bound was found to provide a tight bound for the analytical error probability of the quadratic discriminant analysis (QDA) classifier and a good approximation to its empirical error probability.

论文关键词:Binary classification,Bayesian decision rule,Decision boundary,Error probability,Monte-Carlo simulations,Multivariate normal distribution,Quadratic surfaces

论文评审过程:Received 22 March 2007, Revised 23 October 2007, Accepted 26 October 2007, Available online 19 November 2007.

论文官网地址:https://doi.org/10.1016/j.patcog.2007.10.028