Bi-objective feature selection for discriminant analysis in two-class classification

Abstract

This work deals with the problem of selecting variables (features) that are subsequently used in discriminant analysis. The aim is to find, from a set of m variables, smaller subsets that enable an efficient classification of cases into two classes. We consider two objectives, each associated with the misclassification error in one class (type I and type II errors). Thus, we formulate a bi-objective problem and develop an algorithm that adapts the NSGA-II strategy to this specific problem in order to obtain a set of non-dominated solutions. Managing these two objectives separately (rather than jointly) allows an enhanced analysis of the obtained solutions by observing how they approximate the efficient frontier. This is especially significant when the two types of error have different levels of importance or when they cannot be compared. To illustrate these issues, several well-known databases from the literature are used, as well as an additional database of Spanish firms characterized by financial variables and two classes: “creditworthy” and “non-creditworthy”. Finally, we show that when the solutions obtained by our NSGA-II implementation are evaluated from the classic mono-objective perspective (minimizing the ratio of both error types jointly), they are better than those obtained by classic feature selection methods and similar to those provided by other recently published methods.
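
To make the notion of non-dominated solutions concrete, the following is a minimal Python sketch, not the authors' NSGA-II implementation. It assumes each candidate feature subset has already been scored by a pair (type I error rate, type II error rate), that class 0 errors are counted as type I and class 1 errors as type II, and that both classes are present in the data. The function names `error_rates`, `dominates`, and `non_dominated` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# (type I error rate, type II error rate); the class-to-type mapping is an assumption
ErrorPair = Tuple[float, float]


def error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> ErrorPair:
    """Per-class misclassification rates for labels 0 and 1.

    Assumes both classes appear at least once in y_true.
    """
    n0 = sum(1 for t in y_true if t == 0)
    n1 = sum(1 for t in y_true if t == 1)
    e0 = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    e1 = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return (e0 / n0, e1 / n1)


def dominates(a: ErrorPair, b: ErrorPair) -> bool:
    """True if a is no worse than b in both errors and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def non_dominated(points: List[ErrorPair]) -> List[ErrorPair]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical error pairs for five candidate feature subsets
candidates = [(0.10, 0.30), (0.20, 0.20), (0.30, 0.10), (0.25, 0.25), (0.10, 0.35)]
front = non_dominated(candidates)
```

Here `front` keeps the three pairs that trade one error type against the other, while `(0.25, 0.25)` and `(0.10, 0.35)` are discarded because another candidate is at least as good on both errors. Keeping the two errors separate in this way is what lets a decision maker choose a subset according to the relative importance of each error type.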

Keywords: Classification, Discriminant analysis, Feature selection, Bi-objective optimization, NSGA-II algorithm

Review history: Received 28 August 2012, Revised 19 December 2012, Accepted 20 January 2013, Available online 9 February 2013.

Official paper URL: https://doi.org/10.1016/j.knosys.2013.01.019