OFES: Optimal feature evaluation and selection for multi-class classification

Authors:

Highlights:

Abstract

The complexity and accuracy of classification algorithms depend largely on the size and quality of the feature set used to build the classifiers. Feature evaluation and selection are critical steps for choosing a small set of high-quality features from which accurate and efficient classifiers can be built, since low-quality features both degrade classification results and increase the complexity of classification algorithms. Current popular feature selection algorithms are not sufficient for selecting a set of high-quality features and discarding low-quality ones, especially for streaming data. This paper proposes a novel and efficient approach, optimal feature evaluation and selection (OFES), to evaluate and select high-quality features for multi-class classification. OFES first measures the difference between every pair of classes with respect to the feature being evaluated. It then defines two quantitative measures to evaluate the quality of the feature and to identify high-quality features. We apply OFES to a multi-class classification application that identifies users based on their arm movement patterns. Compared with other popular feature evaluation and selection approaches, such as Information Gain Feature Ranking and Random Projections with Matlab feature ranking, OFES identifies a set of high-quality features that improves classification accuracy regardless of the classification algorithm used. It also scales well as the number of classes increases and yields a higher accuracy of 95%.
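
The two-step idea described above (compare every pair of classes on one feature, then summarize those comparisons into quality measures) can be illustrated with a minimal Python sketch. The abstract does not define the pairwise difference or the two quantitative measures, so everything here is an assumption made for illustration: the pairwise score is the gap between class means divided by the average class spread, the two summaries are the average and the worst-case pairwise score, and the names `pairwise_class_difference`, `evaluate_feature`, and `rank_features` are hypothetical. It should not be read as the OFES algorithm itself.

```python
from itertools import combinations
from statistics import mean, pstdev


def pairwise_class_difference(values_a, values_b):
    """Assumed separation score: gap between class means over the average spread."""
    spread = (pstdev(values_a) + pstdev(values_b)) / 2
    gap = abs(mean(values_a) - mean(values_b))
    if spread == 0:
        # Both classes are constant on this feature: perfectly separated or identical.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def evaluate_feature(values, labels):
    """Return (average, worst-case) pairwise class difference for one feature.

    These two summaries are illustrative stand-ins, not the paper's measures.
    Requires at least two distinct class labels.
    """
    by_class = {}
    for value, label in zip(values, labels):
        by_class.setdefault(label, []).append(value)
    scores = [
        pairwise_class_difference(by_class[a], by_class[b])
        for a, b in combinations(sorted(by_class), 2)
    ]
    return sum(scores) / len(scores), min(scores)


def rank_features(columns, labels):
    """Sort feature names by worst-case score, then average score, best first."""
    results = {name: evaluate_feature(vals, labels) for name, vals in columns.items()}
    return sorted(results, key=lambda n: (results[n][1], results[n][0]), reverse=True)


if __name__ == "__main__":
    # Toy data with three classes; feature names are made up for the example.
    labels = ["a", "a", "b", "b", "c", "c"]
    columns = {
        "noise": [1.0, 9.0, 2.0, 8.0, 1.5, 8.5],
        "arm_speed": [1.0, 1.2, 5.0, 5.2, 9.0, 9.2],
    }
    print(rank_features(columns, labels))
```

Ranking by the worst-case pair first is one possible design choice: a feature that separates most classes well but leaves two classes indistinguishable is penalized, which matters more as the number of classes grows.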

Keywords: Feature evaluation, Feature selection, Classification

Review history: Received 8 November 2020, Revised 15 December 2021, Accepted 9 March 2022, Available online 15 March 2022, Version of Record 28 March 2022.

Official URL: https://doi.org/10.1016/j.datak.2022.102007