LMNNB: Two-in-One imbalanced classification approach by combining metric learning and ensemble learning

Authors: Shaojie Qiao, Nan Han, Faliang Huang, Kun Yue, Tao Wu, Yugen Yi, Rui Mao, Chang-an Yuan

Abstract

In real-world applications of machine learning and cybernetics, data with an imbalanced class distribution or skewed class proportions are pervasive. When dealing with imbalanced data, traditional classification approaches may fail to learn a good classifier: in the learning phase, these algorithms are strongly affected by the skewed data distribution, and their classification performance consequently drops drastically. In this study, we propose a novel two-in-one algorithm for classifying imbalanced data that integrates metric learning and ensemble learning. First, we design a new metric learning algorithm for imbalanced data, called Large Margin Nearest Neighbors Balance (LMNNB). This method minimizes the distance between a sample and its similar neighbors belonging to the same class, while maximizing its distance from dissimilar neighbors belonging to different classes; this beneficial effect is obtained even when the data distribution is imbalanced. Through metric learning, a better classifier can thus be learned from imbalanced data. Second, we propose an ensemble learning algorithm to further improve classification performance, which combines multiple sub-classifiers and makes decisions by a soft voting strategy. Extensive experiments on real benchmark imbalanced datasets demonstrate the effectiveness of LMNNB combined with the ensemble algorithm (LMNNB-E) under several evaluation measures. The results show that LMNNB and LMNNB-E outperform state-of-the-art methods in classifying imbalanced data.
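To illustrate the two-in-one idea described in the abstract, the sketch below chains an LMNN-style metric learner with a soft-voting ensemble using the metric-learn and scikit-learn packages. The standard LMNN here is only a stand-in for the authors' LMNNB (which additionally compensates for class imbalance), and the choice of sub-classifiers is hypothetical, not taken from the paper.

```python
# Minimal sketch (not the authors' implementation): LMNN-style metric learning
# followed by a soft-voting ensemble, in the spirit of LMNNB-E.
from metric_learn import LMNN
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier


def build_metric_voting_classifier():
    # Step 1: learn a Mahalanobis metric that pulls same-class neighbors
    # together and pushes differently labeled points away (the LMNN objective).
    metric = LMNN()

    # Step 2: combine several sub-classifiers in the transformed space and
    # average their predicted class probabilities (soft voting).
    ensemble = VotingClassifier(
        estimators=[
            ("knn", KNeighborsClassifier(n_neighbors=3)),
            ("lr", LogisticRegression(max_iter=1000)),
            ("dt", DecisionTreeClassifier(max_depth=5)),
        ],
        voting="soft",
    )
    return make_pipeline(metric, ensemble)


# Usage:
#   clf = build_metric_voting_classifier()
#   clf.fit(X_train, y_train)
#   y_pred = clf.predict(X_test)
```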

Keywords: Machine learning, Imbalanced data, Large margin nearest neighbors, Metric learning, Ensemble learning, Classification


Paper link: https://doi.org/10.1007/s10489-021-02901-6