Easy balanced mixing for long-tailed data

Authors:

Highlights:

Abstract

In long-tailed datasets, head classes account for most of the data, while tail classes have very few samples. This imbalance leads classifiers to overfit the head classes and creates a mismatch between the training and testing distributions, especially for tail classes. To address this, the paper proposes an easy balanced mixing framework, abbreviated EZBM, that fits long-tailed data and matches the training and testing distributions. EZBM uses a two-stage learning strategy: feature extraction followed by classification hyperplane adjustment. In the first stage, EZBM trains a ResNet backbone, which maps the input data into a new feature space, together with a fully connected layer that serves as the classifier. In the second stage, EZBM combines each training sample, in the feature space, with another sample drawn from a random class to generate a mixed sample that lies close to the head class. EZBM then adjusts the classification hyperplane so that it lies close to these mixed samples. In this way, EZBM biases the classification hyperplane toward the head class, which favors the recognition of tail samples. Experiments on long-tailed datasets demonstrate the effectiveness of EZBM.
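The second-stage mixing step can be illustrated with a minimal sketch. The abstract does not give the mixing formula, so everything specific here is an assumption: the function name `balanced_mix`, the convex combination with a fixed coefficient `lam`, the soft labels, and the choice to draw the partner class uniformly so that tail classes are picked as often as head classes. It is not the authors' implementation.

```python
import random
from collections import defaultdict


def balanced_mix(features, labels, lam=0.5, seed=0):
    """Mix each feature vector with a partner from a uniformly chosen class.

    Assumptions (not from the paper): a convex combination with a fixed
    coefficient `lam`, and a soft label weighted the same way.
    """
    rng = random.Random(seed)

    # Group sample indices by class so a partner can be drawn per class.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = sorted(by_class)

    mixed = []
    for x, y in zip(features, labels):
        partner_class = rng.choice(classes)       # every class equally likely
        j = rng.choice(by_class[partner_class])   # random sample of that class
        z = [lam * a + (1.0 - lam) * b for a, b in zip(x, features[j])]
        soft = {c: 0.0 for c in classes}
        soft[y] += lam
        soft[partner_class] += 1.0 - lam
        mixed.append((z, soft))
    return mixed


# Toy long-tailed data: six head samples (class 0), one tail sample (class 1).
features = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.0, 0.2],
            [0.1, 0.1], [0.2, 0.0], [1.0, 1.0]]
labels = [0, 0, 0, 0, 0, 0, 1]
mixed = balanced_mix(features, labels, lam=0.5, seed=0)
```

Because the partner class is drawn uniformly rather than in proportion to class frequency, the single tail sample takes part in roughly half of the mixed pairs in this toy setting. In the framework described above, a classifier would then be fine-tuned on such mixed features while the backbone from the first stage stays fixed; that training loop is omitted here.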

Keywords: Long-tailed data, Balanced mixing, Mixed sample, Feature extraction, Classification hyperplane adjustment

Review history: Received 17 October 2021, Revised 10 April 2022, Accepted 12 April 2022, Available online 25 April 2022, Version of Record 10 May 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.108816