Imbalance: Oversampling algorithms for imbalanced classification in R

Authors:

Highlights:

Abstract

Handling imbalanced datasets in classification tasks is an active research topic. The main reason is that standard classification algorithms can achieve a lower success rate when identifying minority class instances. Among the solutions proposed for this problem, data-level techniques have shown robust behavior. This paper introduces the imbalance package. Written in R and C++ and available from the CRAN repository, the library includes recent, relevant oversampling algorithms that improve the quality of imbalanced datasets before a learning task is performed. The main features of the package, together with some illustrative examples of its use, are detailed throughout this manuscript.
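To make the idea of data-level oversampling concrete, the following is a minimal sketch of SMOTE-style interpolation, written in plain Python for illustration. It is not the package's implementation or API: the function names `imbalance_ratio` and `smote_like`, the toy data, and the two-class assumption are all hypothetical choices made for this example.

```python
import random

def imbalance_ratio(labels, minority):
    """Minority count divided by majority count (assumes two classes)."""
    n_min = sum(1 for y in labels if y == minority)
    n_maj = len(labels) - n_min
    return n_min / n_maj

def smote_like(minority_points, n_new, k=2, seed=0):
    """Create n_new synthetic points, each placed on the segment between a
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_points)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority_points if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Toy two-class dataset (illustrative values only).
majority = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4), (0.4, 0.2)]
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]
labels = ["neg"] * len(majority) + ["pos"] * len(minority)

new_points = smote_like(minority, n_new=3)
balanced_labels = labels + ["pos"] * len(new_points)
print(imbalance_ratio(labels, "pos"))           # ratio before oversampling
print(imbalance_ratio(balanced_labels, "pos"))  # ratio after oversampling
```

Because the synthetic instances are generated before any classifier is trained, the learning algorithm itself stays unchanged, which is what distinguishes data-level techniques from algorithm-level ones.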

Keywords: Oversampling, Imbalanced classification, Machine learning, Preprocessing, SMOTE

Review history: Received 1 March 2018, Revised 14 June 2018, Accepted 25 July 2018, Available online 23 August 2018, Version of Record 31 October 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2018.07.035