Two density-based sampling approaches for imbalanced and overlapping data
Authors:
Highlights:
• Introducing two density-based methods that balance the classes and eliminate overlapping and noisy data.
• Increasing minority-class learning quality without significantly harming majority-class learning.
• Creating a boundary between the classes while preserving their structures and shapes as much as possible.
• Introducing a density-based hybrid sampling method that balances the classes and produces a uniform distribution of data within them.
• Comprehensive evaluation of the proposed methods in comparison with other recent related works on a varied set of imbalanced datasets.
Keywords: Imbalanced dataset, Density, Undersampling, Oversampling, Overlapping
Article history: Received 6 August 2021, Revised 13 December 2021, Accepted 11 January 2022, Available online 29 January 2022, Version of Record 9 February 2022.
Paper URL: https://doi.org/10.1016/j.knosys.2022.108217