CDBH: A clustering and density-based hybrid approach for imbalanced data classification

Authors:

Highlights:

• The proposed hybrid method performs over-sampling and under-sampling effectively.

• The sampling process is done quickly and efficiently by clustering.

• Roulette wheel selection picks the most suitable samples probabilistically.

• The proposed method does not require any parameters to be set.

• The proposed method reaches an imbalance ratio of IR = 1 for all imbalanced data sets.
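
The highlights mention two ideas that a short example can make concrete: the imbalance ratio (IR) and roulette wheel selection. The sketch below is illustrative only and is not the paper's implementation. It assumes IR is defined as the majority class size divided by the minority class size, and it assumes each candidate sample carries a non-negative weight (for instance a density score); the function names `imbalance_ratio` and `roulette_wheel_select` and the weights themselves are hypothetical.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority class size divided by minority class size (assumed IR definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def roulette_wheel_select(weights, k, rng):
    """Draw k indices with replacement, each with probability weight / sum(weights).

    `weights` are hypothetical per-sample scores; how the paper derives
    them is not stated in the highlights.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must contain at least one positive value")
    picks = []
    for _ in range(k):
        r = rng.random() * total       # spin the wheel: a point in [0, total)
        cumulative = 0.0
        chosen = len(weights) - 1      # fallback in case of floating-point rounding
        for i, w in enumerate(weights):
            cumulative += w
            if r < cumulative:         # the point falls in sample i's slice
                chosen = i
                break
        picks.append(chosen)
    return picks
```

A call such as `roulette_wheel_select([0.2, 0.5, 0.3], k=10, rng=random.Random(42))` returns ten indices, with index 1 drawn most often on average. A data set is balanced under the assumed definition when `imbalance_ratio(labels)` equals 1.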

Keywords: Imbalanced data, Hybrid methods, Clustering, K-means algorithm, Dense samples, Roulette wheel selection operator

Review history: Received 23 April 2020, Revised 14 September 2020, Accepted 15 September 2020, Available online 28 September 2020, Version of Record 30 September 2020.

Official paper URL: https://doi.org/10.1016/j.eswa.2020.114035