A critical assessment of imbalanced class distribution problem: The case of predicting freshmen student attrition
Authors:
Highlights:
• Class-imbalanced data is a common problem in many prediction tasks.
• Classification techniques can yield deceptively high prediction accuracy on an imbalanced dataset.
• All balancing techniques improved the prediction accuracy for the minority class.
• SVM combined with the SMOTE data-balancing technique achieved the best overall accuracy.
• A sensitivity analysis revealed the most important variables for attrition prediction.
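
The second highlight, that overall accuracy can look high while the minority class is poorly predicted, can be illustrated with a minimal sketch. The 80/20 class split, the labels (0 = retained, 1 = attrited), and the trivial majority-class predictor below are assumptions chosen for illustration only; they are not data or models from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical cohort (illustrative numbers, not from the paper):
# 80 retained students (label 0) and 20 who left (label 1).
y_true = [0] * 80 + [1] * 20

# A trivial predictor that always outputs the majority class.
y_pred = [0] * 100

overall = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)

print(f"overall accuracy: {overall:.2f}")
print(f"recall, retained (0): {recalls[0]:.2f}")
print(f"recall, attrited (1): {recalls[1]:.2f}")
```

In this toy setting the overall accuracy equals the majority-class share while every minority-class case is missed, which is why per-class measures are the more informative check before and after applying a balancing technique such as SMOTE.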
Keywords: Student retention, Attrition, Prediction, Imbalanced class distribution, SMOTE, Sampling, Sensitivity analysis
Review history: Available online 31 July 2013.
Official paper URL: https://doi.org/10.1016/j.eswa.2013.07.046