Choosing Between Two Classification Learning Algorithms Based on Calibrated Balanced \(5\times 2\) Cross-Validated F-Test

Authors: Yu Wang, Jihong Li, Yanfang Li

Abstract

The \(5\times 2\) cross-validated F-test, based on five independent replications of 2-fold cross-validation, is recommended for choosing between two classification learning algorithms. However, the reuse of the same data within a \(5\times 2\) cross-validation causes the true degrees of freedom (DOF) of the test to be lower than those of the F(10, 5) distribution given in (Neural Comput 11:1885–1892, [1]). This makes the test prone to high type I and type II errors. Moreover, the random partitions used in \(5\times 2\) cross-validation make the DOF of the test difficult to analyze. Wang et al. (Neural Comput 26(1):208–235, [2]) proposed a blocked \(3 \times 2\) cross-validation that accounts for the correlation between any two 2-fold cross-validations. Building on this, the present study puts forward a calibrated balanced \(5\times 2\) cross-validated F-test following an F(7, 5) distribution, obtained by calibrating the DOF of the F(10, 5) distribution. Simulation and real-data studies demonstrate that the calibrated balanced \(5\times 2\) cross-validated F-test has lower type I and type II errors than the \(5\times 2\) cross-validated F-test following F(10, 5) in most cases.
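A minimal sketch may clarify what the calibration changes. The statistic below is assumed to follow the standard combined 5×2 cv form from [1]; under the calibration proposed in this paper, only the reference distribution changes, from F(10, 5) to F(7, 5). The function name and all numeric inputs are illustrative, not taken from the paper.

```python
def five_by_two_cv_f(diffs):
    """Combined 5x2 cv F statistic (form assumed from [1]).

    diffs: five pairs (p_i1, p_i2) of per-fold differences in error
    rate between the two algorithms, one pair per independent 2-fold
    cross-validation replication.
    """
    assert len(diffs) == 5, "exactly five 2-fold replications expected"
    # Numerator: sum of squared per-fold differences over all 10 folds.
    num = sum(p1 ** 2 + p2 ** 2 for p1, p2 in diffs)
    # Per-replication variance estimate: s_i^2 = (p_i1 - p_i2)^2 / 2,
    # since the replication mean is (p_i1 + p_i2) / 2.
    den = 2.0 * sum((p1 - p2) ** 2 / 2.0 for p1, p2 in diffs)
    return num / den

# Illustrative (made-up) per-fold error-rate differences:
diffs = [(0.02, 0.03), (0.01, 0.04), (0.03, 0.02), (0.02, 0.05), (0.04, 0.01)]
f_stat = five_by_two_cv_f(diffs)
# The original test compares f_stat with the 0.95 quantile of F(10, 5);
# the calibrated test uses F(7, 5) instead (e.g. via scipy.stats.f.ppf).
print(round(f_stat, 3))
```

Because the numerator DOF drops from 10 to 7, the calibrated test uses a slightly larger critical value, which is what reduces the type I error relative to the F(10, 5) version.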

Keywords: Test, Type I error, Type II error, Cross-validation, Classification learning algorithm


DOI: https://doi.org/10.1007/s11063-016-9569-z