Model selection for primal SVM

Authors: Gregory Moore, Charles Bergeron, Kristin P. Bennett

Abstract

This paper introduces two types of nonsmooth optimization methods for selecting model hyperparameters in primal SVM models based on cross-validation. Unlike common grid search approaches to model selection, these methods are scalable in both the number of hyperparameters and the number of data points. Taking inspiration from linear-time primal SVM algorithms, scalability in model selection is achieved by working directly with the primal variables, without introducing any dual variables. The proposed implicit primal gradient descent (ImpGrad) method can utilize existing SVM solvers. Unlike prior methods for gradient descent in hyperparameter space, all work is done in the primal space, so no inversion of the kernel matrix is required. The proposed explicit penalized bilevel programming (PBP) approach optimizes the hyperparameters and parameters simultaneously. It solves the original cross-validation problem as a series of least squares regression problems with simple constraints in both the hyperparameter and parameter space. Computational results on least squares support vector regression problems with multiple hyperparameters establish that both the implicit and explicit methods perform well in terms of generalization and computational time. These methods apply directly to other learning tasks with differentiable loss and regularization functions. Both the implicit and explicit algorithms investigated represent powerful new approaches to solving large bilevel programs involving nonsmooth loss functions.
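To make the setup concrete, cross-validation model selection can be written as a bilevel program: an outer problem over the hyperparameters and, for each fold, an inner training problem over the model parameters. The notation below is a generic sketch of this standard formulation, not the paper's exact one:

```latex
% T-fold cross-validation as a bilevel program (illustrative notation):
% \mathcal{V}_t and \mathcal{T}_t are the validation and training indices
% of fold t, \ell the loss, \Omega the regularizer, \lambda the hyperparameters.
\begin{align*}
\min_{\lambda \ge 0}\;
  & \sum_{t=1}^{T} \sum_{i \in \mathcal{V}_t} \ell\bigl(y_i,\, f(x_i; w^t)\bigr) \\
\text{s.t.}\;
  & w^t \in \arg\min_{w}\;
    \sum_{j \in \mathcal{T}_t} \ell\bigl(y_j,\, f(x_j; w)\bigr) + \lambda\,\Omega(w),
  \qquad t = 1, \dots, T.
\end{align*}
```

The implicit approach descends on the outer (validation) objective by differentiating the inner solution with respect to the hyperparameters. The sketch below illustrates only that general idea on the simplest smooth inner problem, ridge regression (a linear least squares regressor with no bias term) and a single regularization hyperparameter; it is a toy under those assumptions, not the paper's ImpGrad algorithm:

```python
import numpy as np

def implicit_grad_descent(X_tr, y_tr, X_val, y_val, lam=1.0, lr=0.1, steps=50):
    """Toy implicit gradient descent on one hyperparameter (ridge inner problem)."""
    d = X_tr.shape[1]
    A, b = X_tr.T @ X_tr, X_tr.T @ y_tr
    for _ in range(steps):
        # Inner solve: w(lam) = argmin_w ||X_tr w - y_tr||^2 + lam ||w||^2
        H = A + lam * np.eye(d)
        w = np.linalg.solve(H, b)
        # Implicit differentiation of H w = b gives dw/dlam = -H^{-1} w
        dw = -np.linalg.solve(H, w)
        # Chain rule on the validation loss 0.5 * ||X_val w - y_val||^2
        grad = (X_val @ w - y_val) @ (X_val @ dw)
        # Projected gradient step keeps the hyperparameter nonnegative
        lam = max(lam - lr * grad, 1e-8)
    return lam, w
```

Because the inner problem here is a linear system, the implicit derivative comes for free from one extra solve with the same matrix; this mirrors why working in the primal avoids forming or inverting a kernel matrix.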

Keywords: Model selection, Cross-validation, Nonconvex optimization, Bilevel programming, Support vector machines

DOI: https://doi.org/10.1007/s10994-011-5246-7