Fast stochastic second-order method logarithmic in condition number

Authors:

Highlights:

• Optimization is an important issue in machine learning because many machine learning models are formulated as optimization problems. Many kinds of machine learning algorithms, such as deep learning, logistic regression, and support vector machines, mainly focus on minimizing their empirical loss. Because data is growing explosively, large-scale optimization problems are challenging to solve. We propose a fast and almost condition-number-free algorithm for training machine learning models. This algorithm has several advantages.

• In contrast to previous work, which directly approximates the Hessian matrix with a sketched or subsampled Hessian, we use the sketched or subsampled Hessian as a preconditioner and obtain the direction vector with the preconditioned conjugate gradient (PCG) method (see the sketch after this list). Accordingly, we propose a novel stochastic second-order algorithm called Preconditioned Newton Conjugate Gradient with Sketched Hessian (PNCG). This strategy makes our algorithm almost independent of the condition number.

• We numerically demonstrate the effectiveness and robustness of our algorithms on the widely used ridge logistic regression model over several real datasets with different condition numbers. Our algorithms perform well compared with existing second-order methods in all cases, especially when the condition number is large.
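
To make the preconditioning strategy concrete, below is a minimal Python sketch of PNCG-style Newton iterations for ridge logistic regression, the model used in the experiments. The details are illustrative assumptions rather than the paper's exact algorithm: uniform row subsampling for the preconditioner, a Cholesky factorization of the subsampled Hessian, a fixed inner-iteration budget, and a unit step size; all function names are ours.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def pcg(hvp, b, M_inv, tol=1e-8, max_iter=50):
    # Preconditioned conjugate gradient: approximately solve H x = b,
    # where H is touched only through Hessian-vector products hvp(v)
    # and M_inv applies the inverse of the preconditioner.
    x = np.zeros_like(b)
    r = b.copy()                      # residual b - H x for x = 0
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Hp = hvp(p)
        alpha = rz / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = M_inv(r)
        rz_next = r @ z
        p = z + (rz_next / rz) * p
        rz = rz_next
    return x

def pncg_ridge_logreg(X, y, lam=1e-4, s=256, newton_iters=10, seed=0):
    # Newton's method for ridge logistic regression (labels in {-1, +1}):
    # the exact Newton system H d = -g is solved by PCG, preconditioned
    # with a Hessian built from only s subsampled rows of X.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(newton_iters):
        t = X @ w
        g = -(X.T @ (y * sigmoid(-y * t))) / n + lam * w   # full gradient
        D = sigmoid(t) * sigmoid(-t)                       # Hessian weights
        hvp = lambda v, D=D: X.T @ (D * (X @ v)) / n + lam * v
        idx = rng.choice(n, size=min(s, n), replace=False)
        H_s = X[idx].T @ (D[idx, None] * X[idx]) / len(idx) + lam * np.eye(d)
        chol = cho_factor(H_s)          # factor preconditioner once per step
        w += pcg(hvp, -g, lambda r: cho_solve(chol, r))    # unit Newton step
    return w

# Toy usage on synthetic data.
rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 50))
y = np.sign(X @ rng.standard_normal(50) + 0.1 * rng.standard_normal(2000))
w_hat = pncg_ridge_logreg(X, y)
```

The design point is that the subsampled Hessian is never used as the Newton matrix itself, only to precondition the exact system: when the preconditioner is accurate, the preconditioned system is well conditioned, so the inner PCG loop needs few iterations however ill-conditioned the original Hessian is. This is the near-independence of the condition number that the highlights describe.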

Keywords:

Article history: Received 21 November 2017, Revised 14 October 2018, Accepted 27 November 2018, Available online 30 November 2018, Version of Record 21 December 2018.

Paper link: https://doi.org/10.1016/j.patcog.2018.11.031