To regularize or not: Revisiting SGD with simple algorithms and experimental studies

Authors:

Highlights:

• l2-regularized stochastic gradient descent may be suboptimal for the true loss.

• A simple accumulated stochastic gradient algorithm with several advantages is proposed (see the sketch after this list).

• Detailed theoretical and experimental studies validate the claims.

• Insights into when to regularize stochastic gradient algorithms are provided.
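The full algorithm is not given on this page, so the following minimal Python sketch only illustrates the two ideas the highlights contrast: plain l2-regularized SGD versus an unregularized run whose iterates are accumulated into a running average (one plausible reading of "accumulated" SGD, in the spirit of Polyak–Ruppert averaging). The least-squares objective, the function name `sgd`, and all parameter values are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def sgd(X, y, lam=0.0, lr=0.01, epochs=5, accumulate=False, seed=0):
    """SGD for least-squares regression (illustrative objective).

    lam        -- l2 regularization strength; 0 disables regularization
    accumulate -- if True, return the running average of the iterates,
                  one plausible reading of "accumulated" SGD (assumption)
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    w_avg = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            # stochastic gradient of 0.5*(x_i.w - y_i)^2 + 0.5*lam*||w||^2
            grad = (X[i] @ w - y[i]) * X[i] + lam * w
            w -= lr * grad
            t += 1
            w_avg += (w - w_avg) / t   # incremental mean of the iterates
    return w_avg if accumulate else w

# Usage on synthetic regression data (illustrative):
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

w_reg = sgd(X, y, lam=0.1)          # l2-regularized SGD
w_acc = sgd(X, y, accumulate=True)  # unregularized, averaged iterates
```

Note that averaging the iterates adds no extra gradient computations per step, which is one reason an accumulated variant can be attractive at big-data scale.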


Keywords: Stochastic gradient descent, Regularization, Accumulated stochastic gradient, Big data, SVMs, Regularized regression

Article history: Received 16 August 2017, Revised 25 May 2018, Accepted 10 June 2018, Available online 15 June 2018, Version of Record 17 June 2018.

DOI: https://doi.org/10.1016/j.eswa.2018.06.026