Nesterov's accelerated gradient and momentum as approximations to regularised update descent

A. Botev, G. Lever, D. Barber - 2017 International joint conference on neural networks (IJCNN), 2017 - ieeexplore.ieee.org
We present a unifying framework for adapting the update direction in gradient-based
iterative optimization methods. As natural special cases we re-derive classical momentum
and Nesterov's accelerated gradient method, lending a new intuitive interpretation to the
latter algorithm. We show that a new algorithm, which we term Regularised Gradient
Descent, can converge more quickly than either Nesterov's algorithm or the classical
momentum algorithm.
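
The abstract contrasts classical momentum with Nesterov's accelerated gradient; as a quick point of reference, the sketch below shows the two standard update rules side by side on a toy quadratic. It does not implement the paper's Regularised Gradient Descent, and the objective, step size, and momentum coefficient are illustrative assumptions rather than values taken from the paper.

```python
# Minimal sketch (not from the paper): classical momentum vs. Nesterov's
# accelerated gradient on a toy quadratic. Hyperparameters are illustrative.
import numpy as np

def grad(theta):
    # Gradient of the toy quadratic f(theta) = 0.5 * theta^T A theta.
    A = np.diag([1.0, 10.0])
    return A @ theta

def classical_momentum(theta, steps=100, lr=0.05, mu=0.9):
    v = np.zeros_like(theta)
    for _ in range(steps):
        v = mu * v - lr * grad(theta)           # gradient evaluated at the current point
        theta = theta + v
    return theta

def nesterov(theta, steps=100, lr=0.05, mu=0.9):
    v = np.zeros_like(theta)
    for _ in range(steps):
        v = mu * v - lr * grad(theta + mu * v)  # gradient at the "looked-ahead" point
        theta = theta + v
    return theta

theta0 = np.array([5.0, 5.0])
print(classical_momentum(theta0.copy()))  # both should approach the minimiser [0, 0]
print(nesterov(theta0.copy()))
```

The only difference between the two loops is where the gradient is evaluated: momentum uses the current iterate, while Nesterov evaluates it at the point the velocity is about to carry the iterate toward, which is the step the paper reinterprets within its regularised-update framework.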