Making gradient descent optimal for strongly convex stochastic optimization
Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic
optimization problems which arise in machine learning. For strongly convex problems, its …
Non-convex learning via stochastic gradient Langevin dynamics: a nonasymptotic analysis
M Raginsky, A Rakhlin… - Conference on Learning …, 2017 - proceedings.mlr.press
Stochastic Gradient Langevin Dynamics (SGLD) is a popular variant of Stochastic Gradient
Descent, where properly scaled isotropic Gaussian noise is added to an unbiased estimate …
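As a rough illustration of the mechanism the snippet describes (not code from the paper), a single SGLD update is an SGD step plus isotropic Gaussian noise whose scale depends on the step size and an inverse-temperature parameter; the function and parameter names below are my own:

```python
import numpy as np

def sgld_step(theta, grad_estimate, step_size, inverse_temp, rng):
    # SGLD update: a plain SGD step plus properly scaled isotropic
    # Gaussian noise added to the unbiased gradient estimate.
    noise = rng.normal(size=theta.shape)
    return (theta
            - step_size * grad_estimate
            + np.sqrt(2.0 * step_size / inverse_temp) * noise)

# Toy run: minimize f(x) = ||x||^2 / 2, whose exact gradient is x,
# so the gradient itself serves as the "unbiased estimate".
rng = np.random.default_rng(0)
theta = np.ones(3)
for _ in range(2000):
    theta = sgld_step(theta, theta, step_size=0.01, inverse_temp=1e4, rng=rng)
```

At a large inverse temperature the noise is small and the iterates concentrate near the minimizer; at smaller values the chain explores more broadly, which is the point of the variant in non-convex settings.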
Deep learning: a statistical viewpoint
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …
Deep convolutional neural networks for breast cancer histology image analysis
Breast cancer is one of the main causes of cancer death worldwide. Early diagnostics
significantly increases the chances of correct treatment and survival, but this process is tedious …
Online learning with predictable sequences
A Rakhlin, K Sridharan - Conference on Learning Theory, 2013 - proceedings.mlr.press
We present methods for online linear optimization that take advantage of benign (as opposed
to worst-case) sequences. Specifically if the sequence encountered by the learner is …
Automatic instrument segmentation in robot-assisted surgery using deep learning
Semantic segmentation of robotic instruments is an important problem for robot-assisted
surgery. One of the main challenges is to correctly detect an instrument's position for the …
Sequential complexities and uniform martingale laws of large numbers
We establish necessary and sufficient conditions for a uniform martingale Law of Large
Numbers. We extend the technique of symmetrization to the case of dependent random variables …
Size-independent sample complexity of neural networks
N Golowich, A Rakhlin… - Conference On Learning …, 2018 - proceedings.mlr.press
We study the sample complexity of learning neural networks, by providing new bounds on
their Rademacher complexity assuming norm constraints on the parameter matrix of each …
Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization
… Alexander Rakhlin Computer Science Division UC Berkeley rakhlin@cs.berkeley.edu …
Rakhlin, and A. Tewari. High-probability bounds for the regret of bandit online linear …
The statistical complexity of interactive decision making
A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …