Time-independent generalization bounds for SGLD in non-convex settings
T Farghly, P Rebeschini
Advances in Neural Information Processing Systems, 2021
Abstract
We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under the assumptions of dissipativity and smoothness, a setting that has received increased attention in the sampling/optimization literature. Unlike existing bounds for SGLD in non-convex settings, ours are time-independent and decay to zero as the sample size increases. Using the framework of uniform stability, we establish time-independent bounds by exploiting the Wasserstein contraction property of the Langevin diffusion, which also allows us to circumvent the need to bound gradients using Lipschitz-like assumptions. Our analysis also supports variants of SGLD that use different discretization methods, incorporate Euclidean projections, or use non-isotropic noise.
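The abstract does not spell out the update rule, so the following is only a minimal sketch of the standard constant-step-size SGLD iteration, x_{k+1} = x_k - eta * g_k + sqrt(2 * eta / beta) * xi_k, where g_k is a minibatch gradient and xi_k is standard Gaussian noise. The toy per-sample loss, the function names (`grad_loss`, `sgld`), and all parameter values are assumptions chosen for illustration and are not taken from the paper. The loss 0.5*||x - z||^2 + 2*sum(cos(x)) is used because it is smooth, non-convex near the origin, and grows quadratically, so it plausibly fits the dissipativity-plus-smoothness setting the abstract describes.

```python
import numpy as np


def grad_loss(x, z):
    """Gradient in x of the toy loss 0.5*||x - z||^2 + 2*sum(cos(x)).

    x has shape (d,), z has shape (b, d); the result has shape (b, d).
    This loss is an illustrative assumption, not the paper's objective.
    """
    return (x - z) - 2.0 * np.sin(x)


def sgld(data, eta=0.01, beta=10.0, batch_size=8, n_steps=1000, seed=0):
    """Run SGLD with a constant learning rate eta and inverse temperature beta."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    x = np.zeros(d)
    for _ in range(n_steps):
        # Sample a minibatch without replacement and average its gradients.
        idx = rng.choice(n, size=batch_size, replace=False)
        g = grad_loss(x, data[idx]).mean(axis=0)
        # Gradient step plus isotropic Gaussian noise scaled by sqrt(2*eta/beta).
        noise = rng.standard_normal(d)
        x = x - eta * g + np.sqrt(2.0 * eta / beta) * noise
    return x


if __name__ == "__main__":
    # Hypothetical synthetic data set of 64 points in 3 dimensions.
    data = np.random.default_rng(1).normal(size=(64, 3))
    print(sgld(data))
```

The step size stays fixed across iterations, matching the constant-learning-rate setting in the abstract. The projected and non-isotropic-noise variants mentioned there would change only the last line of the loop and are not shown here.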
proceedings.neurips.cc