Fast stochastic algorithms for low-rank and nonsmooth matrix problems

D Garber, A Kaplan - The 22nd International Conference on …, 2019 - proceedings.mlr.press
The 22nd International Conference on Artificial Intelligence …, 2019 - proceedings.mlr.press
Abstract
Composite convex optimization problems which include both a nonsmooth term and a low-rank promoting term have important applications in machine learning and signal processing, such as when one wishes to recover an unknown matrix that is simultaneously low-rank and sparse. However, such problems are highly challenging to solve at large scale: the low-rank promoting term prohibits efficient implementations of proximal methods for composite optimization and even of simple subgradient methods. On the other hand, methods which are tailored for low-rank optimization, such as conditional gradient-type methods, which are often applied to a smooth approximation of the nonsmooth objective, are slow since their runtime scales with both the large Lipschitz parameter of the smoothed gradient vector and with $1/\epsilon$, where $\epsilon$ is the target accuracy. In this paper we develop efficient algorithms for stochastic optimization of a strongly-convex objective which includes both a nonsmooth term and a low-rank promoting term. In particular, to the best of our knowledge, we present the first algorithm that enjoys all of the following critical properties for large-scale problems: i) (nearly) optimal sample complexity, ii) each iteration requires only a single low-rank SVD computation, and iii) the overall number of thin-SVD computations scales only with $\log(1/\epsilon)$ (as opposed to $\mathrm{poly}(1/\epsilon)$ in previous methods). We also give an algorithm for the closely-related finite-sum setting. We empirically demonstrate our results on the problem of recovering a simultaneously low-rank and sparse matrix.
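To make the computational contrast in the abstract concrete, the following is a minimal NumPy sketch of the three standard building blocks it alludes to. It is not the paper's algorithm. It assumes the sparsity term is an entrywise l1 norm and the low-rank term is a nuclear norm (or nuclear-norm ball), which the abstract does not specify, and all function names are hypothetical. The proximal step for the nuclear norm needs a full SVD, whereas the linear minimization step used by conditional gradient-type methods needs only the top singular pair, approximated here by plain power iteration.

```python
import numpy as np

def soft_threshold(X, tau):
    """Prox of tau * ||X||_1 (assumed sparsity term): entrywise shrinkage, no SVD."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Prox of tau * ||X||_* (assumed low-rank term): requires a full SVD of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def top_singular_pair(G, iters=200, seed=0):
    """Approximate leading singular triple of a nonzero matrix G by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(G.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = G.T @ (G @ v)          # one step of power iteration on G^T G
        v /= np.linalg.norm(v)
    u = G @ v
    sigma = np.linalg.norm(u)
    return u / sigma, sigma, v

def nuclear_ball_lmo(G, radius):
    """Minimizer of <G, X> over ||X||_* <= radius: the rank-one matrix -radius * u v^T."""
    u, _, v = top_singular_pair(G)
    return -radius * np.outer(u, v)
```

The rank-one oracle only multiplies by G and its transpose, which is why conditional gradient-type methods are attractive for low-rank problems; the number of power iterations and the fixed seed are illustrative choices only.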