Stochastic Variance-Reduced Prox-Linear Algorithms for Nonconvex Composite Optimization
- Junyu Zhang ,
- Lin Xiao
MSR-TR-2020-11 |
Published by Microsoft
We consider minimization of composite functions of the form \(f(g(x))+h(x)\), where \(f\) and \(h\) are convex functions (which can be nonsmooth) and \(g\) is a smooth vector mapping. In addition, we assume that \(g\) is the average of finite number of component mappings or the expectation over a family of random component mappings. We propose a class of stochastic variance-reduced prox-linear algorithms for solving such problems and bound their sample complexities for finding an \(\epsilon\)-stationary point in terms of the total number of evaluations of the component mappings and their Jacobians. When \(g\) is a finite average of \(N\) components, we obtain sample complexity \(O(N+ N^{4/5}\epsilon^{-1})\) for both mapping and Jacobian evaluations. When \(g\) is a general expectation, we obtain sample complexities of \(O(\epsilon^{-5/2})\) and \(O(\epsilon^{-3/2})\) for component mappings and their Jacobians respectively. If in addition \(f\) is smooth, then improved sample complexities of \(O(N+N^{1/2}\epsilon^{-1})\) and \(O(\epsilon^{-3/2})\) are derived for \(g\) being a finite average and a general expectation respectively, for both component mapping and Jacobian evaluations.