Stochastic Variance Reduction for DR-Submodular Maximization

Bibliographic Details
Published in: Algorithmica, Vol. 86, No. 5, pp. 1335–1364
Main Authors: Lian, Yuefang; Du, Donglei; Wang, Xiao; Xu, Dachuan; Zhou, Yang
Format: Journal Article
Language: English
Published: New York: Springer US (Springer Nature B.V.), 01.05.2024
Subjects:
ISSN: 0178-4617, 1432-0541
Summary: Stochastic optimization has grown significantly in recent decades, with variance reduction techniques increasingly used to improve the computational efficiency of stochastic optimization algorithms. In this paper, we introduce two projection-free stochastic approximation algorithms for maximizing diminishing-return (DR) submodular functions over convex constraints, building upon the Stochastic Path Integrated Differential EstimatoR (SPIDER) and its variants. First, for the monotone case, we present a SPIDER Continuous Greedy (SPIDER-CG) algorithm that guarantees a (1 − e⁻¹)OPT − ε approximation after O(ε⁻¹) iterations and O(ε⁻²) stochastic gradient computations under the mean-squared smoothness assumption. For the non-monotone case, we develop a SPIDER Frank–Wolfe (SPIDER-FW) algorithm that guarantees a ¼(1 − min_{x∈C} ‖x‖_∞)OPT − ε approximation with O(ε⁻¹) iterations and O(ε⁻²) stochastic gradient estimates. To address the practical challenge of requiring a large number of samples per iteration, we introduce a modified gradient estimator based on SPIDER, yielding Hybrid SPIDER-FW and Hybrid SPIDER-CG algorithms that achieve the same approximation guarantees as SPIDER-FW and SPIDER-CG, respectively, with only O(1) samples per iteration. Numerical experiments on both simulated and real data demonstrate the efficiency of the proposed methods.
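To make the abstract's ingredients concrete, here is a minimal, hypothetical sketch of a SPIDER-style gradient estimator combined with a continuous-greedy (Frank–Wolfe-type) update. The objective f(x) = Σᵢ wᵢ log(1 + xᵢ) over the box [0, 1]^d with a cardinality-style constraint Σxᵢ ≤ k, the noise model, and all batch sizes are illustrative assumptions for this toy demo, not the paper's algorithms or experimental setup; the key idea shown is the SPIDER recursion, which refreshes the estimate on a large batch every q steps and otherwise corrects it with a gradient difference on a small shared minibatch.

```python
import numpy as np

# Hypothetical toy instance (NOT from the paper): a monotone DR-submodular
# function f(x) = sum_i w_i * log(1 + x_i) over C = {x in [0,1]^d : sum(x) <= k}.
rng_data = np.random.default_rng(0)
d, k = 10, 3
w = rng_data.uniform(1.0, 2.0, size=d)

def grad_f(x):
    # Exact gradient of f; used here only to simulate a stochastic oracle.
    return w / (1.0 + x)

def stoch_grad(x, batch, seed):
    # Simulated stochastic gradient: exact gradient plus zero-mean noise
    # averaged over `batch` samples. The seed lets two calls reuse the
    # same minibatch, which the SPIDER difference term requires.
    local = np.random.default_rng(seed)
    noise = local.normal(0.0, 0.5, size=(batch, x.size)).mean(axis=0)
    return grad_f(x) + noise

def lmo(g):
    # Linear maximization oracle over C: put mass 1 on the k largest
    # positive coordinates of g, zero elsewhere.
    v = np.zeros_like(g)
    idx = np.argsort(g)[-k:]
    v[idx[g[idx] > 0]] = 1.0
    return v

def spider_cg(T=50, q=10, big_batch=64, small_batch=4, seed=1):
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    x_prev = x.copy()
    v_est = np.zeros(d)
    for t in range(T):
        if t % q == 0:
            # Periodic full refresh on a large batch.
            v_est = stoch_grad(x, big_batch, int(rng.integers(1 << 30)))
        else:
            # SPIDER recursion: correct the running estimate with a
            # gradient difference evaluated on the SAME small minibatch.
            s = int(rng.integers(1 << 30))
            v_est = v_est + stoch_grad(x, small_batch, s) - stoch_grad(x_prev, small_batch, s)
        x_prev = x.copy()
        x = x + lmo(v_est) / T  # continuous-greedy step of size 1/T
    return x

x_final = spider_cg()
```

Because each of the T steps moves by lmo(·)/T and the oracle returns a 0/1 vector with at most k ones, the iterate stays inside the constraint set without any projection, which is the point of the projection-free design.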
DOI:10.1007/s00453-023-01195-z