Stochastic privacy-preserving methods for nonconvex sparse learning

Bibliographic details
Published in: Information Sciences, Vol. 630, pp. 567–585
Main authors: Liang, Guannan; Tong, Qianqian; Ding, Jiahao; Pan, Miao; Bi, Jinbo
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.06.2023
ISSN: 0020-0255, 1872-6291
Online access: Full text
Description
Abstract: Sparse learning is essential in mining high-dimensional data. Iterative hard thresholding (IHT) methods are effective for optimizing nonconvex objectives in sparse learning. However, IHT methods are vulnerable to adversarial attacks that infer sensitive data. Although pioneering works attempted to relieve such vulnerability, they confront high computational costs on large-scale problems. We propose two differentially private stochastic IHT methods: one based on the stochastic gradient descent method (DP-SGD-HT) and the other based on the stochastically controlled stochastic gradient method (DP-SCSG-HT). The DP-SGD-HT method perturbs stochastic gradients with small Gaussian noise rather than full gradients, which are computationally expensive. As a result, the computational complexity is reduced from O(n log(n)) to O(b log(n)), where n is the sample size and b is the mini-batch size used to compute stochastic gradients. The DP-SCSG-HT method further perturbs stochastic gradients controlled by large-batch snapshot gradients to reduce the stochastic gradient variance. We prove that both algorithms guarantee differential privacy and have linear convergence rates up to an estimation bias. A utility analysis examines the relationship between the convergence rate and the level of perturbation, yielding the best-known utility bound for nonconvex sparse optimization. Extensive experiments show that our algorithms outperform existing methods.
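
The abstract describes the DP-SGD-HT iteration at a high level: clip and perturb a mini-batch gradient with Gaussian noise, take a gradient step, then hard-threshold to keep only the k largest coordinates. The following is a minimal Python sketch of that idea; the function names, the per-sample clipping threshold C, the noise calibration sigma·C/b, and the parameter choices are illustrative assumptions, not the paper's exact procedure or noise calibration.

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude coordinates of w; zero out the rest."""
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    out[idx] = w[idx]
    return out

def dp_sgd_ht(grad_fn, X, y, d, k, T=100, b=64, eta=0.1, C=1.0, sigma=1.0, seed=0):
    """Sketch of a differentially private stochastic IHT loop (DP-SGD-HT style).

    grad_fn(w, X_batch, y_batch) should return per-sample gradients of shape (b, d).
    The noise scale sigma is assumed to be calibrated offline to the desired
    (epsilon, delta) budget; the paper's calibration may differ.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    w = np.zeros(d)
    for _ in range(T):
        batch = rng.choice(n, size=b, replace=False)
        g = grad_fn(w, X[batch], y[batch])              # per-sample gradients, shape (b, d)
        norms = np.linalg.norm(g, axis=1, keepdims=True)
        g = g / np.maximum(1.0, norms / C)              # clip each per-sample gradient to norm C
        noisy = g.mean(axis=0) + rng.normal(0.0, sigma * C / b, size=d)  # Gaussian perturbation
        w = hard_threshold(w - eta * noisy, k)          # gradient step + hard thresholding
    return w

# Example use with least-squares per-sample gradients on hypothetical toy data:
# grad_fn = lambda w, Xb, yb: (Xb @ w - yb)[:, None] * Xb
```

The DP-SCSG-HT variant described in the abstract would additionally maintain a large-batch snapshot gradient and use it as a control variate to reduce the variance of the perturbed mini-batch gradients; that machinery is omitted from the sketch above.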
DOI: 10.1016/j.ins.2022.09.062