Stochastic privacy-preserving methods for nonconvex sparse learning
Sparse learning is essential in mining high-dimensional data. Iterative hard thresholding (IHT) methods are effective for optimizing nonconvex objectives for sparse learning. However, IHT methods are vulnerable to adversary attacks that infer sensitive data. Although pioneering works attempted to re...
Gespeichert in:
| Veröffentlicht in: | Information sciences Jg. 630; S. 567 - 585 |
|---|---|
| Hauptverfasser: | , , , , |
| Format: | Journal Article |
| Sprache: | Englisch |
| Veröffentlicht: |
Elsevier Inc
01.06.2023
|
| Schlagworte: | |
| ISSN: | 0020-0255, 1872-6291 |
| Online-Zugang: | Volltext |
| Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
| Zusammenfassung: | Sparse learning is essential in mining high-dimensional data. Iterative hard thresholding (IHT) methods are effective for optimizing nonconvex objectives for sparse learning. However, IHT methods are vulnerable to adversary attacks that infer sensitive data. Although pioneering works attempted to relieve such vulnerability, they confront the issue of high computational cost for large-scale problems. We propose two differentially private stochastic IHT: one based on the stochastic gradient descent method (DP-SGD-HT) and the other based on the stochastically controlled stochastic gradient method (DP-SCSG-HT). The DP-SGD-HT method perturbs stochastic gradients with small Gaussian noise rather than full gradients, which are computationally expensive. As a result, computational complexity is reduced from O(nlog(n)) to a lower O(blog(n)), where n is the sample size and b is the mini-batch size used to compute stochastic gradients. The DP-SCSG-HT method further perturbs the stochastic gradients controlled by large-batch snapshot gradients to reduce stochastic gradient variance. We prove that both algorithms guarantee differential privacy and have linear convergence rates with estimation bias. A utility analysis examines the relationship between convergence rate and the level of perturbation, yielding the best-known utility bound for nonconvex sparse optimization. Extensive experiments show that our algorithms outperform existing methods. |
|---|---|
| ISSN: | 0020-0255 1872-6291 |
| DOI: | 10.1016/j.ins.2022.09.062 |