A stochastic gradient method with variance control and variable learning rate for Deep Learning
| Published in: | Journal of Computational and Applied Mathematics, Vol. 451, p. 116083 |
|---|---|
| Main Authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 01.12.2024 |
| Subjects: | |
| ISSN: | 0377-0427 |
| Summary: | In this paper we study a stochastic gradient algorithm which governs the increase of the mini-batch size in a predefined fashion and automatically adjusts the learning rate by means of a monotone or non-monotone line search procedure. The mini-batch size is increased at a suitable a priori rate throughout the iterative process, so that the variance of the stochastic gradients is progressively reduced. The a priori rate is not subject to restrictive assumptions, allowing for the possibility of a slow increase in the mini-batch size. On the other hand, the learning rate can vary non-monotonically throughout the iterations, as long as it is appropriately bounded. Convergence results for the proposed method are provided for both convex and non-convex objective functions. Moreover, it can be proved that the algorithm enjoys a global linear rate of convergence on strongly convex functions. The low per-iteration cost, the limited memory requirements and the robustness with respect to the hyperparameter setting make the suggested approach well suited to implementation in deep learning frameworks, including on GPGPU-equipped architectures. Numerical results on training deep neural networks for multi-class image classification show promising behaviour of the proposed scheme compared with similar state-of-the-art competitors. |
|---|---|
| DOI: | 10.1016/j.cam.2024.116083 |
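The summary above describes the scheme at a high level: the mini-batch size grows along a predefined schedule so that gradient variance shrinks over the iterations, while a line search sets the learning rate at each step. The snippet below is a minimal NumPy sketch of that general idea on a synthetic least-squares problem; the geometric growth factor, the Armijo backtracking rule, and all constants and names are illustrative assumptions, not the exact algorithm or parameters of the paper.

```python
# Minimal sketch (assumptions, not the paper's algorithm): stochastic gradient
# descent with a predefined mini-batch growth schedule and an Armijo-type
# backtracking line search on the learning rate.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: f(w) = (1/2n) * ||A w - b||^2
n, d = 2000, 20
A = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
b = A @ w_true + 0.01 * rng.standard_normal(n)

def batch_loss_grad(w, idx):
    """Loss and gradient estimated on the mini-batch indexed by `idx`."""
    r = A[idx] @ w - b[idx]
    return 0.5 * np.mean(r ** 2), A[idx].T @ r / len(idx)

w = np.zeros(d)
batch_size = 8.0         # initial mini-batch size (assumed)
growth = 1.05            # slow a-priori growth factor per iteration (assumed)
lr = 1.0                 # initial trial learning rate (assumed)
c1, shrink = 1e-4, 0.5   # Armijo constant and backtracking factor (assumed)

for k in range(200):
    idx = rng.choice(n, size=min(int(batch_size), n), replace=False)
    loss, grad = batch_loss_grad(w, idx)

    # Monotone (Armijo) backtracking line search on the sampled objective.
    step = lr
    while True:
        new_loss, _ = batch_loss_grad(w - step * grad, idx)
        if new_loss <= loss - c1 * step * (grad @ grad) or step < 1e-8:
            break
        step *= shrink

    w -= step * grad
    lr = step / shrink        # let the next trial step grow again
    batch_size *= growth      # predefined increase to reduce gradient variance

print("final full-data loss:", 0.5 * np.mean((A @ w - b) ** 2))
```

Allowing the accepted step to seed a larger trial step at the next iteration is one simple way to let the learning rate vary non-monotonically across iterations, in the spirit of the behaviour described in the summary.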