Step Size Adaptation for Accelerated Stochastic Momentum Algorithm Using SDE Modeling and Lyapunov Drift Minimization

Detailed Description

Bibliographic Details
Published in: IEEE Transactions on Signal Processing, Vol. 73, pp. 3124-3139
Main Authors: Yuan, Yulan; Tsang, Danny H. K.; Lau, Vincent K. N.
Format: Journal Article
Language: English
Published: New York: IEEE, 2025 (The Institute of Electrical and Electronics Engineers, Inc.)
ISSN: 1053-587X, 1941-0476
Online Access: Full text
Description
Summary: Training machine learning models often involves solving high-dimensional stochastic optimization problems, where stochastic gradient-based algorithms are hindered by slow convergence. Although momentum-based methods perform well in deterministic settings, their effectiveness diminishes under gradient noise. In this paper, we introduce a novel accelerated stochastic momentum algorithm. Specifically, we first model the trajectory of discrete-time momentum-based algorithms using continuous-time stochastic differential equations (SDEs). By leveraging a tailored Lyapunov function, we derive 2-D adaptive step sizes through Lyapunov drift minimization, which significantly enhance both convergence speed and noise stability. The proposed algorithm not only accelerates convergence but also eliminates the need for hyperparameter fine-tuning, consistently achieving robust accuracy in machine learning tasks.
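For readers unfamiliar with the setup the abstract describes, the baseline it builds on is the stochastic heavy-ball (momentum) iteration, whose two step-size parameters form the "2-D" pair the paper adapts. The sketch below is a generic, hedged illustration of that baseline only: the function and variable names (`heavy_ball_step`, `alpha`, `beta`, `grad_fn`) are ours, and the fixed values used here stand in for the paper's Lyapunov-drift-minimizing adaptation rule, which is not reproduced.

```python
import numpy as np

def heavy_ball_step(x, v, grad_fn, alpha, beta):
    """One stochastic heavy-ball (momentum) update.

    alpha: gradient step size; beta: momentum coefficient.
    The paper adapts the pair (alpha, beta) at each iteration via
    Lyapunov drift minimization; here they are plain fixed inputs.
    """
    g = grad_fn(x)            # stochastic gradient estimate
    v = beta * v - alpha * g  # momentum (velocity) update
    x = x + v                 # parameter update
    return x, v

# Toy usage: minimize f(x) = ||x||^2 / 2 with additive gradient noise.
rng = np.random.default_rng(0)
grad_fn = lambda x: x + 0.01 * rng.standard_normal(x.shape)
x, v = np.ones(2), np.zeros(2)
for _ in range(200):
    x, v = heavy_ball_step(x, v, grad_fn, alpha=0.1, beta=0.9)
```

With fixed `(alpha, beta)` the iterate settles near the optimum up to a noise floor; the adaptive schedule derived in the paper is designed to improve both the convergence rate and this noise stability.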
DOI: 10.1109/TSP.2025.3592678