Deterministic convergence analysis for regularized long short-term memory and its application to regression and multi-classification problems


Bibliographic details
Published in: Engineering applications of artificial intelligence, Volume 133; p. 108444
Main authors: Kang, Qian; Yu, Dengxiu; Cheong, Kang Hao; Wang, Zhen
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.07.2024
ISSN: 0952-1976, 1873-6769
Description
Summary: Long short-term memory (LSTM) is a recurrent neural network (RNN) framework designed to solve the gradient disappearance and explosion problems of traditional RNNs. In recent years, LSTM has become a state-of-the-art model for solving various machine-learning problems. This paper proposes a novel regularized LSTM based on the batch gradient method. Specifically, the L2 regularization is appended to the objective function as a systematic external force, effectively controlling the excessive growth of weights in the network and preventing the overfitting phenomenon. In addition, a rigorous convergence analysis of the proposed method is carried out, i.e., monotonicity, weak convergence, and strong convergence results are obtained. Finally, comparative simulations are conducted on the benchmark data set for regression and classification problems, and the simulation results verify the effectiveness of the method.
DOI: 10.1016/j.engappai.2024.108444
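
Illustration: the summary describes adding an L2 penalty to the objective and minimizing it with the batch gradient method. The following is a minimal sketch of that general idea only, not the paper's algorithm. It assumes a plain linear model in place of the LSTM, and the names `objective`, `gradient`, `train`, the penalty weight `lam`, the step size `eta`, and the toy data are all hypothetical choices made for this example.

```python
# Illustrative sketch only: a linear model stands in for the LSTM, and
# lam, eta, and the toy data are hypothetical, not values from the paper.

def objective(w, data, lam):
    """Mean squared error over the whole batch plus the L2 penalty lam * ||w||^2."""
    mse = sum(
        (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2 for x, y in data
    ) / len(data)
    return mse + lam * sum(wi * wi for wi in w)


def gradient(w, data, lam):
    """Gradient of `objective` with respect to w, computed over the full batch."""
    grad = [2.0 * lam * wi for wi in w]  # contribution of the L2 penalty
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / len(data)
    return grad


def train(data, lam=0.01, eta=0.05, steps=200):
    """Run batch gradient descent, recording the objective after every update."""
    w = [0.0] * len(data[0][0])
    history = [objective(w, data, lam)]
    for _ in range(steps):
        g = gradient(w, data, lam)
        w = [wi - eta * gi for wi, gi in zip(w, g)]
        history.append(objective(w, data, lam))
    return w, history


# Toy regression data generated from y = 2*x1 - x2 (hypothetical).
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0)]
w, history = train(data)
```

Checking that `history` never increases from one step to the next mirrors, for this toy quadratic case only, the monotonicity property named in the summary. The penalty pulls the learned weights slightly toward zero relative to the unregularized fit.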