Residual Sarsa algorithm with function approximation

Bibliographic details
Published in: Cluster Computing, Vol. 22, Issue Suppl 1, pp. 795–807
Main authors: Qiming Fu, Wen Hu, Quan Liu, Heng Luo, Lingyao Hu, Jianping Chen
Format: Journal Article
Language: English
Published: New York: Springer US, 01.01.2019 (Springer Nature B.V.)
ISSN: 1386-7857, 1573-7543
Description
Summary: In this work, we propose an efficient algorithm, the residual Sarsa algorithm with function approximation (FARS), to improve the performance of the traditional Sarsa algorithm, and we use the gradient-descent method to update the function parameter vector. In the learning process, the Bellman residual method is adopted to guarantee the convergence of the algorithm, and a new rule for updating the action-value function parameter vectors is adopted to address unstable and slow convergence. To accelerate the convergence rate of the algorithm, we introduce a new factor, named the forgotten factor, which helps improve the robustness of the algorithm's performance. On two classical reinforcement learning benchmark problems, the experimental results show that the FARS algorithm performs better than other related reinforcement learning algorithms.
DOI: 10.1007/s10586-017-1303-8
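
The summary names three ingredients: gradient-descent updates of a function parameter vector, a Bellman-residual-based update rule, and a forgetting-style factor that speeds up and stabilizes convergence. The following is a minimal sketch of a residual-gradient Sarsa step under those assumptions, not the authors' exact FARS update; the feature map, the hyperparameters alpha, gamma, and beta, and the way beta enters the update are illustrative assumptions.

import numpy as np

def features(state, action, dim=8):
    # Hypothetical stand-in feature map: a fixed random projection per (state, action).
    rng = np.random.default_rng(hash((state, action)) % (2**32))
    return rng.standard_normal(dim)

def residual_sarsa_update(theta, z, s, a, r, s_next, a_next,
                          alpha=0.05, gamma=0.99, beta=0.9):
    # One gradient step on the squared Bellman residual of the on-policy
    # (Sarsa) target, with linear approximation Q(s, a) = theta . phi(s, a).
    phi = features(s, a)
    phi_next = features(s_next, a_next)
    delta = r + gamma * (theta @ phi_next) - (theta @ phi)   # Bellman residual
    # Residual-gradient direction: derivative of 0.5 * delta**2 w.r.t. theta,
    # which involves both the current and the successor features.
    grad = delta * (gamma * phi_next - phi)
    # beta plays the role of an assumed forgetting-style factor: it decays the
    # accumulated update direction so older transitions count for less.
    z = beta * z + grad
    theta = theta - alpha * z
    return theta, z, delta

# Toy usage on a single transition.
theta = np.zeros(8)
z = np.zeros(8)
theta, z, delta = residual_sarsa_update(theta, z, s=0, a=1, r=1.0, s_next=1, a_next=0)
print("Bellman residual after one update:", round(float(delta), 4))

Because the residual-gradient direction uses the successor features as well as the current ones, the step descends the squared Bellman residual itself, which is the property the abstract credits for guaranteeing convergence; the decayed accumulator z is only one plausible reading of how a forgetting factor could enter such an update.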