Student-t policy in reinforcement learning to acquire global optimum of robot control

This paper proposes an actor-critic algorithm with a policy parameterized by a Student-t distribution, named the Student-t policy, to enhance learning performance, mainly in terms of reachability of the global optimum for the tasks to be learned. The actor-critic algorithm is one of the policy-gradient methods in...

Bibliographic Details
Published in:	Applied intelligence (Dordrecht, Netherlands) Vol. 49; No. 12; pp. 4335-4347
Main Author:	Kobayashi, Taisuke
Format:	Journal Article
Language:	English
Published:	New York Springer US 1 December 2019 Springer Nature B.V
Subjects:	Algorithms; Artificial Intelligence; Computer Science; Computer simulation; Exploration; Machine learning; Machines; Manufacturing; Mechanical Engineering; Normal distribution; Parameterization; Processes; Robot control; Student-t distribution; Robot learning; Reinforcement learning
ISSN:	0924-669X, 1573-7497
Online Access:	Full text