Adaptive dynamic programming and deep reinforcement learning for the control of an unmanned surface vehicle: Experimental results

Bibliographic details
Published in: Control engineering practice, Vol. 111, p. 104807
Main authors: Gonzalez-Garcia, Alejandro; Barragan-Alcantar, David; Collado-Gonzalez, Ivana; Garrido, Leonardo
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 June 2021
ISSN: 0967-0661, 1873-6939
Description
Summary: This paper presents a low-level controller for an unmanned surface vehicle based on adaptive dynamic programming and deep reinforcement learning. This approach uses a single deep neural network capable of self-learning a policy and driving the surge speed and yaw dynamics of a vessel. A simulation of the vehicle's mathematical model was used to train the neural network with the model-based backpropagation through time algorithm, which is capable of dealing with continuous action spaces. The path-following control scenario is additionally addressed by combining the proposed low-level controller with a line-of-sight-based guidance law with time-varying look-ahead distance. Simulation and real-world experimental results are presented to validate the control capabilities of the proposed approach and to contribute to the diversity of validated applications of control strategies based on adaptive dynamic programming. Results show that the controller is capable of self-learning the policy to drive the surge speed and yaw dynamics, and that it performs better than a standard controller.

• Adaptive dynamic programming can control the speed and heading dynamics of a USV.
• Backpropagation through time can train a DNN using a simulation model of the vehicle.
• Real-world results achieve low error and control effort with set-point regulation.
• When combined with LOS-based guidance, accurate path-following is demonstrated.
DOI:10.1016/j.conengprac.2021.104807