Neuro-Control for Continuous-Time Stochastic Nonlinear Systems via Online Policy Iteration Algorithm

Bibliographic Details
Published in:	Chinese Control and Decision Conference, pp. 1499-1503
Main Authors:	Zhou, Tianmin; Hou, Jiaxu; Li, Handong; Di, Zengru; Zhao, Bo
Format:	Conference Proceeding
Language:	English
Published:	IEEE, 1 August 2020
Subjects:	Adaptive Dynamic Programming; Approximation algorithms; Artificial neural networks; Mathematical model; Nonlinear systems; Optimal control; Policy Iteration; Reinforcement Learning; Stochastic Nonlinear; Stochastic processes; Stochastic systems
ISSN:	1948-9447
Online Access:	Full text

Description
Summary:	This paper is concerned with neuro-control for continuous-time nonlinear systems subject to stochastic disturbance. Because of the stochastic disturbance, the traditional value function in the existing literature cannot handle stochastic control problems, since mixed second partial derivatives are employed to construct the modified value function of conditional expectation. To solve the Hamilton-Jacobi-Bellman equation, a novel online policy iteration algorithm with an Ito correction term is developed, in which a critic neural network is established to approximate the optimal value function. Thus, the online optimal control can be obtained in closed-loop form. The closed-loop system is guaranteed to be stable in probability via Lyapunov's direct method. Finally, a numerical example is provided to illustrate the effectiveness of the developed control method.
DOI:	10.1109/CCDC49329.2020.9164777
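
Illustration (not taken from the paper): the summary describes policy iteration on a Hamilton-Jacobi-Bellman equation that carries an Ito correction term. The sketch below shows that idea on a deliberately simple case, under assumptions made only for this example: a scalar linear system dx = (a*x + b*u) dt + c*x dw, a quadratic cost E[integral of q*x^2 + r*u^2 dt], a value function V(x) = p*x^2, and a linear policy u = -k*x. Here the term c^2*p*x^2 (that is, 0.5*c^2*x^2*V''(x)) plays the role of the Ito correction. The paper itself treats nonlinear systems and approximates the value function online with a critic neural network; none of the names or parameter values below come from it.

```python
def policy_iteration(a, b, c, q, r, k0, iters=20):
    """Policy iteration for the assumed scalar system dx = (a*x + b*u) dt + c*x dw.

    Assumes V(x) = p*x**2 and u = -k*x. Returns the list of (k, p) pairs,
    where p is the value coefficient of the policy with gain k.
    """
    k = k0
    history = []
    for _ in range(iters):
        # Closed-loop decay rate; the c**2 part comes from the Ito correction.
        rate = 2.0 * (b * k - a) - c**2
        if rate <= 0:
            raise ValueError("policy u = -k*x is not mean-square stabilizing")
        # Policy evaluation: q + r*k^2 + 2*p*(a - b*k) + c^2*p = 0, solved for p.
        p = (q + r * k**2) / rate
        history.append((k, p))
        # Policy improvement: minimizing r*u^2 + 2*p*x*b*u gives u = -(b*p/r)*x.
        k = b * p / r
    return history


def hjb_residual(p, a, b, c, q, r):
    """Residual of the scalar stochastic HJB equation for V(x) = p*x**2."""
    return q + (2.0 * a + c**2) * p - (b * p) ** 2 / r


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    params = dict(a=-1.0, b=1.0, c=0.5, q=1.0, r=1.0)
    hist = policy_iteration(k0=0.0, **params)
    k_final, p_final = hist[-1]
    print("gain:", k_final, "value coefficient:", p_final)
    print("HJB residual:", hjb_residual(p_final, **params))
```

The initial gain k0 must already stabilize the system in the mean-square sense, which is why the sketch raises an error when the decay rate is not positive. Setting c = 0 removes the correction term and recovers the deterministic scalar case.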