Improved off‐policy reinforcement learning algorithm for robust control of unmodeled nonlinear system with asymmetric state constraints



Published in: International Journal of Robust and Nonlinear Control, Vol. 33, No. 3, pp. 1607–1632
Main authors: Zhang, Yong; Mu, Chaoxu; Feng, Yanghe; Zhao, Zhijia
Format: Journal Article
Language: English
Published: Bognor Regis: Wiley Subscription Services, Inc., 01.02.2023
ISSN: 1049-8923, 1099-1239
Description
Summary: In this article, an improved data-based off-policy reinforcement learning algorithm is proposed for the robust control of unmodeled nonlinear systems with asymmetric state constraints. An improved nonlinear mapping is defined for the asymmetric state constraint problem, ensuring that the mapped state has a better response speed and amplitude than the original state. An auxiliary mapping error system is then constructed for the off-policy robust controller design. In addition, an innovative network dimensionality-reduction method based on principal component analysis is proposed to prune redundant activation functions of the action network in the off-policy algorithm, which effectively reduces the computational burden of processing data episodes. To handle uncertain data caused by disturbances, a dominant data sampling method is designed to extract the samples that are most beneficial to algorithm convergence. On this basis, the improved off-policy robust control algorithm is constructed. The effectiveness of the dominant data sampling method and of the improved off-policy robust control algorithm is verified by comparative simulation on an industrial manipulator system.
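The abstract refers to a nonlinear mapping that turns the asymmetric state constraint into an unconstrained design problem. The paper's exact mapping is not reproduced in this record; the sketch below shows a common barrier-style mapping from the constrained-control literature for a scalar state x confined to the asymmetric interval (-k_a, k_b). Both the function names and the specific formula are illustrative assumptions, not the authors' method.

```python
import numpy as np

def asym_map(x, k_a, k_b):
    """Barrier-style mapping for a state x constrained to (-k_a, k_b).

    Maps the constrained state into an unconstrained variable s:
    s -> -inf as x -> -k_a and s -> +inf as x -> +k_b, so a controller
    designed for the mapped system keeps x strictly inside its
    asymmetric bounds. Note asym_map(0) = 0, so the origin is preserved.
    """
    return np.log((k_b * (k_a + x)) / (k_a * (k_b - x)))

def asym_unmap(s, k_a, k_b):
    """Inverse mapping: recover the constrained state x from s."""
    e = np.exp(s)
    return k_a * k_b * (e - 1.0) / (k_b + k_a * e)
```

A controller synthesized for the mapped (unconstrained) state s is applied through `asym_unmap`, which by construction can never produce an x outside (-k_a, k_b).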
Bibliography: Funding information
National Key Research and Development Program of China, Grant/Award Number: 2021YFB1714700; National Natural Science Foundation of China, Grant/Award Number: 62022061; Natural Science Foundation of Tianjin City, Grant/Award Number: 20JCYBJC00880
DOI: 10.1002/rnc.6432