Improved off‐policy reinforcement learning algorithm for robust control of unmodeled nonlinear system with asymmetric state constraints

Bibliographic details
Published in: International Journal of Robust and Nonlinear Control, Vol. 33, No. 3, pp. 1607-1632
Main authors: Zhang, Yong; Mu, Chaoxu; Feng, Yanghe; Zhao, Zhijia
Format: Journal Article
Language: English
Publication details: Bognor Regis, Wiley Subscription Services, Inc, 1 February 2023
ISSN: 1049-8923, 1099-1239
Online access: Get full text
Abstract: In this article, an improved data‐based off‐policy reinforcement learning algorithm is proposed for the robust control of unmodeled nonlinear systems with asymmetric state constraints. An improved nonlinear mapping is defined for the asymmetric state constraint problem, which ensures that the mapped state has better response speed and amplitude than the original state. Then, an auxiliary mapping error system is constructed for the off‐policy robust controller design. In addition, a network dimensionality reduction method based on principal component analysis is proposed to remove redundant activation functions from the action network in the off‐policy algorithm, which reduces the computational burden of processing data episodes. To handle the uncertain data caused by disturbances, a dominant data sampling method is designed to extract samples that benefit algorithm convergence. On this basis, the improved off‐policy robust control algorithm is constructed. The effectiveness of the dominant data sampling method and of the improved off‐policy robust control algorithm is verified by comparative simulation on an industrial manipulator system.
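The record does not reproduce the article's improved nonlinear mapping, so the following is only a minimal sketch of the general idea behind such mappings: an asymmetric interval (-k_a, k_b) is mapped one-to-one onto the whole real line, so that a controller designed for the unconstrained mapped state keeps the original state inside its bounds. The logarithmic form used here, the function names, and the parameters `k_a` and `k_b` are illustrative assumptions, not the authors' definition.

```python
import math


def to_unconstrained(x, k_a, k_b):
    """Map x in (-k_a, k_b) to an unconstrained value s, with s = 0 at x = 0.

    Assumed illustrative form (not taken from the article):
        s = ln( k_b * (x + k_a) / (k_a * (k_b - x)) )
    """
    if k_a <= 0 or k_b <= 0:
        raise ValueError("k_a and k_b must be positive")
    if not (-k_a < x < k_b):
        raise ValueError("x must lie strictly inside (-k_a, k_b)")
    return math.log(k_b * (x + k_a) / (k_a * (k_b - x)))


def to_constrained(s, k_a, k_b):
    """Inverse of to_unconstrained: recover x from the mapped value s."""
    e = math.exp(s)
    return k_a * k_b * (e - 1.0) / (k_b + k_a * e)
```

With hypothetical bounds such as `k_a = 1.0` and `k_b = 3.0`, the origin is preserved by this mapping, and any moderate mapped value converts back to a state strictly inside the asymmetric bounds.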
Authors:
Zhang, Yong (Tianjin University; ORCID 0000-0003-4385-712X)
Mu, Chaoxu (Tianjin University; email cxmu@tju.edu.cn)
Feng, Yanghe (National University of Defense Technology)
Zhao, Zhijia (Guangzhou University; ORCID 0000-0001-5893-0233)
ContentType Journal Article
Copyright 2022 John Wiley & Sons Ltd.
2023 John Wiley & Sons, Ltd.
DOI 10.1002/rnc.6432
Discipline Engineering
EISSN 1099-1239
EndPage 1632
ExternalDocumentID 10_1002_rnc_6432
RNC6432
Genre article
GrantInformation National Natural Science Foundation of China, grant 62022061
Natural Science Foundation of Tianjin City, grant 20JCYBJC00880
National Key Research and Development Program of China, grant 2021YFB1714700
ISICitedReferencesCount 3
IsPeerReviewed true
IsScholarly true
Issue 3
Language English
Notes Funding information
National Key Research and Development Program of China, Grant/Award Number: 2021YFB1714700; National Natural Science Foundation of China, Grant/Award Number: 62022061; Natural Science Foundation of Tianjin City, Grant/Award Number: 20JCYBJC00880
PublicationDate February 2023 (2023-02-01)
PublicationPlace Bognor Regis
PublicationTitle International journal of robust and nonlinear control
PublicationYear 2023
Publisher Wiley Subscription Services, Inc
StartPage 1607
SubjectTerms Algorithms
asymmetric state constraints
Asymmetry
Control algorithms
Control systems design
Control theory
Data sampling
Machine learning
Mapping
neural networks
Nonlinear control
Nonlinear systems
principal component analysis
Principal components analysis
reinforcement learning
Robust control
Sampling methods
Title Improved off‐policy reinforcement learning algorithm for robust control of unmodeled nonlinear system with asymmetric state constraints
URI https://onlinelibrary.wiley.com/doi/abs/10.1002%2Frnc.6432
https://www.proquest.com/docview/2766316522
Volume 33