Improved off‐policy reinforcement learning algorithm for robust control of unmodeled nonlinear system with asymmetric state constraints
In this article, an improved data‐based off‐policy reinforcement learning algorithm is proposed for the robust control of unmodeled nonlinear systems with asymmetric state constraints. An improved nonlinear mapping is defined for the asymmetric state constraint problem, which can ensure that the map...
| Published in: | International journal of robust and nonlinear control, Vol. 33, No. 3, pp. 1607-1632 |
|---|---|
| Main authors: | Zhang, Yong; Mu, Chaoxu; Feng, Yanghe; Zhao, Zhijia |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Bognor Regis: Wiley Subscription Services, Inc, 01.02.2023 |
| Subjects: | |
| ISSN: | 1049-8923, 1099-1239 |
| Abstract | In this article, an improved data‐based off‐policy reinforcement learning algorithm is proposed for the robust control of unmodeled nonlinear systems with asymmetric state constraints. An improved nonlinear mapping is defined for the asymmetric state constraint problem, which can ensure that the mapping state has better response speed and amplitude than the original state. Then, an auxiliary mapping error system is constructed for the off‐policy robust controller design. At the same time, an innovative network dimensionality reduction method based on principal component analysis is proposed to simplify the useless activation function of action network in off‐policy algorithm, which can effectively reduce the computational burden of data episodes. Considering the uncertain data caused by disturbances, a dominant data sampling method is designed to extract samples that are beneficial to algorithm convergence. On this basis, the improved off‐policy robust control algorithm is constructed. Based on an industrial manipulator system, the effectiveness of the dominant data sampling method and the improved off‐policy robust control algorithm is verified by comparative simulation. |
|---|---|
| Author | Mu, Chaoxu; Zhao, Zhijia; Feng, Yanghe; Zhang, Yong |
| Author_xml | 1. Zhang, Yong (Tianjin University; ORCID 0000-0003-4385-712X); 2. Mu, Chaoxu (Tianjin University; cxmu@tju.edu.cn); 3. Feng, Yanghe (National University of Defense Technology); 4. Zhao, Zhijia (Guangzhou University; ORCID 0000-0001-5893-0233) |
| ContentType | Journal Article |
| Copyright | 2022 John Wiley & Sons Ltd. 2023 John Wiley & Sons, Ltd. |
| DOI | 10.1002/rnc.6432 |
| Discipline | Engineering |
| EISSN | 1099-1239 |
| EndPage | 1632 |
| Genre | article |
| GrantInformation_xml | National Natural Science Foundation of China (62022061); Natural Science Foundation of Tianjin City (20JCYBJC00880); National Key Research and Development Program of China (2021YFB1714700) |
| ISICitedReferencesCount | 3 |
| ISSN | 1049-8923 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Language | English |
| Notes | Funding information: National Key Research and Development Program of China, Grant/Award Number: 2021YFB1714700; National Natural Science Foundation of China, Grant/Award Number: 62022061; Natural Science Foundation of Tianjin City, Grant/Award Number: 20JCYBJC00880 |
| ORCID | 0000-0001-5893-0233 0000-0003-4385-712X |
| PageCount | 26 |
| PublicationCentury | 2000 |
| PublicationDate | February 2023 2023-02-00 20230201 |
| PublicationDateYYYYMMDD | 2023-02-01 |
| PublicationDecade | 2020 |
| PublicationPlace | Bognor Regis |
| PublicationTitle | International journal of robust and nonlinear control |
| PublicationYear | 2023 |
| Publisher | Wiley Subscription Services, Inc |
| StartPage | 1607 |
| SubjectTerms | Algorithms; asymmetric state constraints; Asymmetry; Control algorithms; Control systems design; Control theory; Data sampling; Machine learning; Mapping; neural networks; Nonlinear control; Nonlinear systems; principal component analysis; Principal components analysis; reinforcement learning; Robust control; Sampling methods |
| Title | Improved off‐policy reinforcement learning algorithm for robust control of unmodeled nonlinear system with asymmetric state constraints |
| URI | https://onlinelibrary.wiley.com/doi/abs/10.1002%2Frnc.6432 https://www.proquest.com/docview/2766316522 |
| Volume | 33 |