Adaptive Q-Learning Based Model-Free H∞ Control of Continuous-Time Nonlinear Systems: Theory and Application

Bibliographic details
Published in: IEEE Transactions on Emerging Topics in Computational Intelligence, Vol. 9, No. 2, pp. 1143-1152
Main authors: Zhao, Jun; Lv, Yongfeng; Wang, Zhangu; Zhao, Ziliang
Format: Journal Article
Language: English
Publication details: IEEE, 2025-04-01
ISSN: 2471-285X
Abstract Although model-based $H_{\infty}$ control schemes for nonlinear continuous-time (CT) systems with unknown dynamics have been studied extensively, model-free $H_{\infty}$ control of nonlinear CT systems via Q-learning remains a challenging problem. This paper develops a novel Q-learning based model-free $H_{\infty}$ control scheme for nonlinear CT systems in which the adaptive critic and actor update each other continuously and simultaneously, eliminating the need for iterative steps. As a result, a hybrid structure is avoided and an initial stabilizing control policy is no longer required. To obtain the $H_{\infty}$ control of the CT nonlinear system, a Q-learning strategy is introduced to solve the $H_{\infty}$ control problem online in a non-iterative manner, without requiring the system dynamics. In addition, a new learning law based on a sliding mode scheme is developed to update the critic neural network (NN) weights online. Owing to the strong convergence of the critic NN weights, the actor NN used in most $H_{\infty}$ control algorithms is removed. Finally, numerical simulations and experimental results for an adaptive cruise control (ACC) system on a real vehicle demonstrate the feasibility of the presented control method and learning algorithm.
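Note For orientation, the $H_{\infty}$ control problem named in the abstract is commonly posed as sketched below. This is the standard textbook formulation, not notation taken from the paper: the symbols $f$, $g$, $k$, $Q$, $R$, $\gamma$ and $V$ are assumptions introduced here for illustration only.

```latex
% Assumed (hypothetical) system: state x, control u, disturbance d
\dot{x} = f(x) + g(x)\,u + k(x)\,d

% L2-gain (disturbance attenuation) condition with level \gamma > 0, for x(0) = 0
\int_0^{\infty} \left( x^{\top} Q x + u^{\top} R u \right) dt
  \le \gamma^{2} \int_0^{\infty} d^{\top} d \, dt

% Value function of the associated zero-sum game
V(x(t)) = \int_t^{\infty} \left( x^{\top} Q x + u^{\top} R u - \gamma^{2} d^{\top} d \right) d\tau

% Stationarity of the Hamiltonian gives the saddle-point policies
u^{*} = -\tfrac{1}{2} R^{-1} g^{\top}(x)\, \nabla V, \qquad
d^{*} = \tfrac{1}{2\gamma^{2}} k^{\top}(x)\, \nabla V

% Substituting u^*, d^* yields the Hamilton-Jacobi-Isaacs (HJI) equation
0 = x^{\top} Q x + \nabla V^{\top} f(x)
  - \tfrac{1}{4} \nabla V^{\top} g(x) R^{-1} g^{\top}(x) \nabla V
  + \tfrac{1}{4\gamma^{2}} \nabla V^{\top} k(x) k^{\top}(x) \nabla V
```

Solving the HJI equation directly requires $f$, $g$ and $k$; a model-free scheme of the kind summarized in the abstract instead approximates the solution online without using the system dynamics.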
Author Zhao, Jun
Zhao, Ziliang
Lv, Yongfeng
Wang, Zhangu
Author_xml – sequence: 1
  givenname: Jun
  orcidid: 0000-0003-2908-2583
  surname: Zhao
  fullname: Zhao, Jun
  email: skdzhaojun@sdust.edu.cn
  organization: College of Transportation, Shandong Key Laboratory of Hydrogen Electric Hybrid Power System Control and Safety, Shandong University of Science and Technology, Qingdao, China
– sequence: 2
  givenname: Yongfeng
  orcidid: 0000-0002-9139-7220
  surname: Lv
  fullname: Lv, Yongfeng
  email: lvyilian1989@foxmail.com
  organization: College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan, China
– sequence: 3
  givenname: Zhangu
  orcidid: 0000-0001-5251-1580
  surname: Wang
  fullname: Wang, Zhangu
  email: wangzhangu1@163.com
  organization: College of Transportation, Shandong Key Laboratory of Hydrogen Electric Hybrid Power System Control and Safety, Shandong University of Science and Technology, Qingdao, China
– sequence: 4
  givenname: Ziliang
  orcidid: 0000-0002-1629-677X
  surname: Zhao
  fullname: Zhao, Ziliang
  email: zhaoziliang1@sdust.edu.cn
  organization: College of Transportation, Shandong Key Laboratory of Hydrogen Electric Hybrid Power System Control and Safety, Shandong University of Science and Technology, Qingdao, China
CODEN ITETCU
CitedBy_id crossref_primary_10_1088_1402_4896_ad98cc
crossref_primary_10_1109_TASE_2025_3603734
crossref_primary_10_1038_s41598_025_97930_3
crossref_primary_10_1109_JPHOT_2024_3471827
ContentType Journal Article
DBID 97E
RIA
RIE
AAYXX
CITATION
DOI 10.1109/TETCI.2024.3449870
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Xplore
CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISSN 2471-285X
EndPage 1152
ExternalDocumentID 10_1109_TETCI_2024_3449870
10663220
Genre orig-research
GrantInformation_xml – fundername: Anhui Province Key Laboratory of Advanced Numerical Control Servo Technology
  grantid: XJSK202301
– fundername: Natural Science Foundation of Shandong Provincial
  grantid: ZR2022QF011
– fundername: Development Plan for Youth Innovation Teams in Higher Education Institutions in Shandong Province
  grantid: 2023KJ094
– fundername: National Natural Science Foundation of China
  grantid: 62203279; 62103296
  funderid: 10.13039/501100001809
IEDL.DBID RIE
ISICitedReferencesCount 5
ISSN 2471-285X
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
ORCID 0000-0002-1629-677X
0000-0002-9139-7220
0000-0003-2908-2583
0000-0001-5251-1580
PageCount 10
ParticipantIDs ieee_primary_10663220
crossref_citationtrail_10_1109_TETCI_2024_3449870
crossref_primary_10_1109_TETCI_2024_3449870
PublicationCentury 2000
PublicationDate 2025-04-01
PublicationDateYYYYMMDD 2025-04-01
PublicationDate_xml – month: 04
  year: 2025
  text: 2025-04-01
  day: 01
PublicationDecade 2020
PublicationTitle IEEE transactions on emerging topics in computational intelligence
PublicationTitleAbbrev TETCI
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0002951354
Score 2.3264422
SourceID crossref
ieee
SourceType Enrichment Source
Index Database
Publisher
StartPage 1143
SubjectTerms $H_{\infty}$ control
Artificial neural networks
Control systems
Cost function
Heuristic algorithms
learning law
Nonlinear systems
Q-learning
Reinforcement learning
System dynamics
Title Adaptive Q-Learning Based Model-Free H∞ Control of Continuous-Time Nonlinear Systems: Theory and Application
URI https://ieeexplore.ieee.org/document/10663220
Volume 9
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 2471-285X
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0002951354
  issn: 2471-285X
  databaseCode: RIE
  dateStart: 20170101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE