Deep reinforcement learning with shallow controllers: An experimental application to PID tuning

Deep reinforcement learning (RL) is an optimization-driven framework for producing control strategies for general dynamical systems without explicit reliance on process models. Good results have been reported in simulation. Here we demonstrate the challenges in implementing a state-of-the-art deep RL algorithm on a real physical system.

Published in: Control engineering practice, Volume 121, p. 105046
Main authors: Lawrence, Nathan P., Forbes, Michael G., Loewen, Philip D., McClement, Daniel G., Backström, Johan U., Gopaluni, R. Bhushan
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.04.2022
Subjects: Deep learning; PID control; Process systems engineering; Process control; Reinforcement learning
ISSN: 0967-0661, 1873-6939
Online access: Get full text
Abstract Deep reinforcement learning (RL) is an optimization-driven framework for producing control strategies for general dynamical systems without explicit reliance on process models. Good results have been reported in simulation. Here we demonstrate the challenges in implementing a state-of-the-art deep RL algorithm on a real physical system. Aspects include the interplay between software and existing hardware; experiment design and sample efficiency; training subject to input constraints; and interpretability of the algorithm and control law. At the core of our approach is the use of a PID controller as the trainable RL policy. In addition to its simplicity, this approach has several appealing features: no additional hardware needs to be added to the control system, since a PID controller can easily be implemented through a standard programmable logic controller; the control law can easily be initialized in a “safe” region of the parameter space; and the final product—a well-tuned PID controller—has a form that practitioners can reason about and deploy with confidence. [Display omitted]
• Reinforcement learning (RL) is used to tune a real-world PID controller.
• The RL policy is a PID controller, for compatibility with many current systems.
• Good tuning is achieved in roughly 40 min of training time.
• Full implementation details and thorough lab results are presented.
• A multi-criterion scorecard compares RL with several known auto-tuning methods.
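The abstract's central design choice, making the PID control law itself the trainable RL policy, is easy to illustrate. In the paper the gains are learned model-free against a real physical process; the minimal sketch below instead swaps in direct gradient descent through a toy differentiable plant, which drops the deep-RL machinery but keeps the key point: the only trainable parameters are the familiar gains kp, ki, kd, so the policy can be initialized in a safe region and the final result read off as an ordinary PID tuning. The plant model, initial gains, and all numeric values are illustrative assumptions, not taken from the paper.

```python
import torch

class PIDPolicy(torch.nn.Module):
    """PID control law whose gains are the only trainable parameters."""

    def __init__(self, kp=1.0, ki=0.1, kd=0.0):
        super().__init__()
        # Start in a known-stable ("safe") region of gain space, as the
        # abstract suggests; these particular values are assumptions.
        self.kp = torch.nn.Parameter(torch.tensor(float(kp)))
        self.ki = torch.nn.Parameter(torch.tensor(float(ki)))
        self.kd = torch.nn.Parameter(torch.tensor(float(kd)))

    def forward(self, err, err_int, err_deriv):
        u = self.kp * err + self.ki * err_int + self.kd * err_deriv
        # Saturation mimics training subject to input constraints.
        return torch.clamp(u, -2.0, 2.0)

def episode_cost(policy, setpoint=1.0, steps=60, dt=0.1):
    """Roll out a toy first-order plant y' = -y + u; return summed squared error."""
    y = torch.tensor(0.0)
    err_int = torch.tensor(0.0)
    prev_err = torch.tensor(setpoint)  # makes the derivative term start at 0
    cost = torch.tensor(0.0)
    for _ in range(steps):
        err = setpoint - y
        err_int = err_int + err * dt
        u = policy(err, err_int, (err - prev_err) / dt)
        y = y + dt * (-y + u)   # explicit Euler step of the assumed plant
        cost = cost + dt * err ** 2
        prev_err = err
    return cost

policy = PIDPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=5e-2)
for _ in range(300):
    opt.zero_grad()
    loss = episode_cost(policy)
    loss.backward()             # gradients flow through the whole rollout
    opt.step()

# The trained policy is just a tuned PID controller, readable at a glance.
print({name: round(p.item(), 3) for name, p in policy.named_parameters()})
```

On the real apparatus no plant gradient is available, so the paper must instead estimate the update direction from measured closed-loop data; the interpretability payoff is the same either way, since the deployed artifact is three gains that fit a standard programmable logic controller.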
ArticleNumber 105046
Author Gopaluni, R. Bhushan
Lawrence, Nathan P.
McClement, Daniel G.
Loewen, Philip D.
Backström, Johan U.
Forbes, Michael G.
Author_xml – sequence: 1
  givenname: Nathan P.
  orcidid: 0000-0002-7147-0048
  surname: Lawrence
  fullname: Lawrence, Nathan P.
  email: lawrence@math.ubc.ca
  organization: Department of Mathematics, University of British Columbia, Vancouver BC, Canada
– sequence: 2
  givenname: Michael G.
  orcidid: 0000-0001-6233-2172
  surname: Forbes
  fullname: Forbes, Michael G.
  organization: Honeywell Process Solutions, North Vancouver, BC, Canada
– sequence: 3
  givenname: Philip D.
  orcidid: 0000-0001-9761-6239
  surname: Loewen
  fullname: Loewen, Philip D.
  organization: Department of Mathematics, University of British Columbia, Vancouver BC, Canada
– sequence: 4
  givenname: Daniel G.
  orcidid: 0000-0002-2401-9019
  surname: McClement
  fullname: McClement, Daniel G.
  organization: Department of Chemical and Biological Engineering, University of British Columbia, Vancouver, BC, Canada
– sequence: 5
  givenname: Johan U.
  surname: Backström
  fullname: Backström, Johan U.
  organization: Backstrom Systems Engineering Ltd., Canada
– sequence: 6
  givenname: R. Bhushan
  orcidid: 0000-0002-4321-0468
  surname: Gopaluni
  fullname: Gopaluni, R. Bhushan
  email: bhushan.gopaluni@ubc.ca
  organization: Department of Chemical and Biological Engineering, University of British Columbia, Vancouver, BC, Canada
BookMark eNqNkF9LwzAUxYNMcJt-h3yBzqRp09YHYW7-GQz0QZ9DTG-2jJiWJDr99qZOEHzRpwsHzu-ecyZo5DoHCGFKZpRQfr6bqSS4Te-lmuUkp0kuScGP0JjWFct4w5oRGpOGVxnhnJ6gSQg7kqxNQ8dILAF67ME43XkFL-AitiC9M26D9yZucdhKa7s9Tm-i76wFHy7w3GF478GbwSAtln1vjZLRdA7HDj-slji-DoxTdKylDXD2fafo6eb6cXGXre9vV4v5OlOM1jEroCgL3ZY5tLRWNVW0aIkGrglhNYNWMq0JI8-KtRVhwPOyUtAynUMlmyp1nKLLA1f5LgQPWigTv_JEL40VlIhhLrETP3OJYS5xmCsB6l-APrWT_uM_1quDFVLBNwNeBGXApYDGg4qi7czfkE_ETY_y
CitedBy_id crossref_primary_10_1002_asjc_3572
crossref_primary_10_1016_j_conengprac_2025_106522
crossref_primary_10_1016_j_dche_2023_100131
crossref_primary_10_1016_j_ifacol_2025_08_085
crossref_primary_10_1016_j_conengprac_2025_106562
crossref_primary_10_1007_s00521_022_07710_7
crossref_primary_10_1080_00207721_2025_2469821
crossref_primary_10_1016_j_compchemeng_2025_109392
crossref_primary_10_1007_s10489_024_05720_7
crossref_primary_10_1002_cjce_24508
crossref_primary_10_1016_j_cherd_2023_05_041
crossref_primary_10_1016_j_ifacol_2025_07_150
crossref_primary_10_2478_pjct_2023_0016
crossref_primary_10_1016_j_compchemeng_2022_107760
crossref_primary_10_3390_pr11010123
crossref_primary_10_1016_j_jmapro_2024_09_019
crossref_primary_10_1007_s13369_024_08962_2
crossref_primary_10_1016_j_conengprac_2023_105480
crossref_primary_10_1016_j_oceaneng_2023_114621
crossref_primary_10_1109_ACCESS_2023_3334148
crossref_primary_10_1016_j_compchemeng_2025_109262
crossref_primary_10_1134_S1054661823030021
crossref_primary_10_1016_j_jprocont_2022_08_002
crossref_primary_10_1016_j_compchemeng_2024_108783
crossref_primary_10_3390_app142210585
crossref_primary_10_1016_j_apenergy_2023_122310
crossref_primary_10_1109_TSM_2022_3225480
crossref_primary_10_1016_j_jmapro_2024_01_085
crossref_primary_10_3390_app13042674
crossref_primary_10_1007_s00521_023_09112_9
crossref_primary_10_1016_j_sysconle_2024_105714
crossref_primary_10_1016_j_conengprac_2024_106072
crossref_primary_10_1016_j_compchemeng_2024_108826
crossref_primary_10_1360_SSI_2025_0108
crossref_primary_10_1016_j_ifacol_2023_10_598
crossref_primary_10_3390_s22103753
crossref_primary_10_1016_j_conengprac_2022_105294
crossref_primary_10_1109_TASE_2025_3562398
crossref_primary_10_1016_j_conengprac_2025_106342
crossref_primary_10_1016_j_automatica_2024_111642
crossref_primary_10_1016_j_conengprac_2023_105610
crossref_primary_10_3390_pr13061791
crossref_primary_10_1109_ACCESS_2023_3315118
crossref_primary_10_1007_s12555_024_0990_1
crossref_primary_10_1109_ACCESS_2024_3440580
crossref_primary_10_1016_j_compchemeng_2023_108511
crossref_primary_10_1007_s10845_025_02651_z
crossref_primary_10_1016_j_ceja_2025_100753
crossref_primary_10_1016_j_engappai_2025_111616
crossref_primary_10_3390_app12115716
crossref_primary_10_1109_TVT_2024_3443123
crossref_primary_10_1016_j_conengprac_2023_105462
crossref_primary_10_1109_TSM_2023_3266220
crossref_primary_10_1177_01423312241229492
crossref_primary_10_1016_j_compchemeng_2025_109363
crossref_primary_10_1016_j_cherd_2023_07_049
crossref_primary_10_1016_j_conengprac_2024_105841
crossref_primary_10_1109_TASE_2025_3600504
crossref_primary_10_1002_asjc_3060
crossref_primary_10_1016_j_compchemeng_2023_108393
crossref_primary_10_1016_j_ress_2024_110639
crossref_primary_10_1016_j_compchemeng_2023_108386
crossref_primary_10_1016_j_jfranklin_2022_06_019
crossref_primary_10_3390_electronics13245039
crossref_primary_10_1002_aic_18245
crossref_primary_10_1007_s13369_024_09797_7
crossref_primary_10_1051_jnwpu_20244250912
crossref_primary_10_1016_j_ifacol_2023_10_923
crossref_primary_10_1016_j_compchemeng_2024_108723
Cites_doi 10.1016/j.compchemeng.2017.10.008
10.1016/j.jprocont.2007.10.016
10.1002/rnc.822
10.1016/j.jprocont.2018.01.010
10.1016/j.compchemeng.2019.106649
10.1016/j.ifacol.2019.09.173
10.1016/j.jprocont.2021.06.004
10.1021/acs.iecr.0c05678
10.1016/j.compchemeng.2020.106886
10.1016/j.asoc.2009.10.018
10.1016/0005-1098(84)90014-1
10.1016/0098-1354(92)80045-B
10.1016/j.jprocont.2018.07.013
10.1021/acs.iecr.9b00704
10.1002/aic.17306
10.1016/j.eswa.2017.03.002
10.1016/j.conengprac.2018.01.006
10.1016/j.jprocont.2018.11.004
10.1016/j.ifacol.2015.09.022
10.1016/j.compchemeng.2020.107133
10.1016/j.jprocont.2020.02.003
10.1002/aic.16689
10.1016/j.ifacol.2021.08.321
10.1016/S0959-1524(02)00062-8
10.1016/j.ifacol.2020.12.129
10.1016/j.ifacol.2018.09.241
10.1038/nature14236
10.1126/scirobotics.abc5986
10.3182/20130703-3-FR-4038.00129
10.1016/j.asoc.2014.06.037
10.1016/j.compchemeng.2019.05.029
ContentType Journal Article
Copyright 2021 Elsevier Ltd
Copyright_xml – notice: 2021 Elsevier Ltd
DBID AAYXX
CITATION
DOI 10.1016/j.conengprac.2021.105046
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 1873-6939
ExternalDocumentID 10_1016_j_conengprac_2021_105046
S0967066121002963
GroupedDBID --K
--M
.~1
0R~
1B1
1~.
1~5
29F
4.4
457
4G.
5GY
5VS
6J9
6TJ
7-5
71M
8P~
9JN
AABNK
AACTN
AAEDT
AAEDW
AAIAV
AAIKJ
AAKOC
AALRI
AAOAW
AAQFI
AAQXK
AAXUO
ABFNM
ABFRF
ABJNI
ABMAC
ABTAH
ABXDB
ABYKQ
ACDAQ
ACGFO
ACGFS
ACNNM
ACRLP
ADBBV
ADEZE
ADMUD
ADTZH
AEBSH
AECPX
AEFWE
AEKER
AENEX
AFKWA
AFTJW
AGHFR
AGUBO
AGYEJ
AHHHB
AHJVU
AIEXJ
AIKHN
AITUG
AJBFU
AJOXV
ALMA_UNASSIGNED_HOLDINGS
AMFUW
AMRAJ
ASPBG
AVWKF
AXJTR
AZFZN
BJAXD
BKOJK
BLXMC
CS3
DU5
EBS
EFJIC
EFLBG
EJD
EO8
EO9
EP2
EP3
F5P
FDB
FEDTE
FGOYB
FIRID
FNPLU
FYGXN
G-2
G-Q
GBLVA
HVGLF
HZ~
IHE
J1W
JJJVA
KOM
LY7
M41
MO0
N9A
O-L
O9-
OAUVE
OZT
P-8
P-9
P2P
PC.
Q38
R2-
RIG
ROL
RPZ
SDF
SDG
SES
SET
SEW
SPC
SPCBC
SST
SSZ
T5K
UNMZH
WUQ
XFK
XPP
ZMT
ZY4
~G-
9DU
AATTM
AAXKI
AAYWO
AAYXX
ABWVN
ACLOT
ACRPL
ACVFH
ADCNI
ADNMO
AEIPS
AEUPX
AFJKZ
AFPUW
AGQPQ
AIGII
AIIUN
AKBMS
AKRWK
AKYEP
ANKPU
APXCP
CITATION
EFKBS
~HD
ID FETCH-LOGICAL-c318t-4e454fd52ed18c81c14d0fe6f00383eda3ff030bc3d703e6257ced3f2e7a97693
ISICitedReferencesCount 78
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000819660400010&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0967-0661
IngestDate Sat Nov 29 07:06:24 EST 2025
Tue Nov 18 22:43:39 EST 2025
Fri Feb 23 02:40:08 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Deep learning
PID control
Process systems engineering
Process control
Reinforcement learning
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c318t-4e454fd52ed18c81c14d0fe6f00383eda3ff030bc3d703e6257ced3f2e7a97693
ORCID 0000-0001-9761-6239
0000-0002-4321-0468
0000-0001-6233-2172
0000-0002-2401-9019
0000-0002-7147-0048
ParticipantIDs crossref_citationtrail_10_1016_j_conengprac_2021_105046
crossref_primary_10_1016_j_conengprac_2021_105046
elsevier_sciencedirect_doi_10_1016_j_conengprac_2021_105046
PublicationCentury 2000
PublicationDate April 2022
2022-04-00
PublicationDateYYYYMMDD 2022-04-01
PublicationDate_xml – month: 04
  year: 2022
  text: April 2022
PublicationDecade 2020
PublicationTitle Control engineering practice
PublicationYear 2022
Publisher Elsevier Ltd
Publisher_xml – name: Elsevier Ltd
References Shipman, Coetzee (b41) 2019; 52
Carlucho, De Paula, Villar, Acosta (b8) 2017; 80
Nian, Liu, Huang (b34) 2020; 139
Lee, Lee (b26) 2008; 18
Shin, Badgwell, Liu, Lee (b40) 2019; 127
Sutton, Barto (b45) 2018
Forbes, Patwardhan, Hamadah, Gopaluni (b12) 2015; 48
Achiam (b1) 2018
Mnih, Kavukcuoglu, Silver, Rusu, Veness, Bellemare (b32) 2015; 518
Silver, Lever, Heess, Degris, Wierstra, Riedmiller (b42) 2014
Petsagkourakis, Sandoval, Bradford, Zhang, del Rio-Chanona (b37) 2020; 133
Joshi, Makker, Kodamana, Kandath (b19) 2021
Hoskins, Himmelblau (b18) 1992; 16
Hausknecht, Stone (b16) 2016
Lee, Hwangbo, Wellhausen, Koltun, Hutter (b25) 2020; 5
(b17) 2007
Haarnoja, Zhou, Abbeel, Levine (b15) 2018
Sedighizadeh, Rezazadeh (b39) 2008
Bao, Zhu, Qian (b3) 2021
Lee, Shin, Realff (b27) 2018; 114
Fujimoto, Hoof, Meger (b13) 2018
Cui, Zhu, Fujisaki, Kanokogi, Matsubara (b10) 2018
Brujeni, Lee, Shah (b7) 2010
Berner, Soltesz, Hägglund, Åström (b5) 2018; 73
Spielberg, Tulsyan, Lawrence, Loewen, Bhushan Gopaluni (b44) 2019; 65
Kingma, Ba (b22) 2014
Mowbray, Smith, Del Rio-Chanona, Zhang (b33) 2021
Berger, da Fonseca Neto (b4) 2013; 46
Noel, Pandian (b35) 2014; 23
McClement, Lawrence, Loewen, Forbes, Backström, Gopaluni (b31) 2021; 54
Lillicrap, Hunt, Pritzel, Heess, Erez, Tassa (b29) 2015
Åström, Hägglund (b2) 1984; 20
Yoo, Kim, Kim, Lee (b50) 2021; 144
Wang, Velswamy, Huang (b49) 2018; 51
Konda, Tsitsiklis (b23) 2000
Wakitani, Yamamoto, Gopaluni (b48) 2019; 58
Brockman, Cheung, Pettersson, Schneider, Schulman, Tang (b6) 2016
Ge, Li, Chang (b14) 2018; 64
Skogestad (b43) 2003; 13
Cho, van Merrienboer, Gulcehre, Bahdanau, Bougares, Schwenk (b9) 2014
Sutton, McAllester, Singh, Mansour (b46) 1999
Kaisare, Lee, Lee (b20) 2003; 13
Dogru, Wieczorek, Velswamy, Ibrahim, Huang (b11) 2021; 104
Pandian, Noel (b36) 2018; 69
Kim, Park, Yoo, Oh, Lee, Lee (b21) 2020; 87
Schulman, Wolski, Dhariwal, Radford, Klimov (b38) 2017
Syafiie, Tadeo, Martinez, Alvarez (b47) 2011; 11
Ma, Zhu, Benton, Romagnoli (b30) 2019; 75
Levine, Kumar, Tucker, Fu (b28) 2020
Lawrence, Stewart, Loewen, Forbes, Backstrom, Gopaluni (b24) 2020; 53
Lillicrap (10.1016/j.conengprac.2021.105046_b29) 2015
Silver (10.1016/j.conengprac.2021.105046_b42) 2014
Ma (10.1016/j.conengprac.2021.105046_b30) 2019; 75
Sedighizadeh (10.1016/j.conengprac.2021.105046_b39) 2008
Åström (10.1016/j.conengprac.2021.105046_b2) 1984; 20
Pandian (10.1016/j.conengprac.2021.105046_b36) 2018; 69
Sutton (10.1016/j.conengprac.2021.105046_b46) 1999
Berner (10.1016/j.conengprac.2021.105046_b5) 2018; 73
Haarnoja (10.1016/j.conengprac.2021.105046_b15) 2018
Achiam (10.1016/j.conengprac.2021.105046_b1) 2018
Hausknecht (10.1016/j.conengprac.2021.105046_b16) 2016
Cho (10.1016/j.conengprac.2021.105046_b9) 2014
Dogru (10.1016/j.conengprac.2021.105046_b11) 2021; 104
Forbes (10.1016/j.conengprac.2021.105046_b12) 2015; 48
Skogestad (10.1016/j.conengprac.2021.105046_b43) 2003; 13
Lee (10.1016/j.conengprac.2021.105046_b27) 2018; 114
Levine (10.1016/j.conengprac.2021.105046_b28) 2020
McClement (10.1016/j.conengprac.2021.105046_b31) 2021; 54
Lee (10.1016/j.conengprac.2021.105046_b25) 2020; 5
Nian (10.1016/j.conengprac.2021.105046_b34) 2020; 139
Yoo (10.1016/j.conengprac.2021.105046_b50) 2021; 144
Wakitani (10.1016/j.conengprac.2021.105046_b48) 2019; 58
Sutton (10.1016/j.conengprac.2021.105046_b45) 2018
Noel (10.1016/j.conengprac.2021.105046_b35) 2014; 23
Schulman (10.1016/j.conengprac.2021.105046_b38) 2017
Syafiie (10.1016/j.conengprac.2021.105046_b47) 2011; 11
Shipman (10.1016/j.conengprac.2021.105046_b41) 2019; 52
Spielberg (10.1016/j.conengprac.2021.105046_b44) 2019; 65
Bao (10.1016/j.conengprac.2021.105046_b3) 2021
Brujeni (10.1016/j.conengprac.2021.105046_b7) 2010
Mnih (10.1016/j.conengprac.2021.105046_b32) 2015; 518
Brockman (10.1016/j.conengprac.2021.105046_b6) 2016
(10.1016/j.conengprac.2021.105046_b17) 2007
Kingma (10.1016/j.conengprac.2021.105046_b22) 2014
Hoskins (10.1016/j.conengprac.2021.105046_b18) 1992; 16
Konda (10.1016/j.conengprac.2021.105046_b23) 2000
Wang (10.1016/j.conengprac.2021.105046_b49) 2018; 51
Shin (10.1016/j.conengprac.2021.105046_b40) 2019; 127
Kaisare (10.1016/j.conengprac.2021.105046_b20) 2003; 13
Berger (10.1016/j.conengprac.2021.105046_b4) 2013; 46
Mowbray (10.1016/j.conengprac.2021.105046_b33) 2021
Joshi (10.1016/j.conengprac.2021.105046_b19) 2021
Carlucho (10.1016/j.conengprac.2021.105046_b8) 2017; 80
Kim (10.1016/j.conengprac.2021.105046_b21) 2020; 87
Ge (10.1016/j.conengprac.2021.105046_b14) 2018; 64
Cui (10.1016/j.conengprac.2021.105046_b10) 2018
Lawrence (10.1016/j.conengprac.2021.105046_b24) 2020; 53
Lee (10.1016/j.conengprac.2021.105046_b26) 2008; 18
Fujimoto (10.1016/j.conengprac.2021.105046_b13) 2018
Petsagkourakis (10.1016/j.conengprac.2021.105046_b37) 2020; 133
References_xml – volume: 53
  start-page: 236
  year: 2020
  end-page: 241
  ident: b24
  article-title: Optimal PID and antiwindup control design as a reinforcement learning problem
  publication-title: IFAC-PapersOnLine
– volume: 51
  start-page: 31
  year: 2018
  end-page: 36
  ident: b49
  article-title: A novel approach to feedback control with deep reinforcement learning
  publication-title: IFAC-PapersOnLine
– volume: 75
  start-page: 40
  year: 2019
  end-page: 47
  ident: b30
  article-title: Continuous control of a polymerization system with deep reinforcement learning
  publication-title: Journal of Process Control
– year: 2007
  ident: b17
  article-title: UDC2500 universal digital controller product manual
– volume: 127
  start-page: 282
  year: 2019
  end-page: 294
  ident: b40
  article-title: Reinforcement Learning – Overview of recent progress and implications for process control
  publication-title: Computers & Chemical Engineering
– volume: 11
  start-page: 73
  year: 2011
  end-page: 82
  ident: b47
  article-title: Model-free control based on reinforcement learning for a wastewater treatment problem
  publication-title: Applied Soft Computing
– volume: 114
  start-page: 111
  year: 2018
  end-page: 121
  ident: b27
  article-title: Machine learning: Overview of the recent progresses and implications for the process systems engineering field
  publication-title: Computers & Chemical Engineering
– volume: 46
  start-page: 534
  year: 2013
  end-page: 539
  ident: b4
  article-title: Neurodynamic programming approach for the PID controller adaptation
  publication-title: IFAC Proceedings Volumes
– volume: 139
  year: 2020
  ident: b34
  article-title: A review On reinforcement learning: Introduction and applications in industrial process control
  publication-title: Computers & Chemical Engineering
– start-page: 257
  year: 2008
  end-page: 262
  ident: b39
  article-title: Adaptive PID controller based on reinforcement learning for wind turbine control
  publication-title: Proceedings of world academy of science, engineering and technology, Vol. 27
– year: 2016
  ident: b16
  article-title: Deep reinforcement learning in parameterized action space
– volume: 20
  start-page: 645
  year: 1984
  end-page: 651
  ident: b2
  article-title: Automatic tuning of simple regulators with specifications on phase and amplitude margins
  publication-title: Automatica
– volume: 87
  start-page: 166
  year: 2020
  end-page: 178
  ident: b21
  article-title: A model-based deep reinforcement learning method applied to finite-horizon optimal control of nonlinear control-affine system
  publication-title: Journal of Process Control
– volume: 69
  start-page: 16
  year: 2018
  end-page: 29
  ident: b36
  article-title: Control of a bioreactor using a new partially supervised reinforcement learning algorithm
  publication-title: Journal of Process Control
– volume: 48
  start-page: 531
  year: 2015
  end-page: 538
  ident: b12
  article-title: Model predictive control in industry: Challenges and opportunities
  publication-title: IFAC-PapersOnLine
– start-page: 387
  year: 2014
  end-page: 395
  ident: b42
  article-title: Deterministic policy gradient algorithms
  publication-title: International conference on machine learning
– year: 2020
  ident: b28
  article-title: Offline reinforcement learning: Tutorial, review, and perspectives on open problems
– year: 2015
  ident: b29
  article-title: Continuous control with deep reinforcement learning
– start-page: 453
  year: 2010
  end-page: 458
  ident: b7
  article-title: Dynamic tuning of PI-controllers based on model-free reinforcement learning methods
  publication-title: ICCAS 2010
– year: 2014
  ident: b9
  article-title: Learning phrase representations using RNN encoder-decoder for statistical machine translation
– volume: 64
  start-page: 15
  year: 2018
  end-page: 26
  ident: b14
  article-title: An approximate dynamic programming method for the optimal control of Alkali-Surfactant-Polymer flooding
  publication-title: Journal of Process Control
– year: 2021
  ident: b33
  article-title: Using process data to generate an optimal control policy via apprenticeship and reinforcement learning
  publication-title: AIChE Journal
– start-page: 304
  year: 2018
  end-page: 309
  ident: b10
  article-title: Factorial kernel dynamic policy programming for vinyl acetate monomer plant model control
  publication-title: 2018 IEEE 14th international conference on automation science and engineering (CASE)
– volume: 133
  year: 2020
  ident: b37
  article-title: Reinforcement learning for batch bioprocess optimization
  publication-title: Computers & Chemical Engineering
– volume: 13
  start-page: 347
  year: 2003
  end-page: 363
  ident: b20
  article-title: Simulation based strategy for nonlinear optimal control: Application to a microbial cell reactor
  publication-title: International Journal of Robust and Nonlinear Control
– start-page: 1587
  year: 2018
  end-page: 1596
  ident: b13
  article-title: Addressing function approximation error in actor-critic methods
  publication-title: International conference on machine learning
– start-page: 1008
  year: 2000
  end-page: 1014
  ident: b23
  article-title: Actor-critic algorithms
  publication-title: Advances in neural information processing systems
– volume: 54
  start-page: 685
  year: 2021
  end-page: 692
  ident: b31
  article-title: A meta-reinforcement learning approach to process control
  publication-title: IFAC-PapersOnLine
– volume: 80
  start-page: 183
  year: 2017
  end-page: 199
  ident: b8
  article-title: Incremental Q-learning strategy for adaptive PID control of mobile robots
  publication-title: Expert Systems with Applications
– year: 2021
  ident: b19
  article-title: Application of twin delayed deep deterministic policy gradient learning for the control of transesterification process
– volume: 18
  start-page: 533
  year: 2008
  end-page: 542
  ident: b26
  article-title: Value function-based approach to the scheduling of multiple controllers
  publication-title: Journal of Process Control
– year: 2018
  ident: b45
  publication-title: Reinforcement learning: An introduction
– year: 2018
  ident: b15
  article-title: Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor
– volume: 52
  start-page: 111
  year: 2019
  end-page: 116
  ident: b41
  article-title: Reinforcement learning and deep neural networks for PI controller tuning
  publication-title: IFAC-PapersOnLine
– volume: 5
  start-page: eabc5986
  year: 2020
  ident: b25
  article-title: Learning quadrupedal locomotion over challenging terrain
  publication-title: Science Robotics
– volume: 65
  year: 2019
  ident: b44
  article-title: Toward self-driving processes: A deep reinforcement learning approach to control
  publication-title: AIChE Journal
– volume: 104
  start-page: 86
  year: 2021
  end-page: 100
  ident: b11
  article-title: Online reinforcement learning for a continuous space system with experimental validation
  publication-title: Journal of Process Control
– volume: 23
  start-page: 444
  year: 2014
  end-page: 451
  ident: b35
  article-title: Control of a nonlinear liquid level system using a new artificial neural network based reinforcement learning approach
  publication-title: Applied Soft Computing
– year: 2018
  ident: b1
  article-title: Spinning up in deep reinforcement learning
– year: 2016
  ident: b6
  article-title: OpenAI Gym
– year: 2017
  ident: b38
  article-title: Proximal policy optimization algorithms
– volume: 13
  start-page: 291
  year: 2003
  end-page: 309
  ident: b43
  article-title: Simple analytic rules for model reduction and PID controller tuning
  publication-title: Journal of Process Control
– volume: 58
  start-page: 11419
  year: 2019
  end-page: 11429
  ident: b48
  article-title: Design and application of a database-driven PID controller with data-driven updating algorithm
  publication-title: Industrial and Engineering Chemistry Research
– volume: 144
  year: 2021
  ident: b50
  article-title: Reinforcement learning based optimal control of batch processes using Monte-Carlo deep deterministic policy gradient with phase segmentation
  publication-title: Computers & Chemical Engineering
– volume: 518
  start-page: 529
  year: 2015
  end-page: 533
  ident: b32
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
– start-page: 1057
  year: 1999
  end-page: 1063
  ident: b46
  article-title: Policy gradient methods for reinforcement learning with function approximation
  publication-title: NIPs, Vol. 99
– volume: 73
  start-page: 124
  year: 2018
  end-page: 133
  ident: b5
  article-title: An experimental comparison of PID autotuners
  publication-title: Control Engineering Practice
– volume: 16
  start-page: 241
  year: 1992
  end-page: 251
  ident: b18
  article-title: Process control via artificial neural networks and reinforcement learning
  publication-title: Computers & Chemical Engineering
– year: 2021
  ident: b3
  article-title: A deep reinforcement learning approach to improve the learning performance in process control
  publication-title: Industrial and Engineering Chemistry Research
– year: 2014
  ident: b22
  article-title: Adam: A method for stochastic optimization
– volume: 114
  start-page: 111
  year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b27
  article-title: Machine learning: Overview of the recent progresses and implications for the process systems engineering field
  publication-title: Computers & Chemical Engineering
  doi: 10.1016/j.compchemeng.2017.10.008
– year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b45
– year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b1
– volume: 18
  start-page: 533
  issue: 6
  year: 2008
  ident: 10.1016/j.conengprac.2021.105046_b26
  article-title: Value function-based approach to the scheduling of multiple controllers
  publication-title: Journal of Process Control
  doi: 10.1016/j.jprocont.2007.10.016
– start-page: 257
  year: 2008
  ident: 10.1016/j.conengprac.2021.105046_b39
  article-title: Adaptive PID controller based on reinforcement learning for wind turbine control
– volume: 13
  start-page: 347
  issue: 3–4
  year: 2003
  ident: 10.1016/j.conengprac.2021.105046_b20
  article-title: Simulation based strategy for nonlinear optimal control: Application to a microbial cell reactor
  publication-title: International Journal of Robust and Nonlinear Control
  doi: 10.1002/rnc.822
– year: 2017
  ident: 10.1016/j.conengprac.2021.105046_b38
– volume: 64
  start-page: 15
  year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b14
  article-title: An approximate dynamic programming method for the optimal control of Alkali-Surfactant-Polymer flooding
  publication-title: Journal of Process Control
  doi: 10.1016/j.jprocont.2018.01.010
– volume: 133
  year: 2020
  ident: 10.1016/j.conengprac.2021.105046_b37
  article-title: Reinforcement learning for batch bioprocess optimization
  publication-title: Computers & Chemical Engineering
  doi: 10.1016/j.compchemeng.2019.106649
– start-page: 453
  year: 2010
  ident: 10.1016/j.conengprac.2021.105046_b7
  article-title: Dynamic tuning of PI-controllers based on model-free reinforcement learning methods
– volume: 52
  start-page: 111
  issue: 14
  year: 2019
  ident: 10.1016/j.conengprac.2021.105046_b41
  article-title: Reinforcement learning and deep neural networks for PI controller tuning
  publication-title: IFAC-PapersOnLine
  doi: 10.1016/j.ifacol.2019.09.173
– start-page: 304
  year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b10
  article-title: Factorial kernel dynamic policy programming for vinyl acetate monomer plant model control
– volume: 104
  start-page: 86
  year: 2021
  ident: 10.1016/j.conengprac.2021.105046_b11
  article-title: Online reinforcement learning for a continuous space system with experimental validation
  publication-title: Journal of Process Control
  doi: 10.1016/j.jprocont.2021.06.004
– year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b15
– start-page: 1587
  year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b13
  article-title: Addressing function approximation error in actor-critic methods
– year: 2007
  ident: 10.1016/j.conengprac.2021.105046_b17
– year: 2021
  ident: 10.1016/j.conengprac.2021.105046_b3
  article-title: A deep reinforcement learning approach to improve the learning performance in process control
  publication-title: Industrial and Engineering Chemistry Research
  doi: 10.1021/acs.iecr.0c05678
– volume: 139
  year: 2020
  ident: 10.1016/j.conengprac.2021.105046_b34
  article-title: A review On reinforcement learning: Introduction and applications in industrial process control
  publication-title: Computers & Chemical Engineering
  doi: 10.1016/j.compchemeng.2020.106886
– volume: 11
  start-page: 73
  issue: 1
  year: 2011
  ident: 10.1016/j.conengprac.2021.105046_b47
  article-title: Model-free control based on reinforcement learning for a wastewater treatment problem
  publication-title: Applied Soft Computing
  doi: 10.1016/j.asoc.2009.10.018
– volume: 20
  start-page: 645
  issue: 5
  year: 1984
  ident: 10.1016/j.conengprac.2021.105046_b2
  article-title: Automatic tuning of simple regulators with specifications on phase and amplitude margins
  publication-title: Automatica
  doi: 10.1016/0005-1098(84)90014-1
– volume: 16
  start-page: 241
  issue: 4
  year: 1992
  ident: 10.1016/j.conengprac.2021.105046_b18
  article-title: Process control via artificial neural networks and reinforcement learning
  publication-title: Computers & Chemical Engineering
  doi: 10.1016/0098-1354(92)80045-B
– volume: 69
  start-page: 16
  year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b36
  article-title: Control of a bioreactor using a new partially supervised reinforcement learning algorithm
  publication-title: Journal of Process Control
  doi: 10.1016/j.jprocont.2018.07.013
– volume: 58
  start-page: 11419
  issue: 26
  year: 2019
  ident: 10.1016/j.conengprac.2021.105046_b48
  article-title: Design and application of a database-driven PID controller with data-driven updating algorithm
  publication-title: Industrial and Engineering Chemistry Research
  doi: 10.1021/acs.iecr.9b00704
– start-page: 1008
  year: 2000
  ident: 10.1016/j.conengprac.2021.105046_b23
  article-title: Actor-critic algorithms
– year: 2021
  ident: 10.1016/j.conengprac.2021.105046_b33
  article-title: Using process data to generate an optimal control policy via apprenticeship and reinforcement learning
  publication-title: AIChE Journal
  doi: 10.1002/aic.17306
– year: 2015
  ident: 10.1016/j.conengprac.2021.105046_b29
– year: 2014
  ident: 10.1016/j.conengprac.2021.105046_b22
– year: 2014
  ident: 10.1016/j.conengprac.2021.105046_b9
– volume: 80
  start-page: 183
  year: 2017
  ident: 10.1016/j.conengprac.2021.105046_b8
  article-title: Incremental Q-learning strategy for adaptive PID control of mobile robots
  publication-title: Expert Systems with Applications
  doi: 10.1016/j.eswa.2017.03.002
– year: 2020
  ident: 10.1016/j.conengprac.2021.105046_b28
– volume: 73
  start-page: 124
  year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b5
  article-title: An experimental comparison of PID autotuners
  publication-title: Control Engineering Practice
  doi: 10.1016/j.conengprac.2018.01.006
– volume: 75
  start-page: 40
  year: 2019
  ident: 10.1016/j.conengprac.2021.105046_b30
  article-title: Continuous control of a polymerization system with deep reinforcement learning
  publication-title: Journal of Process Control
  doi: 10.1016/j.jprocont.2018.11.004
– volume: 48
  start-page: 531
  issue: 8
  year: 2015
  ident: 10.1016/j.conengprac.2021.105046_b12
  article-title: Model predictive control in industry: Challenges and opportunities
  publication-title: IFAC-PapersOnLine
  doi: 10.1016/j.ifacol.2015.09.022
– volume: 144
  year: 2021
  ident: 10.1016/j.conengprac.2021.105046_b50
  article-title: Reinforcement learning based optimal control of batch processes using Monte-Carlo deep deterministic policy gradient with phase segmentation
  publication-title: Computers & Chemical Engineering
  doi: 10.1016/j.compchemeng.2020.107133
– volume: 87
  start-page: 166
  year: 2020
  ident: 10.1016/j.conengprac.2021.105046_b21
  article-title: A model-based deep reinforcement learning method applied to finite-horizon optimal control of nonlinear control-affine system
  publication-title: Journal of Process Control
  doi: 10.1016/j.jprocont.2020.02.003
– volume: 65
  year: 2019
  ident: 10.1016/j.conengprac.2021.105046_b44
  article-title: Toward self-driving processes: A deep reinforcement learning approach to control
  publication-title: AIChE Journal
  doi: 10.1002/aic.16689
– year: 2016
  ident: 10.1016/j.conengprac.2021.105046_b16
– volume: 54
  start-page: 685
  year: 2021
  ident: 10.1016/j.conengprac.2021.105046_b31
  article-title: A meta-reinforcement learning approach to process control
  publication-title: IFAC-PapersOnLine
  doi: 10.1016/j.ifacol.2021.08.321
– volume: 13
  start-page: 291
  issue: 4
  year: 2003
  ident: 10.1016/j.conengprac.2021.105046_b43
  article-title: Simple analytic rules for model reduction and PID controller tuning
  publication-title: Journal of Process Control
  doi: 10.1016/S0959-1524(02)00062-8
– volume: 53
  start-page: 236
  year: 2020
  ident: 10.1016/j.conengprac.2021.105046_b24
  article-title: Optimal PID and antiwindup control design as a reinforcement learning problem
  publication-title: IFAC-PapersOnLine
  doi: 10.1016/j.ifacol.2020.12.129
– start-page: 387
  year: 2014
  ident: 10.1016/j.conengprac.2021.105046_b42
  article-title: Deterministic policy gradient algorithms
– volume: 51
  start-page: 31
  issue: 18
  year: 2018
  ident: 10.1016/j.conengprac.2021.105046_b49
  article-title: A novel approach to feedback control with deep reinforcement learning
  publication-title: IFAC-PapersOnLine
  doi: 10.1016/j.ifacol.2018.09.241
– start-page: 1057
  year: 1999
  ident: 10.1016/j.conengprac.2021.105046_b46
  article-title: Policy gradient methods for reinforcement learning with function approximation
– volume: 518
  start-page: 529
  issue: 7540
  year: 2015
  ident: 10.1016/j.conengprac.2021.105046_b32
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
  doi: 10.1038/nature14236
– volume: 5
  start-page: eabc5986
  issue: 47
  year: 2020
  ident: 10.1016/j.conengprac.2021.105046_b25
  article-title: Learning quadrupedal locomotion over challenging terrain
  publication-title: Science Robotics
  doi: 10.1126/scirobotics.abc5986
– year: 2016
  ident: 10.1016/j.conengprac.2021.105046_b6
– year: 2021
  ident: 10.1016/j.conengprac.2021.105046_b19
– volume: 46
  start-page: 534
  issue: 11
  year: 2013
  ident: 10.1016/j.conengprac.2021.105046_b4
  article-title: Neurodynamic programming approach for the PID controller adaptation
  publication-title: IFAC Proceedings Volumes
  doi: 10.3182/20130703-3-FR-4038.00129
– volume: 23
  start-page: 444
  year: 2014
  ident: 10.1016/j.conengprac.2021.105046_b35
  article-title: Control of a nonlinear liquid level system using a new artificial neural network based reinforcement learning approach
  publication-title: Applied Soft Computing
  doi: 10.1016/j.asoc.2014.06.037
– volume: 127
  start-page: 282
  year: 2019
  ident: 10.1016/j.conengprac.2021.105046_b40
  article-title: Reinforcement Learning – Overview of recent progress and implications for process control
  publication-title: Computers & Chemical Engineering
  doi: 10.1016/j.compchemeng.2019.05.029
SSID ssj0016991
Score 2.6251624
Snippet Deep reinforcement learning (RL) is an optimization-driven framework for producing control strategies for general dynamical systems without explicit reliance...
SourceID crossref
elsevier
SourceType Enrichment Source
Index Database
Publisher
StartPage 105046
SubjectTerms Deep learning
PID control
Process control
Process systems engineering
Reinforcement learning
Title Deep reinforcement learning with shallow controllers: An experimental application to PID tuning
URI https://dx.doi.org/10.1016/j.conengprac.2021.105046
Volume 121
WOSCitedRecordID wos000819660400010&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVESC
  databaseName: Elsevier SD Freedom Collection Journals 2021
  customDbUrl:
  eissn: 1873-6939
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0016991
  issn: 0967-0661
  databaseCode: AIEXJ
  dateStart: 19950101
  isFulltext: true
  titleUrlDefault: https://www.sciencedirect.com
  providerName: Elsevier
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwtV3da9swEBdZu4f1YeyTtduKHvZmHPwtqXsKTdd1dCWwDvJmbEmGleCE1G36j_T_3ckn2e5WWMfYiwk2ipy7X06n093vCPmgM1XwgFU-L4vCT2Je-mXACp9VQoB3kPASu5acsrMzPp-L2Wh062phrhesrvnNjVj9V1XDPVC2KZ39C3V3Xwo34DMoHa6gdrg-SPFTrVfeWreMqLIN_rnWEDbqemn6pyw3Lkt9Ybo-YHjwDt3_4GjbOKizk6nXXNVupXPkBjbTXfe0hl3hVZfrU2zakkI05iZU783G_eq3LtFS2QR-77h7drrUG7SKGPbxpt2jr_IQ8977Mnk30IYwYPfbZ760cTVXW9MnMrUByswk5iFV-1ijeeYs9jOB9Eed_cYS69_WAgxLXIAqaxCC-e1jmDw0nY2D5Bf67XZB_2amNDNGhpcWLNMjsh2xVICx3J6cHM2_dMdTmcBWjO4VbYoYJg7eP9_9fs_Alzl_Rp7aTQidIHiek5GuX5CdATXlS5IbGNE7MKIORtTAiFoY0QGMDuikpkMQ0QGIaLOkACKKIHpFvn86Oj_87NtmHL4Es9_4iU7SpFJppFXIJQ9lmKig0lllzpZjrYq4qmDBKGWsYBHRsK1mUqu4ijQrhGm4-Zps1SCZNyAqHTGRFkxnKUskDwplGDKjOGZVqWQqdwlzosqlZao3DVMWuUtJvMh7IedGyDkKeZeE3cgVsrU8YMxHp43cep3oTeYApD-O3vun0W_Jk_7_8I5sNesr_Z48ltfNj8v1vkXdTxA1r3o
linkProvider Elsevier
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Deep+reinforcement+learning+with+shallow+controllers%3A+An+experimental+application+to+PID+tuning&rft.jtitle=Control+engineering+practice&rft.au=Lawrence%2C+Nathan+P.&rft.au=Forbes%2C+Michael+G.&rft.au=Loewen%2C+Philip+D.&rft.au=McClement%2C+Daniel+G.&rft.date=2022-04-01&rft.pub=Elsevier+Ltd&rft.issn=0967-0661&rft.eissn=1873-6939&rft.volume=121&rft_id=info:doi/10.1016%2Fj.conengprac.2021.105046&rft.externalDocID=S0967066121002963
thumbnail_l http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=0967-0661&client=summon
thumbnail_m http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=0967-0661&client=summon
thumbnail_s http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=0967-0661&client=summon