Deep reinforcement learning with shallow controllers: An experimental application to PID tuning
| Published in: | Control Engineering Practice, Volume 121, Article 105046 |
|---|---|
| Main authors: | Lawrence, Nathan P.; Forbes, Michael G.; Loewen, Philip D.; McClement, Daniel G.; Backström, Johan U.; Gopaluni, R. Bhushan |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 1 April 2022 |
| Subjects: | Deep learning; PID control; Process systems engineering; Process control; Reinforcement learning |
| ISSN: | 0967-0661 (print); 1873-6939 (electronic) |
| Online access: | Get full text |
| Abstract | Deep reinforcement learning (RL) is an optimization-driven framework for producing control strategies for general dynamical systems without explicit reliance on process models. Good results have been reported in simulation. Here we demonstrate the challenges in implementing a state-of-the-art deep RL algorithm on a real physical system. Aspects include the interplay between software and existing hardware; experiment design and sample efficiency; training subject to input constraints; and interpretability of the algorithm and control law. At the core of our approach is the use of a PID controller as the trainable RL policy. In addition to its simplicity, this approach has several appealing features: No additional hardware needs to be added to the control system, since a PID controller can easily be implemented through a standard programmable logic controller; the control law can easily be initialized in a “safe” region of the parameter space; and the final product—a well-tuned PID controller—has a form that practitioners can reason about and deploy with confidence. |
| Highlights: |
• Reinforcement learning (RL) is used to tune a real-world PID controller.
• The RL policy is a PID controller, for compatibility with many current systems.
• Good tuning is achieved in roughly 40 min of training time.
• Full implementation details and thorough lab results are presented.
• A multi-criterion scorecard compares RL with several known auto-tuning methods.
|
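The abstract's central device, a PID controller serving as the trainable RL policy, is compact enough to sketch. Below is a minimal sketch in PyTorch, assuming a DDPG-style actor-critic update; the `PIDPolicy` class, its default gains, the critic architecture, and the tanh saturation are illustrative assumptions, not the paper's implementation.

```python
# Minimal illustrative sketch (not the authors' code): the PID gains are the
# entire trainable RL policy. Any actor-critic method can then tune them from
# closed-loop data, and the trained result is an ordinary PID controller.
import torch
import torch.nn as nn

class PIDPolicy(nn.Module):
    """Shallow actor: u = kp*e + ki*ie + kd*de, saturated to [-1, 1]."""

    def __init__(self, kp=1.0, ki=0.1, kd=0.0):
        super().__init__()
        # Start the gains in a known "safe" region of the parameter space.
        self.gains = nn.Parameter(torch.tensor([kp, ki, kd]))

    def forward(self, state):
        # state columns: (error, integral of error, derivative of error)
        u = (self.gains * state).sum(dim=-1, keepdim=True)
        return torch.tanh(u)  # assumed saturation, respecting input constraints

# Toy actor update: push actions toward higher critic value (DDPG-style step).
policy = PIDPolicy()
critic = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))
actor_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

state = torch.randn(32, 3)                    # batch of (e, ie, de) samples
actor_loss = -critic(torch.cat([state, policy(state)], dim=-1)).mean()
actor_opt.zero_grad()
actor_loss.backward()
actor_opt.step()
print(policy.gains.data)                      # updated (kp, ki, kd)
```

Because the trained object is just three gains, no new hardware is needed at deployment: the learned values can be entered into a standard PLC's existing PID block, which is the compatibility point made in the highlights above.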
| Authors: |
1. Nathan P. Lawrence, Department of Mathematics, University of British Columbia, Vancouver, BC, Canada (ORCID 0000-0002-7147-0048; lawrence@math.ubc.ca)
2. Michael G. Forbes, Honeywell Process Solutions, North Vancouver, BC, Canada (ORCID 0000-0001-6233-2172)
3. Philip D. Loewen, Department of Mathematics, University of British Columbia, Vancouver, BC, Canada (ORCID 0000-0001-9761-6239)
4. Daniel G. McClement, Department of Chemical and Biological Engineering, University of British Columbia, Vancouver, BC, Canada (ORCID 0000-0002-2401-9019)
5. Johan U. Backström, Backstrom Systems Engineering Ltd., Canada
6. R. Bhushan Gopaluni, Department of Chemical and Biological Engineering, University of British Columbia, Vancouver, BC, Canada (ORCID 0000-0002-4321-0468; bhushan.gopaluni@ubc.ca)
|
| Copyright: | © 2021 Elsevier Ltd |
| DOI: | 10.1016/j.conengprac.2021.105046 |