Reinforcement learning based optimal control of batch processes using Monte-Carlo deep deterministic policy gradient with phase segmentation
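| Published in: | Computers & chemical engineering, Vol. 144, p. 107133 |
|---|---|
| Main authors: | Yoo, Haeun; Kim, Boeun; Kim, Jong Woo; Lee, Jay H. |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 4 January 2021 |
| Subjects: | Optimal control; Actor-Critic; Batch process; Reinforcement learning |
| ISSN: | 0098-1354, 1873-4375 |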
| Abstract | • A reward function design is suggested for general economic process control. • A phase segmentation approach is proposed to address the distinct characteristics of the various phases of a batch run. • The DDPG algorithm is modified with Monte-Carlo learning for stable agent training. • The suggested algorithm is applied to a batch polymerization process control problem. Batch process control is challenging because of its dynamic operation over a large operating envelope. Nonlinear model predictive control (NMPC) is the current standard for optimal control of batch processes, but the performance of conventional NMPC can be unsatisfactory in the presence of uncertainties. Reinforcement learning (RL), which can utilize simulation or real operation data, is a viable alternative for such problems. To apply RL to batch process control effectively, however, choices such as the reward function design and the value update method must be made carefully. This study proposes a phase segmentation approach for the reward function design and the value/policy function representation. In addition, the deep deterministic policy gradient (DDPG) algorithm is modified with Monte-Carlo learning to ensure more stable and efficient learning behavior. A case study of a batch polymerization process producing polyols is used to demonstrate the improvement brought by the proposed approach and to highlight further issues. |
|---|---|
| ArticleNumber | 107133 |
| Author | Yoo, Haeun; Kim, Jong Woo; Lee, Jay H.; Kim, Boeun |
| Author_xml | 1. Yoo, Haeun (Department of Biomolecular and Chemical Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, Republic of Korea); 2. Kim, Boeun (Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA); 3. Kim, Jong Woo (School of Chemical and Biological Engineering, Institute of Chemical Processes, Seoul National University, 1, Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea); 4. Lee, Jay H. (email: jayhlee@kaist.ac.kr; Department of Biomolecular and Chemical Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, Republic of Korea) |
| ContentType | Journal Article |
| Copyright | 2020 |
| DOI | 10.1016/j.compchemeng.2020.107133 |
| Discipline | Engineering |
| EISSN | 1873-4375 |
| ExternalDocumentID | 10_1016_j_compchemeng_2020_107133 S0098135420307912 |
| ISICitedReferencesCount | 100 |
| ISSN | 0098-1354 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Optimal control; Actor-Critic; Batch process; Reinforcement learning |
| Language | English |
| PublicationCentury | 2000 |
| PublicationDate | 2021-01-04 |
| PublicationDecade | 2020 |
| PublicationTitle | Computers & chemical engineering |
| PublicationYear | 2021 |
| Publisher | Elsevier Ltd |
| StartPage | 107133 |
| SubjectTerms | Actor-Critic; Batch process; Optimal control; Reinforcement learning |
| Title | Reinforcement learning based optimal control of batch processes using Monte-Carlo deep deterministic policy gradient with phase segmentation |
| URI | https://dx.doi.org/10.1016/j.compchemeng.2020.107133 |
| Volume | 144 |
| Authors | Yoo, Haeun; Kim, Boeun; Kim, Jong Woo; Lee, Jay H. |