Reinforcement learning based optimal control of batch processes using Monte-Carlo deep deterministic policy gradient with phase segmentation

Bibliographic details
Published in: Computers & Chemical Engineering, Vol. 144, p. 107133
Main authors: Yoo, Haeun; Kim, Boeun; Kim, Jong Woo; Lee, Jay H.
Format: Journal Article
Language: English
Published: Elsevier Ltd, 2021-01-04
ISSN: 0098-1354, 1873-4375
Abstract
• A reward function design is suggested for general economic process control.
• A phase segmentation approach is proposed to address the distinct characteristics of the various phases of a batch run.
• The DDPG algorithm is modified with Monte-Carlo learning for stable agent training.
• The suggested algorithm is applied to a batch polymerization process control problem.

Batch process control is challenging because of its dynamic operation over a large operating envelope. Nonlinear model predictive control (NMPC) is the current standard for optimal control of batch processes, but the performance of conventional NMPC can be unsatisfactory in the presence of uncertainties. Reinforcement learning (RL), which can utilize simulation or real operation data, is a viable alternative for such problems. To apply RL to batch process control effectively, however, choices such as the reward function design and the value update method must be made carefully. This study proposes a phase segmentation approach for the reward function design and the value/policy function representation. In addition, the deep deterministic policy gradient (DDPG) algorithm is modified with Monte-Carlo learning to ensure more stable and efficient learning behavior. A case study of a batch polymerization process producing polyols is used to demonstrate the improvement brought by the proposed approach and to highlight further issues.
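The abstract names two ingredients without giving formulas in this record: Monte-Carlo learning for the value update, and a segmentation of the batch run into phases. The sketch below is only a generic illustration of those two ideas, not the authors' implementation. It computes full-episode discounted returns (the usual Monte-Carlo target, as opposed to a bootstrapped one-step target) and groups time steps by phase. The function names `monte_carlo_returns`, `phase_of` and `split_by_phase`, the `boundaries` argument and the discount factor `gamma` are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def monte_carlo_returns(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Discounted return G_t = r_t + gamma * G_{t+1} for every step of one finished episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Sweep backwards so that each step reuses the return of its successor.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def phase_of(step: int, boundaries: Sequence[int]) -> int:
    """Phase index of `step`; `boundaries` (an assumption) lists the first step of phases 1, 2, ..."""
    return sum(step >= b for b in boundaries)


def split_by_phase(values: Sequence[float], boundaries: Sequence[int]) -> List[List[Tuple[int, float]]]:
    """Group (step, value) pairs by phase, e.g. to fit one value function per phase."""
    buckets: List[List[Tuple[int, float]]] = [[] for _ in range(len(boundaries) + 1)]
    for t, v in enumerate(values):
        buckets[phase_of(t, boundaries)].append((t, v))
    return buckets


if __name__ == "__main__":
    episode_rewards = [1.0, 2.0, 3.0]  # made-up rewards of one batch run
    targets = monte_carlo_returns(episode_rewards, gamma=0.5)
    print(targets)
    print(split_by_phase(targets, boundaries=[2]))
```

In a DDPG-style training loop, returns of this kind could replace the bootstrapped critic target once a batch run has finished; how the paper combines them with the phase-specific reward and function representation is not stated in this record.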
Article number: 107133
Author affiliations:
1. Yoo, Haeun: Department of Biomolecular and Chemical Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, Republic of Korea
2. Kim, Boeun: Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
3. Kim, Jong Woo: School of Chemical and Biological Engineering, Institute of Chemical Processes, Seoul National University, 1, Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
4. Lee, Jay H. (jayhlee@kaist.ac.kr): Department of Biomolecular and Chemical Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, Republic of Korea
Copyright: 2020
DOI: 10.1016/j.compchemeng.2020.107133
Discipline: Engineering
Peer reviewed: yes
Keywords: Optimal control; Actor-Critic; Batch process; Reinforcement learning
References Thangavel, Lucia, Paulen, Engell (bib0040) 2018; 72
Jung, Nie, Lee, Biegler (bib0013) 2015; 28
Lillicrap, Hunt, Pritzel, Heess, Erez, Tassa, Silver, Wierstra (bib0017) 2016
Uhlenbeck, Ornstein (bib0043) 1930; 36
Wächter, Biegler (bib0044) 2006; 106
Puterman (bib0031) 2014
Wilson, Martinez (bib0045) 1997; 21
Spielberg, Tulsyan, Lawrence, Loewen, Bhushan Gopaluni (bib0037) 2019; 65
Ellis, Durand, Christofides (bib0007) 2014; 24
Tsitsiklis, Van Roy (bib0042) 1997
Tsitsiklis, Roy (bib0041) 2000
Mastan, Zhu (bib0021) 2015; 68
Nie, Biegler, Villa, Wassick (bib0026) 2013; 59
Barton, Allgor, Feehery, Galán (bib0002) 1998; 37
Abel, Helbig, Marquardt, Zwick, Daszkowski (bib0001) 2000; 10
Morari, H. Lee (bib0025) 1999; 23
Lee, Lee (bib0016) 2005; 41
Biegler (bib0003) 2007; 46
Petsagkourakis, Sandoval, Bradford, Zhang, del Rio-Chanona (bib0030) 2020; 133
Silver, Huang, Maddison, Guez, Sifre, Van Den Driessche, Schrittwieser, Antonoglou, Panneershelvam, Lanctot (bib0035) 2016; 529
Lucia, Andersson, Brandt, Diehl, Engell (bib0018) 2014; 24
Martinez (bib0020) 1998; 76
Lee, Yu (bib0015) 1997; 33
Hart, Laird, Watson, Woodruff, Hackebeil, Nicholson, Siirola (bib0009) 2017; vol. 67
Chang, Liu, Henson (bib0006) 2016; 42
Mnih, Kavukcuoglu, Silver, Rusu, Veness, Bellemare, Graves, Riedmiller, Fidjeland, Ostrovski, Petersen, Beattie, Sadik, Antonoglou, King, Kumaran, Wierstra, Legg, Hassabis (bib0024) 2015; 518
Mayne, Kerrigan, Van Wyk, Falugi (bib0023) 2011; 21
Rawlings, Amrit (bib0033) 2009
Lucia, Finkler, Engell (bib0019) 2013; 23
Bonvin, Srinivasan, Ruppen (bib0005) 2001
.
Santos, Bonzanini, Heirung, Mesbah (bib0034) 2019
Biegler (bib0004) 2007; 46
Paszke, Gross, Chintala, Chanan, Yang, DeVito, Lin, Desmaison, Antiga, Lerer (bib0027) 2017
Mayne (bib0022) 2000
Sutton, McAllester, Singh, Mansour (bib0039) 2000
Hart, Watson, Woodruff (bib0010) 2011; 3
Hausknecht, M., Stone, P., 2015. Deep reinforcement learning in parameterized action space.
Peroni, Kaisare, Lee (bib0029) 2005; 13
Kim, Park, Yoo, Oh, Lee, Lee (bib0014) 2020; 87
Jang, Lee, Biegler (bib0012) 2016; 49
Sutton, Barto (bib0038) 2018
Fujimoto, Van Hoof, Meger (bib0008) 2018; vol. 4
Qin, Badgwell (bib0032) 2000
Silver, Lever, Heess, Degris, Wierstra, Riedmiller (bib0036) 2014
Paulson, Mesbah (bib0028) 2018; 51
Hart (10.1016/j.compchemeng.2020.107133_bib0009) 2017; vol. 67
Hart (10.1016/j.compchemeng.2020.107133_bib0010) 2011; 3
Jung (10.1016/j.compchemeng.2020.107133_bib0013) 2015; 28
Sutton (10.1016/j.compchemeng.2020.107133_bib0039) 2000
Chang (10.1016/j.compchemeng.2020.107133_bib0006) 2016; 42
Kim (10.1016/j.compchemeng.2020.107133_bib0014) 2020; 87
Qin (10.1016/j.compchemeng.2020.107133_bib0032) 2000
Santos (10.1016/j.compchemeng.2020.107133_bib0034) 2019
Lee (10.1016/j.compchemeng.2020.107133_bib0016) 2005; 41
Fujimoto (10.1016/j.compchemeng.2020.107133_bib0008) 2018; vol. 4
Bonvin (10.1016/j.compchemeng.2020.107133_bib0005) 2001
Uhlenbeck (10.1016/j.compchemeng.2020.107133_bib0043) 1930; 36
Abel (10.1016/j.compchemeng.2020.107133_bib0001) 2000; 10
Sutton (10.1016/j.compchemeng.2020.107133_bib0038) 2018
Biegler (10.1016/j.compchemeng.2020.107133_bib0004) 2007; 46
Martinez (10.1016/j.compchemeng.2020.107133_bib0020) 1998; 76
Rawlings (10.1016/j.compchemeng.2020.107133_bib0033) 2009
Wilson (10.1016/j.compchemeng.2020.107133_bib0045) 1997; 21
Morari (10.1016/j.compchemeng.2020.107133_bib0025) 1999; 23
Mayne (10.1016/j.compchemeng.2020.107133_bib0023) 2011; 21
Barton (10.1016/j.compchemeng.2020.107133_bib0002) 1998; 37
Lee (10.1016/j.compchemeng.2020.107133_bib0015) 1997; 33
Lucia (10.1016/j.compchemeng.2020.107133_bib0018) 2014; 24
Paulson (10.1016/j.compchemeng.2020.107133_bib0028) 2018; 51
Nie (10.1016/j.compchemeng.2020.107133_bib0026) 2013; 59
Mnih (10.1016/j.compchemeng.2020.107133_bib0024) 2015; 518
Jang (10.1016/j.compchemeng.2020.107133_bib0012) 2016; 49
Mastan (10.1016/j.compchemeng.2020.107133_bib0021) 2015; 68
Petsagkourakis (10.1016/j.compchemeng.2020.107133_bib0030) 2020; 133
Silver (10.1016/j.compchemeng.2020.107133_bib0035) 2016; 529
Paszke (10.1016/j.compchemeng.2020.107133_bib0027) 2017
Tsitsiklis (10.1016/j.compchemeng.2020.107133_bib0041) 2000
Peroni (10.1016/j.compchemeng.2020.107133_bib0029) 2005; 13
Tsitsiklis (10.1016/j.compchemeng.2020.107133_bib0042) 1997
Wächter (10.1016/j.compchemeng.2020.107133_bib0044) 2006; 106
Mayne (10.1016/j.compchemeng.2020.107133_bib0022) 2000
Lillicrap (10.1016/j.compchemeng.2020.107133_bib0017) 2016
10.1016/j.compchemeng.2020.107133_bib0011
Lucia (10.1016/j.compchemeng.2020.107133_bib0019) 2013; 23
Puterman (10.1016/j.compchemeng.2020.107133_bib0031) 2014
Biegler (10.1016/j.compchemeng.2020.107133_bib0003) 2007; 46
Thangavel (10.1016/j.compchemeng.2020.107133_bib0040) 2018; 72
Ellis (10.1016/j.compchemeng.2020.107133_bib0007) 2014; 24
Spielberg (10.1016/j.compchemeng.2020.107133_bib0037) 2019; 65
Silver (10.1016/j.compchemeng.2020.107133_bib0036) 2014
References_xml – reference: Hausknecht, M., Stone, P., 2015. Deep reinforcement learning in parameterized action space.
– volume: 529
  start-page: 484
  year: 2016
  ident: bib0035
  article-title: Mastering the game of go with deep neural networks and tree search
  publication-title: Nature
– volume: 28
  start-page: 164
  year: 2015
  end-page: 169
  ident: bib0013
  article-title: Model-based on-line optimization framework for semi-batch polymerization reactors
  publication-title: IFAC-PapersOnLine
– year: 2016
  ident: bib0017
  article-title: Continuous control with deep reinforcement learning
  publication-title: Conference Paper at ICLR 2016
– volume: 23
  start-page: 1306
  year: 2013
  end-page: 1319
  ident: bib0019
  article-title: Multi-stage nonlinear model predictive control applied to a semi-batch polymerization reactor under uncertainty
  publication-title: J. Process Control
– volume: 49
  start-page: 37
  year: 2016
  end-page: 42
  ident: bib0012
  article-title: A robust NMPC scheme for semi-batch polymerization reactors
  publication-title: IFAC-PapersOnLine
– volume: 87
  start-page: 166
  year: 2020
  end-page: 178
  ident: bib0014
  article-title: A model-based deep reinforcement learning method applied to finite-horizon optimal control of nonlinear control-affine system
  publication-title: J. Process Control
– volume: 21
  start-page: 1341
  year: 2011
  end-page: 1353
  ident: bib0023
  article-title: Tube-based robust nonlinear model predictive control
  publication-title: Int. J. Robust Nonlinear Control
– year: 2017
  ident: bib0027
  article-title: Automatic differentiation in PyTorch
  publication-title: NIPS 2017 Workshop Autodiff Submission
– volume: 13
  start-page: 786
  year: 2005
  end-page: 790
  ident: bib0029
  article-title: Optimal control of a fed-batch bioreactor using simulation-based approximate dynamic programming
  publication-title: IEEE Trans. Control Syst. Technol.
– volume: 72
  start-page: 39
  year: 2018
  end-page: 51
  ident: bib0040
  article-title: Dual robust nonlinear model predictive control: amulti-stage approach
  publication-title: J. Process Control
– start-page: 369
  year: 2000
  end-page: 392
  ident: bib0032
  article-title: An overview of nonlinear model predictive control applications
  publication-title: Nonlinear Model Predictive Control
– volume: 37
  start-page: 966
  year: 1998
  end-page: 981
  ident: bib0002
  article-title: Dynamic optimization in a discontinuous world
  publication-title: Ind. Eng. Chem. Res.
– volume: 59
  start-page: 2515
  year: 2013
  end-page: 2529
  ident: bib0026
  article-title: Reactor modeling and recipe optimization of polyether polyol processes: polypropylene glycol
  publication-title: AlChE J.
– start-page: 1641
  year: 2019
  end-page: 1647
  ident: bib0034
  article-title: A constraint-tightening approach to nonlinear model predictive control with chance constraints for stochastic systems
  publication-title: 2019 American Control Conference (ACC)
– start-page: 1057
  year: 2000
  end-page: 1063
  ident: bib0039
  article-title: Policy gradient methods for reinforcement learning with function approximation
  publication-title: Advances in Neural Information Processing Systems
– volume: 46
  start-page: 1043
  year: 2007
  end-page: 1053
  ident: bib0004
  article-title: An overview of simultaneous strategies for dynamic optimization
  publication-title: Chem. Eng. Process.
– volume: 21
  year: 1997
  ident: bib0045
  article-title: Neuro-fuzzy modeling and control of a batch process involving simultaneous reaction and distillation
  publication-title: Comput. Chem. Eng.
– volume: 3
  start-page: 219
  year: 2011
  end-page: 260
  ident: bib0010
  article-title: Pyomo: modeling and solving mathematical programs in Python
  publication-title: Math. Program. Comput.
– volume: 518
  start-page: 529
  year: 2015
  end-page: 533
  ident: bib0024
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
– volume: vol. 67
  year: 2017
  ident: bib0009
  article-title: Pyomo–optimization modeling in Python
– start-page: 1075
  year: 1997
  end-page: 1081
  ident: bib0042
  article-title: Analysis of temporal-difference learning with function approximation
  publication-title: Advances in Neural Information Processing Systems
– volume: 24
  start-page: 1156
  year: 2014
  end-page: 1178
  ident: bib0007
  article-title: A tutorial review of economic model predictive control methods
  publication-title: J. Process Control
– volume: 46
  start-page: 1043
  year: 2007
  end-page: 1053
  ident: bib0003
  article-title: An overview of simultaneous strategies for dynamic optimization
  publication-title: Chem. Eng. Process.
– volume: 33
  start-page: 763
  year: 1997
  end-page: 781
  ident: bib0015
  article-title: Worst-case formulations of model predictive control for systems with bounded parameters
  publication-title: Automatica
– start-page: 23
  year: 2000
  end-page: 44
  ident: bib0022
  article-title: Nonlinear model predictive control: challenges and opportunities
  publication-title: Nonlinear Model Predictive Control
– volume: 133
  start-page: 106649
  year: 2020
  ident: bib0030
  article-title: Reinforcement learning for batch bioprocess optimization
  publication-title: Comput. Chem. Eng.
– volume: vol. 4
  start-page: 2587
  year: 2018
  end-page: 2601
  ident: bib0008
  article-title: Addressing function approximation error in actor-critic methods
  publication-title: 35th International Conference on Machine Learning, ICML 2018
– volume: 68
  start-page: 139
  year: 2015
  end-page: 160
  ident: bib0021
  article-title: Method of moments: a versatile tool for deterministic modeling of polymerization kinetics
  publication-title: Eur. Polym. J.
– year: 2018
  ident: bib0038
  article-title: Reinforcement Learning: An Introduction
– volume: 41
  start-page: 1281
  year: 2005
  end-page: 1288
  ident: bib0016
  article-title: Approximate dynamic programming-based approaches for input-output data-driven control of nonlinear processes
  publication-title: Automatica
– volume: 36
  start-page: 823
  year: 1930
  ident: bib0043
  article-title: On the theory of the brownian motion
  publication-title: Phys. Rev.
– year: 2014
  ident: bib0036
  article-title: Deterministic policy gradient algorithms
– reference: .
– start-page: 119
  year: 2009
  end-page: 138
  ident: bib0033
  article-title: Optimizing process economic performance using model predictive control
  publication-title: Nonlinear Model Predictive Control
– volume: 65
  start-page: e16689
  year: 2019
  ident: bib0037
  article-title: Toward self-driving processes: a deep reinforcement learning approach to control
  publication-title: AlChE J.
– volume: 51
  start-page: 523
  year: 2018
  end-page: 534
  ident: bib0028
  article-title: Nonlinear model predictive control with explicit backoffs for stochastic systems under arbitrary uncertainty
  publication-title: IFAC-PapersOnLine
– year: 2001
  ident: bib0005
  article-title: Dynamic Optimization in the Batch Chemical Industry
  publication-title: Technical Report
– volume: 42
  start-page: 137
  year: 2016
  end-page: 149
  ident: bib0006
  article-title: Nonlinear model predictive control of fed-batch fermentations using dynamic flux balance models
  publication-title: J. Process Control
– year: 2000
  ident: bib0041
  article-title: Analysis of temporal-difference learning with function approximation
  publication-title: J. Adv. Neural Inf.Process. Syst.
– volume: 24
  start-page: 1247
  year: 2014
  end-page: 1259
  ident: bib0018
  article-title: Handling uncertainty in economic nonlinear model predictive control: a comparative case study
  publication-title: J. Process Control
– volume: 23
  start-page: 667
  year: 1999
  end-page: 682
  ident: bib0025
  article-title: Model predictive control: past, present and future
  publication-title: Comput. Chem. Eng.
– year: 2014
  ident: bib0031
  article-title: Markov Decision Processes: Discrete Stochastic Dynamic Programming
– volume: 76
  start-page: 711
  year: 1998
  end-page: 722
  ident: bib0020
  article-title: Learning to control the performance of batch processes
  publication-title: Chem. Eng. Res. Des.
– volume: 10
  start-page: 351
  year: 2000
  end-page: 362
  ident: bib0001
  article-title: Productivity optimization of an industrial semi-batch polymerization reactor under safety constraints
  publication-title: J. Process Control
– volume: 106
  start-page: 25
  year: 2006
  end-page: 57
  ident: bib0044
  article-title: On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming
  publication-title: Math. Program.
– start-page: 1057
  year: 2000
  ident: 10.1016/j.compchemeng.2020.107133_bib0039
  article-title: Policy gradient methods for reinforcement learning with function approximation
– volume: vol. 4
  start-page: 2587
  year: 2018
  ident: 10.1016/j.compchemeng.2020.107133_bib0008
  article-title: Addressing function approximation error in actor-critic methods
– volume: 49
  start-page: 37
  issue: 7
  year: 2016
  ident: 10.1016/j.compchemeng.2020.107133_bib0012
  article-title: A robust NMPC scheme for semi-batch polymerization reactors
  publication-title: IFAC-PapersOnLine
  doi: 10.1016/j.ifacol.2016.07.213
– volume: 23
  start-page: 667
  issue: 4–5
  year: 1999
  ident: 10.1016/j.compchemeng.2020.107133_bib0025
  article-title: Model predictive control: past, present and future
  publication-title: Comput. Chem. Eng.
  doi: 10.1016/S0098-1354(98)00301-9
– volume: 133
  start-page: 106649
  year: 2020
  ident: 10.1016/j.compchemeng.2020.107133_bib0030
  article-title: Reinforcement learning for batch bioprocess optimization
  publication-title: Comput. Chem. Eng.
  doi: 10.1016/j.compchemeng.2019.106649
– year: 2014
  ident: 10.1016/j.compchemeng.2020.107133_bib0036
– volume: 46
  start-page: 1043
  issue: 11
  year: 2007
  ident: 10.1016/j.compchemeng.2020.107133_bib0004
  article-title: An overview of simultaneous strategies for dynamic optimization
  publication-title: Chem. Eng. Process.
  doi: 10.1016/j.cep.2006.06.021
– volume: 24
  start-page: 1156
  issue: 8
  year: 2014
  ident: 10.1016/j.compchemeng.2020.107133_bib0007
  article-title: A tutorial review of economic model predictive control methods
  publication-title: J. Process Control
  doi: 10.1016/j.jprocont.2014.03.010
– start-page: 119
  year: 2009
  ident: 10.1016/j.compchemeng.2020.107133_bib0033
  article-title: Optimizing process economic performance using model predictive control
– year: 2017
  ident: 10.1016/j.compchemeng.2020.107133_bib0027
  article-title: Automatic differentiation in PyTorch
– volume: 529
  start-page: 484
  issue: 7587
  year: 2016
  ident: 10.1016/j.compchemeng.2020.107133_bib0035
  article-title: Mastering the game of go with deep neural networks and tree search
  publication-title: Nature
  doi: 10.1038/nature16961
– start-page: 369
  year: 2000
  ident: 10.1016/j.compchemeng.2020.107133_bib0032
  article-title: An overview of nonlinear model predictive control applications
– volume: 36
  start-page: 823
  issue: 5
  year: 1930
  ident: 10.1016/j.compchemeng.2020.107133_bib0043
  article-title: On the theory of the brownian motion
  publication-title: Phys. Rev.
  doi: 10.1103/PhysRev.36.823
– volume: 3
  start-page: 219
  issue: 3
  year: 2011
  ident: 10.1016/j.compchemeng.2020.107133_bib0010
  article-title: Pyomo: modeling and solving mathematical programs in Python
  publication-title: Math. Program. Comput.
  doi: 10.1007/s12532-011-0026-8
– start-page: 1641
  year: 2019
  ident: 10.1016/j.compchemeng.2020.107133_bib0034
  article-title: A constraint-tightening approach to nonlinear model predictive control with chance constraints for stochastic systems
– volume: 72
  start-page: 39
  year: 2018
  ident: 10.1016/j.compchemeng.2020.107133_bib0040
  article-title: Dual robust nonlinear model predictive control: amulti-stage approach
  publication-title: J. Process Control
  doi: 10.1016/j.jprocont.2018.10.003
– volume: 23
  start-page: 1306
  issue: 9
  year: 2013
  ident: 10.1016/j.compchemeng.2020.107133_bib0019
  article-title: Multi-stage nonlinear model predictive control applied to a semi-batch polymerization reactor under uncertainty
  publication-title: J. Process Control
  doi: 10.1016/j.jprocont.2013.08.008
– volume: 76
  start-page: 711
  issue: 6 A6
  year: 1998
  ident: 10.1016/j.compchemeng.2020.107133_bib0020
  article-title: Learning to control the performance of batch processes
  publication-title: Chem. Eng. Res. Des.
  doi: 10.1205/026387698525414
– ident: 10.1016/j.compchemeng.2020.107133_bib0011
– volume: 24
  start-page: 1247
  issue: 8
  year: 2014
  ident: 10.1016/j.compchemeng.2020.107133_bib0018
  article-title: Handling uncertainty in economic nonlinear model predictive control: a comparative case study
  publication-title: J. Process Control
  doi: 10.1016/j.jprocont.2014.05.008
– volume: 37
  start-page: 966
  issue: 3
  year: 1998
  ident: 10.1016/j.compchemeng.2020.107133_bib0002
  article-title: Dynamic optimization in a discontinuous world
  publication-title: Ind. Eng. Chem. Res.
  doi: 10.1021/ie970738y
– volume: 59
  start-page: 2515
  issue: 7
  year: 2013
  ident: 10.1016/j.compchemeng.2020.107133_bib0026
  article-title: Reactor modeling and recipe optimization of polyether polyol processes: polypropylene glycol
  publication-title: AlChE J.
  doi: 10.1002/aic.14144
– volume: 518
  start-page: 529
  issue: 7540
  year: 2015
  ident: 10.1016/j.compchemeng.2020.107133_bib0024
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
  doi: 10.1038/nature14236
– volume: 65
  start-page: e16689
  issue: 10
  year: 2019
  ident: 10.1016/j.compchemeng.2020.107133_bib0037
  article-title: Toward self-driving processes: a deep reinforcement learning approach to control
  publication-title: AlChE J.
  doi: 10.1002/aic.16689
– volume: 28
  start-page: 164
  issue: 8
  year: 2015
  ident: 10.1016/j.compchemeng.2020.107133_bib0013
  article-title: Model-based on-line optimization framework for semi-batch polymerization reactors
  publication-title: IFAC-PapersOnLine
  doi: 10.1016/j.ifacol.2015.08.175
– volume: 42
  start-page: 137
  year: 2016
  ident: 10.1016/j.compchemeng.2020.107133_bib0006
  article-title: Nonlinear model predictive control of fed-batch fermentations using dynamic flux balance models
  publication-title: J. Process Control
  doi: 10.1016/j.jprocont.2016.04.012
– start-page: 1075
  year: 1997
  ident: 10.1016/j.compchemeng.2020.107133_bib0042
  article-title: Analysis of temporal-difference learning with function approximation
– volume: 41
  start-page: 1281
  issue: 7
  year: 2005
  ident: 10.1016/j.compchemeng.2020.107133_bib0016
  article-title: Approximate dynamic programming-based approaches for input-output data-driven control of nonlinear processes
  publication-title: Automatica
  doi: 10.1016/j.automatica.2005.02.006
– volume: 21
  start-page: 1341
  issue: 11
  year: 2011
  ident: 10.1016/j.compchemeng.2020.107133_bib0023
  article-title: Tube-based robust nonlinear model predictive control
  publication-title: Int. J. Robust Nonlinear Control
  doi: 10.1002/rnc.1758
– issue: 1988
  year: 2000
  ident: 10.1016/j.compchemeng.2020.107133_bib0041
  article-title: Analysis of temporal-difference learning with function approximation
  publication-title: J. Adv. Neural Inf. Process. Syst.
– year: 2016
  ident: 10.1016/j.compchemeng.2020.107133_bib0017
  article-title: Continuous control with deep reinforcement learning
– volume: 51
  start-page: 523
  issue: 20
  year: 2018
  ident: 10.1016/j.compchemeng.2020.107133_bib0028
  article-title: Nonlinear model predictive control with explicit backoffs for stochastic systems under arbitrary uncertainty
  publication-title: IFAC-PapersOnLine
  doi: 10.1016/j.ifacol.2018.11.036
– volume: 13
  start-page: 786
  issue: 5
  year: 2005
  ident: 10.1016/j.compchemeng.2020.107133_bib0029
  article-title: Optimal control of a fed-batch bioreactor using simulation-based approximate dynamic programming
  publication-title: IEEE Trans. Control Syst. Technol.
  doi: 10.1109/TCST.2005.852105
– volume: 10
  start-page: 351
  issue: 4
  year: 2000
  ident: 10.1016/j.compchemeng.2020.107133_bib0001
  article-title: Productivity optimization of an industrial semi-batch polymerization reactor under safety constraints
  publication-title: J. Process Control
  doi: 10.1016/S0959-1524(99)00049-9
– year: 2001
  ident: 10.1016/j.compchemeng.2020.107133_bib0005
  article-title: Dynamic Optimization in the Batch Chemical Industry
– volume: 46
  start-page: 1043
  issue: 11
  year: 2007
  ident: 10.1016/j.compchemeng.2020.107133_bib0003
  article-title: An overview of simultaneous strategies for dynamic optimization
  publication-title: Chem. Eng. Process.
  doi: 10.1016/j.cep.2006.06.021
– start-page: 23
  year: 2000
  ident: 10.1016/j.compchemeng.2020.107133_bib0022
  article-title: Nonlinear model predictive control: challenges and opportunities
– volume: 21
  issue: SUPPL.1
  year: 1997
  ident: 10.1016/j.compchemeng.2020.107133_bib0045
  article-title: Neuro-fuzzy modeling and control of a batch process involving simultaneous reaction and distillation
  publication-title: Comput. Chem. Eng.
– volume: 87
  start-page: 166
  year: 2020
  ident: 10.1016/j.compchemeng.2020.107133_bib0014
  article-title: A model-based deep reinforcement learning method applied to finite-horizon optimal control of nonlinear control-affine system
  publication-title: J. Process Control
  doi: 10.1016/j.jprocont.2020.02.003
– volume: 106
  start-page: 25
  issue: 1
  year: 2006
  ident: 10.1016/j.compchemeng.2020.107133_bib0044
  article-title: On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming
  publication-title: Math. Program.
  doi: 10.1007/s10107-004-0559-y
– volume: 67
  year: 2017
  ident: 10.1016/j.compchemeng.2020.107133_bib0009
– year: 2014
  ident: 10.1016/j.compchemeng.2020.107133_bib0031
– year: 2018
  ident: 10.1016/j.compchemeng.2020.107133_bib0038
– volume: 68
  start-page: 139
  year: 2015
  ident: 10.1016/j.compchemeng.2020.107133_bib0021
  article-title: Method of moments: a versatile tool for deterministic modeling of polymerization kinetics
  publication-title: Eur. Polym. J.
  doi: 10.1016/j.eurpolymj.2015.04.018
– volume: 33
  start-page: 763
  issue: 5
  year: 1997
  ident: 10.1016/j.compchemeng.2020.107133_bib0015
  article-title: Worst-case formulations of model predictive control for systems with bounded parameters
  publication-title: Automatica
  doi: 10.1016/S0005-1098(96)00255-5
StartPage 107133
SubjectTerms Actor-Critic
Batch process
Optimal control
Reinforcement learning
Title Reinforcement learning based optimal control of batch processes using Monte-Carlo deep deterministic policy gradient with phase segmentation
URI https://dx.doi.org/10.1016/j.compchemeng.2020.107133
Volume 144