Deep reinforcement learning based controller with dynamic feature extraction for an industrial Claus process
| Published in: | Journal of the Taiwan Institute of Chemical Engineers, Vol. 146; p. 104779 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 1 May 2023 |
| ISSN: | 1876-1070, 1876-1089 |
| Summary: | • A DRL-based controller integrated with the process dynamics is developed.
• Dynamic features are extracted from historical data using a Seq2seq network.
• The controller is trained by interacting with the Seq2seq model.
• The standard deviation of the controlled variable in the industrial Claus process can be reduced by up to 55% using the proposed DRL-based controller.
The significant time delay between the manipulated and controlled variables introduces challenges in the task of system identification when implementing model predictive control (MPC) for an industrial process. Recently, deep reinforcement learning (DRL) with model-free characteristics has attracted considerable attention from the process control community. However, the model-free assumption in DRL is based on the property of the Markov decision process (MDP), in which all state variables must be observed. This assumption is not true for an industrial process.
In this study, a sequence-to-sequence (Seq2seq) network was employed to build a surrogate model from industrial Claus process data. The hidden state output by the encoder of the Seq2seq network, which represents the dynamic features of the process, is fed to the DRL-based controller to compensate for the partial observability of the real process.
The results show that the standard deviation of the controlled variable, the H₂S to SO₂ concentration ratio in the tail gas, can be reduced by up to 55% using the proposed DRL-based controller compared with the current control strategy. |
|---|---|
| DOI: | 10.1016/j.jtice.2023.104779 |
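
The summary describes feeding the encoder's hidden state to the controller so that the agent sees more than the current measurement. A minimal sketch of that idea follows, under stated assumptions: `TinyEncoder` is a small untrained Elman-style recurrent network standing in for the trained Seq2seq encoder, and `linear_policy` is a placeholder for the DRL actor. All names, dimensions, and weights here are hypothetical illustrations and are not taken from the paper.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Random weight matrix as a list of rows (untrained, illustration only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyEncoder:
    """Elman-style recurrent encoder; a stand-in (assumption) for the Seq2seq encoder."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(n_hidden, n_inputs, rng)
        self.w_rec = make_matrix(n_hidden, n_hidden, rng)
        self.n_hidden = n_hidden

    def encode(self, history):
        """Run over a window of past measurements and return the final hidden state."""
        h = [0.0] * self.n_hidden
        for x in history:
            a = matvec(self.w_in, x)
            b = matvec(self.w_rec, h)
            h = [math.tanh(ai + bi) for ai, bi in zip(a, b)]
        return h


def augmented_state(encoder, history, current_obs):
    """Concatenate the current measurement with the encoder's hidden state."""
    return list(current_obs) + encoder.encode(history)


def linear_policy(weights, state, low=-1.0, high=1.0):
    """Placeholder actor: a clipped linear map from augmented state to one action."""
    u = sum(w * s for w, s in zip(weights, state))
    return max(low, min(high, u))


if __name__ == "__main__":
    encoder = TinyEncoder(n_inputs=2, n_hidden=4)
    # Hypothetical window of 10 past samples with 2 measured variables each.
    history = [[0.1 * t, 1.0 - 0.05 * t] for t in range(10)]
    state = augmented_state(encoder, history, history[-1])
    action = linear_policy([0.5] * len(state), state)
    print(len(state), action)
```

In a setup like the one the summary outlines, the encoder weights would come from fitting the surrogate model to plant data, and the placeholder actor would be replaced by whichever DRL policy network is being trained against that surrogate.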