How good are learning-based control v.s. model-based control for load shifting? Investigations on a single zone building energy system
| Published in: | Energy (Oxford) Vol. 273; no. C; p. 127073 |
|---|---|
| Main Authors: | |
| Format: | Journal Article |
| Language: | English |
| Published: | United Kingdom: Elsevier Ltd, 01.06.2023 |
| Subjects: | |
| ISSN: | 0360-5442 |
| Summary: | Both model predictive control (MPC) and deep reinforcement learning control (DRL) have been presented as ways to approximate the true optimality of a dynamic programming problem, and both have shown significant operational cost-saving potential for building energy systems. However, there is still a lack of in-depth quantitative studies on how closely they approximate the true optimality, especially in the building energy domain. To fill this gap, this paper provides a numerical framework for evaluating the optimality levels of different controllers for building energy systems. The framework is then used to comprehensively compare the optimal control performance of MPC and DRL controllers under given computation budgets for a single-zone fan coil unit system. Note that the optimality is estimated based on a user-specified selection of trade-off weights among energy costs, thermal comfort and control slew rates. Compared with the best optimality found through expensive optimization simulations, the best DRL agent can approximate the optimality by up to 96.54%, outperforming the best MPC, whose optimality level is 90.11%. However, due to stochasticity, a DRL agent is only expected to approximate the optimality by 90.42%, which is almost equivalent to the best MPC. Except for Proximal Policy Optimization (PPO), all DRL agents can approximate the optimality better than the best MPC, and are expected to approximate it better than an MPC with a prediction horizon of 32 steps (15 min per step). In terms of reducing energy cost and thermal discomfort, MPC can outperform rule-based control (RBC) by 18.47%–25.44%. DRL can be expected to outperform RBC by 18.95%–25.65%, and the best DRL control policy can outperform RBC by 20.29%–29.72%. Although the comparison of optimality levels is performed in an idealized setting, e.g., MPC assumes perfect models and DRL assumes a perfect offline training process and online deployment process, it sheds light on their capability to approximate the original dynamic programming problem. |
|---|---|
| Highlights: | • Learning- and model-based controllers are evaluated in a containerized environment. • Both controllers can approximate the true optimality by up to 90%–96%. • Most DRL controllers have performance equivalent to the best MPC controller. • DRL controllers can outperform the best MPC controller by up to 6%. |
| Bibliography: | USDOE EE0009150 |
| DOI: | 10.1016/j.energy.2023.127073 |
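The summary states that each controller's optimality level is scored against the best result found through expensive optimization simulations, using a user-specified selection of trade-off weights among energy costs, thermal comfort and control slew rates. The paper's exact objective is not reproduced in this record; the sketch below is a minimal, hypothetical illustration of such a weighted score, in which the function names, the placeholder weights, the example trajectories, and the ratio-based definition of "optimality level" are all assumptions for illustration rather than the authors' implementation.

```python
def weighted_operational_cost(energy_cost, discomfort, control_slew,
                              w_energy=1.0, w_comfort=1.0, w_slew=1.0):
    """Collapse per-step trajectories into one scalar operating cost.

    energy_cost, discomfort and control_slew are per-step sequences
    (e.g. electricity cost, temperature-bound violation, and change in
    the fan coil control signal |u_t - u_{t-1}|). The weights stand in
    for the user-specified trade-offs mentioned in the summary; the
    default values here are placeholders, not the paper's settings.
    """
    return (w_energy * sum(energy_cost)
            + w_comfort * sum(discomfort)
            + w_slew * sum(control_slew))


def optimality_level(controller_cost, best_known_cost):
    """One plausible definition of the 'optimality level': the ratio of
    the best cost found by expensive offline optimization to the
    controller's cost, so a perfect controller scores 1.0 (100%)."""
    return best_known_cost / controller_cost


# Hypothetical usage: 96 steps corresponds to one day at the 15-minute
# step size quoted in the summary. All numbers are illustrative only.
best = weighted_operational_cost([0.90] * 96, [0.020] * 96, [0.010] * 96)
mpc = weighted_operational_cost([1.00] * 96, [0.030] * 96, [0.020] * 96)
drl = weighted_operational_cost([0.95] * 96, [0.025] * 96, [0.015] * 96)
print(f"MPC optimality level: {optimality_level(mpc, best):.2%}")
print(f"DRL optimality level: {optimality_level(drl, best):.2%}")
```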