Deep reinforcement learning with shallow controllers: An experimental application to PID tuning

Bibliographic details
Published in: Control engineering practice, Vol. 121, p. 105046
Main authors: Lawrence, Nathan P.; Forbes, Michael G.; Loewen, Philip D.; McClement, Daniel G.; Backström, Johan U.; Gopaluni, R. Bhushan
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 April 2022
ISSN: 0967-0661, 1873-6939
Description
Summary: Deep reinforcement learning (RL) is an optimization-driven framework for producing control strategies for general dynamical systems without explicit reliance on process models. Good results have been reported in simulation. Here we demonstrate the challenges in implementing a state-of-the-art deep RL algorithm on a real physical system. Aspects include the interplay between software and existing hardware; experiment design and sample efficiency; training subject to input constraints; and interpretability of the algorithm and control law. At the core of our approach is the use of a PID controller as the trainable RL policy. In addition to its simplicity, this approach has several appealing features: no additional hardware needs to be added to the control system, since a PID controller can easily be implemented through a standard programmable logic controller; the control law can easily be initialized in a “safe” region of the parameter space; and the final product, a well-tuned PID controller, has a form that practitioners can reason about and deploy with confidence.

• Reinforcement learning (RL) is used to tune a real-world PID controller.
• The RL policy is a PID controller, for compatibility with many current systems.
• Good tuning is achieved in roughly 40 min of training time.
• Full implementation details and thorough lab results are presented.
• A multi-criterion scorecard compares RL with several known auto-tuning methods.
DOI: 10.1016/j.conengprac.2021.105046
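
Illustrative sketch (not part of the catalog record or the article): the summary describes using a PID controller as the trainable RL policy, with training subject to input constraints. A minimal way to picture this is a PID law whose three gains are the only policy parameters. Everything below is an assumption made for illustration: the names `PIDPolicy` and `rollout_cost`, the discrete-time form, the clamping-style anti-windup, and the toy first-order plant. It is not the authors' implementation, and no RL update rule is shown.

```python
from dataclasses import dataclass


@dataclass
class PIDPolicy:
    """Discrete PID law; kp, ki, kd play the role of trainable policy parameters."""
    kp: float
    ki: float
    kd: float
    dt: float              # sample time in seconds (assumed fixed)
    u_min: float           # lower input constraint
    u_max: float           # upper input constraint
    integral: float = 0.0  # running integral of the tracking error
    prev_error: float = 0.0

    def act(self, setpoint: float, measurement: float) -> float:
        """Map the current tracking error to a saturated control input."""
        error = setpoint - measurement
        derivative = (error - self.prev_error) / self.dt
        candidate_integral = self.integral + error * self.dt
        u = self.kp * error + self.ki * candidate_integral + self.kd * derivative
        u_sat = min(max(u, self.u_min), self.u_max)
        # Assumed anti-windup: keep the new integral only when the input is unsaturated.
        if u == u_sat:
            self.integral = candidate_integral
        self.prev_error = error
        return u_sat


def rollout_cost(policy: PIDPolicy, setpoint: float = 0.5, steps: int = 200) -> float:
    """Integrated squared tracking error on a toy plant dy/dt = -y + u (hypothetical)."""
    y = 0.0
    cost = 0.0
    for _ in range(steps):
        u = policy.act(setpoint, y)
        y += policy.dt * (-y + u)  # explicit Euler step of the toy plant
        cost += (setpoint - y) ** 2 * policy.dt
    return cost
```

In this picture, a learning algorithm would adjust `kp`, `ki` and `kd` to lower a cost such as the one returned by `rollout_cost`. Starting from conservative gains corresponds to initializing the policy in a “safe” region of the parameter space, and the trained result is still an ordinary PID controller.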