Stochastic optimization of multireservoir systems via reinforcement learning

Detailed bibliography
Published in: Water Resources Research, Volume 43, Issue 11
Main authors: Lee, J.H., Labadie, J.W.
Format: Journal Article
Language: English
Published: Blackwell Publishing Ltd, 01.11.2007
ISSN: 0043-1397, 1944-7973
Description
Summary: Although several variants of stochastic dynamic programming have been applied to optimal operation of multireservoir systems, they have been plagued by a high‐dimensional state space and the inability to accurately incorporate the stochastic environment as characterized by temporally and spatially correlated hydrologic inflows. Reinforcement learning has emerged as an effective approach to solving sequential decision problems by combining concepts from artificial intelligence, cognitive science, and operations research. A reinforcement learning system has a mathematical foundation similar to dynamic programming and Markov decision processes, with the goal of maximizing the long‐term reward or return conditioned on the state of the system environment and the immediate reward obtained from operational decisions. Reinforcement learning can include Monte Carlo simulation, where transition probabilities and rewards are not explicitly known a priori. The Q‐Learning method in reinforcement learning is demonstrated on the two‐reservoir Geum River system, South Korea, and is shown to outperform implicit stochastic dynamic programming and sampling stochastic dynamic programming methods.
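To make the Q‐Learning idea in the abstract concrete, the sketch below applies tabular Q-learning to a toy single-reservoir release problem. This is a minimal illustration, not the paper's implementation: the discretization sizes, learning parameters, inflow model, and reward function are all hypothetical. It only shows the key point from the abstract, namely that the value function is learned from sampled transitions rather than from explicitly known transition probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discretization (hypothetical, not from the paper):
# storage levels are states, release volumes are actions.
N_STORAGE = 10          # discretized storage levels (states)
N_RELEASE = 4           # discretized release decisions (actions)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1  # learning rate, discount, exploration

Q = np.zeros((N_STORAGE, N_RELEASE))

def step(storage, release):
    """Simulated environment: stochastic inflow, mass balance, reward.
    Transition probabilities are never written down explicitly; the
    agent only samples them, as in Monte Carlo-style RL."""
    inflow = rng.integers(0, 3)  # random hydrologic inflow (toy model)
    next_storage = int(np.clip(storage + inflow - release, 0, N_STORAGE - 1))
    # Illustrative reward: release benefit minus a penalty for deficits.
    reward = float(release) - 5.0 * (storage - release < 0)
    return next_storage, reward

storage = N_STORAGE // 2
for t in range(50_000):
    # Epsilon-greedy action selection.
    if rng.random() < EPS:
        release = int(rng.integers(N_RELEASE))
    else:
        release = int(np.argmax(Q[storage]))
    next_storage, reward = step(storage, release)
    # Q-learning update:
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[storage, release] += ALPHA * (
        reward + GAMMA * Q[next_storage].max() - Q[storage, release]
    )
    storage = next_storage

policy = Q.argmax(axis=1)  # greedy release decision per storage level
print(policy)
```

In a multireservoir setting such as the Geum River system studied in the paper, the state would combine the storages of the reservoirs (and possibly hydrologic information), which is precisely where the dimensionality and correlated-inflow issues discussed in the abstract arise.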
DOI: 10.1029/2006WR005627