Reinforcement learning with dynamic convex risk measures

Bibliographic details
Published in:	Mathematical Finance, Vol. 34, No. 2, pp. 557-587
Main authors:	Coache, Anthony; Jaimungal, Sebastian
Format:	Journal Article
Language:	English
Publication details:	Oxford: Blackwell Publishing Ltd, 1 April 2024
Subjects:	Algorithms; Arbitrage; Dynamic programming; Flexibility; Hedging; Learning; Machine learning; Neural networks; Obstacle avoidance; Optimization; Policies; Random variables; Reinforcement; Risk; Risk assessment; Robot control
ISSN:	0960-1627, 1467-9965

Description
Summary:	We develop an approach for solving time‐consistent risk‐sensitive stochastic optimization problems using model‐free reinforcement learning (RL). Specifically, we assume agents assess the risk of a sequence of random variables using dynamic convex risk measures. We employ a time‐consistent dynamic programming principle to determine the value of a particular policy, and develop policy gradient update rules that aid in obtaining optimal policies. We further develop an actor–critic style algorithm using neural networks to optimize over policies. Finally, we demonstrate the performance and flexibility of our approach by applying it to three optimization problems: statistical arbitrage trading strategies, financial hedging, and obstacle avoidance robot control.
DOI:	10.1111/mafi.12388
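
Illustrative note:	The summary refers to evaluating a fixed policy with a time‐consistent dynamic programming principle under a dynamic convex risk measure. The sketch below is not the authors' model‐free actor–critic algorithm. It is a minimal model-based toy that assumes a known transition matrix `P` and cost matrix `cost` (both hypothetical), and uses CVaR as one example of a one-step convex risk measure. The function names `cvar` and `dynamic_risk` are illustrative only.

```python
import numpy as np


def cvar(values, probs, alpha):
    """CVaR at level alpha of a discrete cost distribution (larger = worse).

    Uses the Rockafellar-Uryasev representation
        CVaR_alpha(X) = min_z { z + E[(X - z)^+] / (1 - alpha) },
    whose minimum over z is attained at one of the atoms of X.
    Requires 0 <= alpha < 1.
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    candidates = [
        z + probs @ np.maximum(values - z, 0.0) / (1.0 - alpha)
        for z in values
    ]
    return min(candidates)


def dynamic_risk(P, cost, horizon, alpha):
    """Backward recursion for a fixed policy on a finite Markov chain.

    V_T(s) = 0 and, for each earlier period,
        V_t(s) = CVaR_alpha( cost[s, s'] + V_{t+1}(s') )  with s' ~ P[s].
    Nesting the one-step risk measure in this way is what makes the
    resulting multi-period risk evaluation time-consistent.
    """
    n_states = P.shape[0]
    V = np.zeros(n_states)  # terminal value
    for _ in range(horizon):
        V = np.array(
            [cvar(cost[s] + V, P[s], alpha) for s in range(n_states)]
        )
    return V
```

With `alpha = 0` the one-step measure reduces to the expectation, so `dynamic_risk` returns the ordinary expected cumulative cost; raising `alpha` toward 1 weights the worst outcomes more heavily at every step of the recursion.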