An Adaptive Q-Learning Algorithm Developed for Agent-Based Computational Modeling of Electricity Market

Bibliographic details
Published in: IEEE transactions on systems, man and cybernetics. Part C, Applications and reviews, Vol. 40, No. 5, pp. 547-556
Main authors: Rahimiyan, Morteza; Mashhadi, Habib Rajabi
Format: Journal Article
Language: English
Published: New York, NY: IEEE, 1 September 2010
Institute of Electrical and Electronics Engineers
ISSN:1094-6977, 1558-2442
Description
Summary: Balancing exploration and exploitation by adapting the Q-learning (QL) parameters to the conditions of a dynamic, uncertain environment has long been a significant topic of interest in reinforcement learning. The peculiarities of the electricity market create such a complex, dynamic economic environment and consequently increase the need for more advanced learning methods. In this economic system, the agent's market power plays a vital role in the bidding decision-making problem. To improve the QL method, the main idea proposed is to adapt its parameters to the market power so as to strike a good balance between exploration and exploitation. To implement this adaptation, and in view of the fuzzy nature of human decision making, a fuzzy system is designed to map each agent's market power into the QL parameters. A fuzzy QL method is thus developed to model the power supplier's strategic bidding behavior in a computational electricity market. In the simulation framework, the QL algorithm selects the power supplier's bidding strategy according to past experience and the values of the parameters, which reflect the human's risk characteristics. Applying the proposed methodology to a power supplier in a multiarea power system shows improved performance compared with QL with fixed parameters.
DOI:10.1109/TSMCC.2010.2044174
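
The idea in the summary, a fuzzy system that maps an agent's market power into the QL parameters, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the article: market power is assumed to be normalized to [0, 1], the three fuzzy sets and the rule table RULES are invented, the adapted parameters are assumed to be a learning rate, a discount factor, and an epsilon-greedy exploration rate, and the function names are hypothetical.

```python
import random

# Hypothetical rule table (NOT from the article): each fuzzy set of market
# power maps to (learning rate alpha, discount factor gamma, exploration
# rate epsilon).
RULES = {
    "low": (0.1, 0.9, 0.05),
    "medium": (0.3, 0.9, 0.20),
    "high": (0.5, 0.9, 0.40),
}


def memberships(market_power):
    """Degrees of membership in low/medium/high for market power in [0, 1]."""
    mp = min(max(market_power, 0.0), 1.0)  # clamp to the assumed range
    low = max(0.0, 1.0 - 2.0 * mp)         # 1 at mp=0, falls to 0 at mp=0.5
    high = max(0.0, 2.0 * mp - 1.0)        # 0 up to mp=0.5, rises to 1 at mp=1
    return {"low": low, "medium": 1.0 - low - high, "high": high}


def fuzzy_params(market_power):
    """Weighted average of the rule outputs (zero-order Sugeno-style)."""
    w = memberships(market_power)
    total = sum(w.values())
    return tuple(
        sum(w[name] * RULES[name][i] for name in RULES) / total
        for i in range(3)
    )


def choose_action(q_row, epsilon, rng=random):
    """Epsilon-greedy selection over one row of the Q-table."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))   # explore: uniform random action
    return max(range(len(q_row)), key=q_row.__getitem__)  # exploit


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard one-step Q-learning update, applied in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

In a simulation loop, an agent would call fuzzy_params with its current market power before each bidding round, then pass the resulting alpha, gamma, and epsilon to choose_action and q_update. A fixed-parameter QL baseline corresponds to skipping fuzzy_params and using constants instead.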