Contrastive Initial State Buffer for Reinforcement Learning

Detailed bibliography
Title: Contrastive Initial State Buffer for Reinforcement Learning
Authors: Messikommer, Nico; Song, Yunlong; Scaramuzza, Davide
Contributors: University of Zurich; Messikommer, Nico
Source: 2024 IEEE International Conference on Robotics and Automation (ICRA), pp. 2866–2872
Publication status: Preprint
Publisher: IEEE, 2024
Year of publication: 2024
Subjects: 1712 Software, FOS: Computer and information sciences, 0301 basic medicine, Computer Science - Machine Learning, 0303 health sciences, 03 medical and health sciences, 10009 Department of Informatics, 2208 Electrical and Electronic Engineering, 2207 Control and Systems Engineering, 1702 Artificial Intelligence, 000 Computer science, knowledge & systems, Machine Learning (cs.LG)
Description: In Reinforcement Learning, the trade-off between exploration and exploitation poses a complex challenge for achieving efficient learning from limited samples. While recent works have been effective in leveraging past experiences for policy updates, they often overlook the potential of reusing past experiences for data collection. Independent of the underlying RL algorithm, we introduce the concept of a Contrastive Initial State Buffer, which strategically selects states from past experiences and uses them to initialize the agent in the environment in order to guide it toward more informative states. We validate our approach on two complex robotic tasks without relying on any prior information about the environment: (i) locomotion of a quadruped robot traversing challenging terrains and (ii) a quadcopter drone racing through a track. The experimental results show that our initial state buffer achieves higher task performance than the nominal baseline while also speeding up training convergence.
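The abstract's core idea — storing states from past rollouts and resetting the agent into them instead of always starting from the nominal initial state — can be sketched in a few lines. This is a minimal illustration under assumed names (`InitialStateBuffer`, `add`, `sample_initial_state`, the scoring scheme), not the authors' implementation; in the paper the selection is driven by a learned contrastive representation, whereas here an arbitrary informativeness score stands in for it.

```python
import random


class InitialStateBuffer:
    """Illustrative sketch (not the paper's implementation): keep a bounded
    buffer of previously visited states and sample episode starts from it."""

    def __init__(self, capacity=1000, reset_prob=0.5):
        self.capacity = capacity      # maximum number of stored states
        self.reset_prob = reset_prob  # chance of starting from a buffered state
        self.states = []              # list of (state, score) pairs

    def add(self, state, score):
        """Store a visited state with a positive informativeness score
        (the paper derives such scores contrastively; here it is assumed given)."""
        self.states.append((state, score))
        if len(self.states) > self.capacity:
            # evict the least informative state
            self.states.sort(key=lambda pair: pair[1])
            self.states.pop(0)

    def sample_initial_state(self, default_state):
        """Pick an episode start: with probability `reset_prob`, a buffered
        state sampled proportionally to its score; otherwise the nominal reset."""
        if not self.states or random.random() > self.reset_prob:
            return default_state
        states, scores = zip(*self.states)
        return random.choices(states, weights=scores, k=1)[0]
```

Because the buffer only changes where episodes begin, it is agnostic to the underlying RL algorithm, matching the abstract's claim that the method is independent of the policy-update rule.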
Document type: Article
Other literature type
Conference object
File description: ICRA24_Messikommer.pdf - application/pdf
DOI: 10.1109/icra57147.2024.10610528
DOI: 10.48550/arxiv.2309.09752
DOI: 10.5167/uzh-264842
Access URL: http://arxiv.org/abs/2309.09752
https://www.zora.uzh.ch/id/eprint/264842/
https://doi.org/10.5167/uzh-264842
Rights: STM Policy #29
arXiv Non-Exclusive Distribution
Accession number: edsair.doi.dedup.....562bdb6af2c783075267c8461130207d
Database: OpenAIRE