DARL: Distributed Reconfigurable Accelerator for Hyperdimensional Reinforcement Learning

Bibliographic details
Published in: 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD), pp. 1-9
Main authors: Chen, Hanning; Issa, Mariam; Ni, Yang; Imani, Mohsen
Format: Conference paper
Language: English
Published: ACM, 29 October 2022
ISSN: 1558-2434
Description
Summary: Reinforcement Learning (RL) is a powerful technology for solving decision-making problems such as robotics control. Modern RL algorithms, e.g., Deep Q-Learning, are based on costly and resource-hungry deep neural networks. This motivates us to deploy alternative models for powering RL agents on edge devices. Recently, brain-inspired Hyper-Dimensional Computing (HDC) has been introduced as a promising solution for lightweight and efficient machine learning, particularly for classification. In this work, we develop a novel platform capable of real-time hyperdimensional reinforcement learning. Our heterogeneous CPU-FPGA platform, called DARL, maximizes the FPGA's computing capabilities by applying hardware optimizations to the critical operations of hyperdimensional computing, including a hardware-friendly encoder IP, hypervector chunk fragmentation, and a delayed model update. Aside from hardware innovation, we also extend the platform from basic single-agent RL to support multi-agent distributed learning. We evaluate the effectiveness of our approach on OpenAI Gym tasks. Our results show that the FPGA platform provides on average a 20× speedup compared to current state-of-the-art hyperdimensional RL methods running on an Intel Xeon 6226 CPU. In addition, DARL is around 4.8× faster and 4.2× more energy efficient than the state-of-the-art RL accelerator while ensuring a better or comparable quality of learning.
DOI: 10.1145/3508352.3549437
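
Illustrative sketch: the summary names three operations (hypervector encoding, chunk fragmentation, and a delayed model update) without describing them in this record. The Python below is a generic, software-only illustration of how such ideas are commonly expressed in hyperdimensional Q-value regression; it is not the DARL design. The random-projection cosine encoder, the constants `D`, `CHUNK`, and `DELAY`, and the names `make_encoder`, `chunked_dot`, and `HDQModel` are all assumptions introduced here, and "delayed update" is read as "buffer the updates and apply them every few steps", which may differ from the paper's meaning.

```python
import math
import random

D = 512      # hypervector dimensionality (assumed value)
CHUNK = 64   # chunk width for the fragmented dot product (assumed value)
DELAY = 4    # apply buffered updates every DELAY steps (assumed value)


def make_encoder(state_dim, rng):
    """Build a random-projection encoder: h_i = cos(w_i . s + b_i)."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(state_dim)] for _ in range(D)]
    biases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(D)]

    def encode(state):
        return [math.cos(sum(w * x for w, x in zip(row, state)) + b)
                for row, b in zip(weights, biases)]

    return encode


def chunked_dot(a, b, chunk=CHUNK):
    """Dot product computed one chunk at a time, then reduced."""
    partials = [sum(x * y for x, y in zip(a[i:i + chunk], b[i:i + chunk]))
                for i in range(0, len(a), chunk)]
    return sum(partials)


class HDQModel:
    """One model hypervector per action; Q(s, a) = M_a . H(s) / D."""

    def __init__(self, n_actions, lr=0.1):
        self.models = [[0.0] * D for _ in range(n_actions)]
        self.pending = [[0.0] * D for _ in range(n_actions)]
        self.lr = lr
        self.steps = 0

    def q_value(self, hv, action):
        return chunked_dot(self.models[action], hv) / D

    def update(self, hv, action, target):
        # Accumulate the correction in a buffer instead of writing the model.
        error = target - self.q_value(hv, action)
        buf = self.pending[action]
        for i, h in enumerate(hv):
            buf[i] += self.lr * error * h
        self.steps += 1
        if self.steps % DELAY == 0:
            self.flush()

    def flush(self):
        # Apply all buffered corrections at once and clear the buffer.
        for model, buf in zip(self.models, self.pending):
            for i in range(D):
                model[i] += buf[i]
                buf[i] = 0.0
```

In this sketch, `update` is called with an encoded state, the chosen action, and a regression target such as `reward + gamma * max_a Q(s', a)`. Q-values only change when `flush` runs, so reads between flushes see the older model.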