Bandit-based Variable Fixing for Binary Optimization on GPU Parallel Computing

Bibliographic details
Published in:	Proceedings - Euromicro Workshop on Parallel and Distributed Processing, pp. 154-158
Main author:	Yasudo, Ryota
Format:	Conference paper
Language:	English
Published:	IEEE, 1 March 2023
Subjects:	decision making; GPGPU; Graphics processing units; Linear programming; Metaheuristics; multi-armed bandit problem; Parallel algorithms; quadratic unconstrained binary optimization; Reinforcement learning; Search problems
ISSN:	2377-5750

Description
Summary:	This paper explores whether reinforcement learning is capable of enhancing metaheuristics for quadratic unconstrained binary optimization (QUBO), which have recently attracted attention as solvers for a wide range of combinatorial optimization problems. In particular, we introduce a novel approach called bandit-based variable fixing (BVF). The key idea behind BVF is to regard an execution of an arbitrary metaheuristic with a variable fixed as a play of a slot machine. Thus, BVF explores variables to fix with the maximum expected reward, and executes a metaheuristic at the same time. The bandit-based approach is then extended to fix multiple variables. To accelerate solving the multi-armed bandit problem, we implement a parallel algorithm for BVF on a GPU. Our results suggest that the proposed BVF enhances the original metaheuristics.
DOI:	10.1109/PDP59025.2023.00031
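
Illustrative sketch (not taken from the paper): the summary describes each run of a metaheuristic with one variable fixed as a play of a slot machine. The Python code below is one possible reading of that idea, under several assumptions that the record does not confirm: each arm is a (variable, value) pair, the bandit policy is UCB1, the metaheuristic is a plain 1-flip descent used as a stand-in, and the reward is the energy rescaled to [0, 1]. All function names are hypothetical. It fixes a single variable per play and runs sequentially, so it does not reproduce the multi-variable extension or the GPU parallelization mentioned in the summary.

```python
import math
import random


def energy(Q, x):
    """QUBO objective x^T Q x for a binary list x (to be minimized)."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def local_search(Q, fixed_var, fixed_val, rng):
    """Stand-in metaheuristic (assumption): 1-flip descent from a random
    start that never flips the fixed variable."""
    n = len(Q)
    x = [rng.randint(0, 1) for _ in range(n)]
    x[fixed_var] = fixed_val
    best = energy(Q, x)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            if i == fixed_var:
                continue
            x[i] ^= 1  # tentatively flip bit i
            e = energy(Q, x)
            if e < best:
                best, improved = e, True
            else:
                x[i] ^= 1  # undo a non-improving flip
    return best, x


def bandit_variable_fixing(Q, plays=200, seed=0):
    """UCB1 over arms (variable, value); one play = one local search."""
    rng = random.Random(seed)
    n = len(Q)
    arms = [(i, v) for i in range(n) for v in (0, 1)]
    # Crude energy bounds, used only to rescale rewards into [0, 1].
    lo = sum(min(q, 0) for row in Q for q in row)
    hi = sum(max(q, 0) for row in Q for q in row)
    span = (hi - lo) or 1.0
    counts = [0] * len(arms)
    totals = [0.0] * len(arms)
    best_e, best_x = float("inf"), None
    for t in range(1, plays + 1):
        if t <= len(arms):
            a = t - 1  # play every arm once before using UCB1
        else:
            a = max(
                range(len(arms)),
                key=lambda k: totals[k] / counts[k]
                + math.sqrt(2.0 * math.log(t) / counts[k]),
            )
        var, val = arms[a]
        e, x = local_search(Q, var, val, rng)
        counts[a] += 1
        totals[a] += (hi - e) / span  # lower energy gives higher reward
        if e < best_e:
            best_e, best_x = e, x
    return best_e, best_x, counts
```

Calling `bandit_variable_fixing(Q)` with a square matrix `Q` given as a list of lists returns the best energy found, the corresponding assignment, and the number of plays per arm. The play counts show which fixings the policy came to favor.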