Reinforcement learning enhanced quantum-inspired algorithm for combinatorial optimization

Bibliographic details
Published in: Machine learning: science and technology, Vol. 2, No. 2, pp. 25009 - 25020
Main authors: Beloborodov, Dmitrii; Ulanov, A E; Foerster, Jakob N; Whiteson, Shimon; Lvovsky, A I
Format: Journal Article
Language: English
Published: Bristol: IOP Publishing, 1 June 2021
ISSN: 2632-2153
Description
Abstract: Quantum hardware and quantum-inspired algorithms are becoming increasingly popular for combinatorial optimization. However, these algorithms may require careful hyperparameter tuning for each problem instance. We use a reinforcement learning agent in conjunction with a quantum-inspired algorithm to solve the Ising energy minimization problem, which is equivalent to the Maximum Cut problem. The agent controls the algorithm by tuning one of its parameters with the goal of improving recently seen solutions. We propose a new Rescaled Ranked Reward (R3) method that enables a stable single-player version of self-play training and helps the agent escape local optima. Training on any problem instance can be accelerated by applying transfer learning from an agent trained on randomly generated problems. Our approach allows sampling high-quality solutions to the Ising problem with high probability and outperforms both baseline heuristics and a black-box hyperparameter optimization approach.
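The equivalence between Ising energy minimization and Maximum Cut mentioned in the abstract can be illustrated with a small brute-force check. This is only a sketch and not the authors' method: the sign convention E(s) = sum over i<j of w_ij s_i s_j, the helper names, and the 4-cycle toy graph are all assumptions made here for illustration. Under that convention, cut(s) = (W - E(s)) / 2, where W is the total edge weight, so the lowest-energy spin configuration is also a maximum cut.

```python
import itertools
import numpy as np


def ising_energy(w, s):
    """E(s) = sum_{i<j} w[i, j] * s[i] * s[j] (sign convention assumed here)."""
    return float(np.sum(np.triu(w, k=1) * np.outer(s, s)))


def cut_value(w, s):
    """Total weight of edges whose endpoints carry opposite spins."""
    return float(np.sum(np.triu(w, k=1) * (np.outer(s, s) < 0)))


def all_spin_configs(n):
    """Yield every assignment of +1/-1 spins to n vertices."""
    for bits in itertools.product((-1, 1), repeat=n):
        yield np.array(bits)


# Hypothetical toy instance: an unweighted 4-cycle 0-1-2-3-0.
w = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    w[i, j] = w[j, i] = 1.0

total_weight = float(np.sum(np.triu(w, k=1)))
configs = list(all_spin_configs(4))

# Exhaustive search is only feasible for tiny graphs; it is used here
# purely to confirm that both objectives pick out the same partitions.
min_energy_config = min(configs, key=lambda s: ising_energy(w, s))
max_cut_config = max(configs, key=lambda s: cut_value(w, s))
```

The exhaustive search scales as 2^n and is used only to make the identity concrete; the heuristic and learning-based methods described in the abstract target instances where such enumeration is impossible.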
Bibliography: MLST-100171.R2
DOI: 10.1088/2632-2153/abc328