A Learning Approach to Multi-robot Task Allocation with Priority Constraints and Uncertainty



Bibliographic Details
Published in: 2022 IEEE International Conference on Industrial Technology (ICIT), pp. 1-8
Main Authors: Deng, Fuqin; Huang, Huanzhao; Fu, Lanhui; Yue, Hongwei; Zhang, Jianmin; Wu, Zexiao; Lam, Tin Lun
Format: Conference Paper
Language: English
Published: IEEE, 22.08.2022
Online Access: Full text
Description
Summary: Multi-robot task allocation strongly affects the efficiency of multi-robot collaboration. For single-shot allocation without complicated constraints, exact and heuristic algorithms can find optimal solutions efficiently. However, when priority constraints and uncertain robot execution times must be handled over repeated allocations, modeled as an approximate dynamic programming problem, traditional methods such as heuristic algorithms have limited performance. To obtain better performance, we propose a method based on deep reinforcement learning. Specifically, we first use a directed acyclic graph to describe the priority relationships between tasks. We then propose a graph neural network with a hierarchical attention mechanism to extract features of the task groups. Finally, we design a policy network to solve the approximate dynamic programming formulation of multi-robot task allocation. Through training on a dataset from a given environment, the policy network gradually refines its decision making via reinforcement learning. Experimental results show that the proposed modeling and solution method finds better solutions than existing heuristic algorithms. Furthermore, the learned strategy can be applied directly in untrained environments with superior performance.
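The abstract's use of a directed acyclic graph for priority constraints can be illustrated with a minimal sketch: an edge u -> v means task u must finish before task v may start, and at each allocation step only tasks whose predecessors are all complete are eligible for assignment. The task names and helper function below are illustrative assumptions, not details from the paper.

```python
from collections import defaultdict

# Hypothetical precedence DAG (names are illustrative): edge (u, v) means
# task u must complete before task v becomes eligible for allocation.
edges = [("A", "C"), ("B", "C"), ("C", "D")]

# Map each task to its set of predecessor tasks.
preds = defaultdict(set)
for u, v in edges:
    preds[v].add(u)

def ready_tasks(all_tasks, done):
    """Return tasks not yet done whose predecessors are all completed."""
    return {t for t in all_tasks if t not in done and preds[t] <= done}

tasks = {"A", "B", "C", "D"}
print(sorted(ready_tasks(tasks, set())))       # tasks with no predecessors
print(sorted(ready_tasks(tasks, {"A", "B"})))  # C unblocked once A and B finish
```

At each decision point, an allocator (heuristic or learned policy) would choose assignments only from this ready set, which is how the priority constraints restrict the action space.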
DOI:10.1109/ICIT48603.2022.10002805