A stable method for task priority adaptation in quadratic programming via reinforcement learning

Bibliographic details
Published in:	Robotics and Computer-Integrated Manufacturing, Volume 91; p. 102857
Main authors:	Testa, Andrea; Laghi, Marco; Bianco, Edoardo Del; Raiola, Gennaro; Hoffman, Enrico Mingo; Ajoudani, Arash
Format:	Journal Article
Language:	English
Publication details:	Elsevier Ltd, 01.02.2025
Subjects:	Computer Science; Machine Learning; Machine learning for robot control; Optimization and optimal control; Reinforcement learning; Robotics
ISSN:	0736-5845, 1879-2537

Description
Summary:	In emerging manufacturing facilities, robots must become more flexible. They are expected to perform complex jobs and switch behaviors as needed, all within unstructured environments and without reprogramming or setup adjustments. To address this challenge, we introduce A3CQP, a non-strict hierarchical Quadratic Programming (QP) controller. It combines motion and interaction functionalities, with task priorities adapted dynamically and autonomously by a Reinforcement Learning-based adaptation module. This module uses the Asynchronous Advantage Actor–Critic (A3C) algorithm to ensure rapid convergence and stable training within continuous action and observation spaces. The experimental validation, involving a collaborative peg-in-hole assembly and the polishing of a wooden plate, demonstrates the effectiveness of the proposed solution in terms of its automatic adaptability, responsiveness, flexibility, and safety.
• Attainment of multiple tasks in robot control using Quadratic Programming.
• Usage of a Reinforcement Learning strategy for online adaptation of task priorities.
• Implementation of the Asynchronous Advantage Actor–Critic algorithm.
• Demonstration of the stability of the developed controller.
• Validation on a Franka through collaborative peg-in-hole and polishing tasks.
DOI:	10.1016/j.rcim.2024.102857