A stable method for task priority adaptation in quadratic programming via reinforcement learning

In emerging manufacturing facilities, robots must enhance their flexibility. They are expected to perform complex jobs, showing different behaviors as needed, all within unstructured environments and without requiring reprogramming or setup adjustments. To address this challenge, we introduce the...

Detailed description

Saved in:
Bibliographic details
Published in: Robotics and computer-integrated manufacturing, Vol. 91, p. 102857
Main authors: Testa, Andrea; Laghi, Marco; Bianco, Edoardo Del; Raiola, Gennaro; Hoffman, Enrico Mingo; Ajoudani, Arash
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.02.2025
Elsevier
Keywords:
ISSN:0736-5845, 1879-2537
Online access: Full text
Abstract In emerging manufacturing facilities, robots must enhance their flexibility. They are expected to perform complex jobs, showing different behaviors as needed, all within unstructured environments and without requiring reprogramming or setup adjustments. To address this challenge, we introduce the A3CQP, a non-strict hierarchical Quadratic Programming (QP) controller. It seamlessly combines both motion and interaction functionalities, with priorities dynamically and autonomously adapted through a Reinforcement Learning-based adaptation module. This module uses the Asynchronous Advantage Actor–Critic (A3C) algorithm to ensure rapid convergence and stable training within continuous action and observation spaces. The experimental validation, involving a collaborative peg-in-hole assembly and the polishing of a wooden plate, demonstrates the effectiveness of the proposed solution in terms of its automatic adaptability, responsiveness, flexibility, and safety.
• Attainment of multiple tasks in robot control using Quadratic Programming.
• Use of a Reinforcement Learning strategy for online adaptation of task priorities.
• Implementation of the Asynchronous Advantage Actor–Critic algorithm.
• Demonstration of the stability of the developed controller.
• Validation on a Franka robot through collaborative peg-in-hole and polishing tasks.
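The record contains no code, but the non-strict task composition described in the abstract can be illustrated with a small sketch. Below is a minimal Python example of a soft-priority multi-task problem at the velocity level: each task contributes a weighted quadratic error term, and the weights play the role of the priorities that the paper's RL module adapts online. The formulation, the function name solve_soft_priority_qp, and the toy Jacobians are illustrative assumptions; the actual A3CQP also handles interaction/force objectives and constraints that are omitted here.

```python
import numpy as np

def solve_soft_priority_qp(jacobians, task_refs, weights, damping=1e-3):
    """Minimize sum_i w_i * ||J_i q_dot - xd_i||^2 + damping * ||q_dot||^2.

    This is the unconstrained core of a weighted (non-strict) task-composition
    QP: larger weights let a task dominate without strictly suppressing others.
    """
    n = jacobians[0].shape[1]              # number of joints
    H = damping * np.eye(n)                # regularized quadratic term
    g = np.zeros(n)                        # linear term
    for J, xd, w in zip(jacobians, task_refs, weights):
        H += w * J.T @ J
        g += w * J.T @ xd
    return np.linalg.solve(H, g)           # closed-form minimizer

# Toy 3-DoF example: a 2-D motion task and a 1-D interaction-like task.
J_motion = np.array([[1.0, 0.5, 0.2],
                     [0.0, 1.0, 0.3]])
J_contact = np.array([[0.2, 0.1, 1.0]])
q_dot = solve_soft_priority_qp(
    jacobians=[J_motion, J_contact],
    task_refs=[np.array([0.10, -0.05]), np.array([0.02])],
    weights=[0.8, 0.2],                    # the priorities an RL module would adapt online
)
print(q_dot)
```

Because this stripped-down version has no inequality constraints, the weighted QP reduces to a regularized least-squares problem with a closed-form solution; a full controller would instead call a QP solver with joint and interaction constraints.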
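The priority weights themselves are described as coming from an Asynchronous Advantage Actor–Critic (A3C) policy acting in continuous action and observation spaces. As an illustration only, not the authors' implementation, the sketch below shows a synchronous advantage actor-critic with a linear Gaussian policy whose action is a vector of task weights; the class name PriorityAdaptationAgent, the observation layout, and the learning rates are invented for the example, and A3C's asynchronous multi-worker updates are omitted.

```python
import numpy as np

class PriorityAdaptationAgent:
    """Synchronous advantage actor-critic sketch with a Gaussian policy.

    The continuous action is a vector of task-priority weights; A3C would run
    several asynchronous workers sharing these parameters, omitted here.
    """

    def __init__(self, obs_dim, act_dim, lr_actor=1e-3, lr_critic=1e-2, gamma=0.99):
        self.W_mu = np.zeros((act_dim, obs_dim))   # linear policy mean
        self.log_sigma = np.zeros(act_dim)         # state-independent exploration noise
        self.w_v = np.zeros(obs_dim)               # linear value function
        self.lr_a, self.lr_c, self.gamma = lr_actor, lr_critic, gamma

    def act(self, obs):
        mu = self.W_mu @ obs
        raw = mu + np.exp(self.log_sigma) * np.random.randn(mu.size)
        weights = 1.0 / (1.0 + np.exp(-raw))       # squash to (0, 1) so weights stay valid
        return weights, raw, mu

    def update(self, obs, raw, mu, reward, next_obs, done):
        v, v_next = self.w_v @ obs, self.w_v @ next_obs
        target = reward + (0.0 if done else self.gamma * v_next)
        advantage = target - v
        self.w_v += self.lr_c * advantage * obs                # critic: TD(0) step
        dlogp_dmu = (raw - mu) / np.exp(2.0 * self.log_sigma)  # Gaussian policy gradient
        self.W_mu += self.lr_a * advantage * np.outer(dlogp_dmu, obs)
        return advantage

agent = PriorityAdaptationAgent(obs_dim=4, act_dim=2)
obs = np.array([0.10, -0.02, 3.5, 0.0])     # e.g. pose error plus a sensed wrench component
weights, raw, mu = agent.act(obs)           # weights would feed the QP sketch above
# After executing the control step, observe reward and next_obs, then:
# agent.update(obs, raw, mu, reward, next_obs, done=False)
```

In a full pipeline, the squashed weights returned by act would parameterize a weighted QP like the one sketched earlier, and update would be called once the resulting reward and next observation are available.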
ArticleNumber 102857
Author Raiola, Gennaro
Laghi, Marco
Testa, Andrea
Hoffman, Enrico Mingo
Ajoudani, Arash
Bianco, Edoardo Del
Author_xml – sequence: 1
  givenname: Andrea
  orcidid: 0000-0001-9364-307X
  surname: Testa
  fullname: Testa, Andrea
  email: andrea.testa01.ext@leonardo.com
  organization: Leonardo S.p.A., Via Raffaele Pieragostini, 80, Genova, 16149, Italy
– sequence: 2
  givenname: Marco
  orcidid: 0000-0002-1819-8276
  surname: Laghi
  fullname: Laghi, Marco
  email: marco.laghi@iit.it
  organization: Istituto Italiano di Tecnologia (IIT), Via San Quirico, 19D, Genova, 16163, Italy
– sequence: 3
  givenname: Edoardo Del
  orcidid: 0000-0002-2138-5427
  surname: Bianco
  fullname: Bianco, Edoardo Del
  email: edoardo.delbianco@leonardo.com
  organization: Leonardo S.p.A., Via Raffaele Pieragostini, 80, Genova, 16149, Italy
– sequence: 4
  givenname: Gennaro
  orcidid: 0000-0003-1481-1106
  surname: Raiola
  fullname: Raiola, Gennaro
  email: gennaro.raiola@leonardo.com
  organization: Leonardo S.p.A., Via Raffaele Pieragostini, 80, Genova, 16149, Italy
– sequence: 5
  givenname: Enrico Mingo
  orcidid: 0000-0003-2063-7490
  surname: Hoffman
  fullname: Hoffman, Enrico Mingo
  email: enrico.mingo-hoffman@inria.fr
  organization: Istituto Italiano di Tecnologia (IIT), Via San Quirico, 19D, Genova, 16163, Italy
– sequence: 6
  givenname: Arash
  orcidid: 0000-0002-1261-737X
  surname: Ajoudani
  fullname: Ajoudani, Arash
  email: arash.ajoudani@iit.it
  organization: Istituto Italiano di Tecnologia (IIT), Via San Quirico, 19D, Genova, 16163, Italy
BackLink https://hal.science/hal-04280264$$DView record in HAL
BookMark eNp9kE9LAzEQxYNUsFW_gKdcPWzNn91mF7yUolYoeNFznE1m29RuotlY6Lc3peLR0zBv3m-YeRMy8sEjITecTTnjs7vtNBrXTwUTZRZEXakzMua1agpRSTUiY6bkrKjqsrogk2HYMpadlRyT9zkdErQ7pD2mTbC0C5EmGD7oZ3QhunSgYOEzQXLBU-fp1zfYmDuTDWEdoe-dX9O9AxrR-Uwb7NEnukOIPo-uyHkHuwGvf-sleXt8eF0si9XL0_NiviqMFE0qrOW2ZEJIJuvWVByZMrNGcazauhJQspZ3tlGlEiA709gOWmxLZSVIYK2p5SW5Pe3dwE7n23uIBx3A6eV8pY8aK0XNxKzci-wVJ6-JYRgidn8AZ_qYp97qY576mKc-5Zmh-xOE-Yu9w6gH49AbtC6iSdoG9x_-A7oagb0
Cites_doi 10.1109/ICRA.2018.8462877
10.1109/LRA.2017.2738321
10.1016/j.artint.2022.103771
10.1109/TRO.2018.2878355
10.1007/s10514-017-9677-2
10.1109/TSMCC.2012.2218595
10.1109/LRA.2020.2972847
10.3389/fmtec.2023.1154263
10.1109/JRA.1987.1087068
10.1016/j.rcim.2020.101996
10.1109/LRA.2018.2795639
10.1177/027836498700600201
10.1177/0278364914521306
10.1007/s10514-015-9436-1
ContentType Journal Article
Copyright 2024 The Author(s)
licence_http://creativecommons.org/publicdomain/zero
Copyright_xml – notice: 2024 The Author(s)
– notice: licence_http://creativecommons.org/publicdomain/zero
DBID 6I.
AAFTH
AAYXX
CITATION
1XC
VOOES
DOI 10.1016/j.rcim.2024.102857
DatabaseName ScienceDirect Open Access Titles
Elsevier:ScienceDirect:Open Access
CrossRef
Hyper Article en Ligne (HAL)
Hyper Article en Ligne (HAL) (Open Access)
DatabaseTitle CrossRef
DatabaseTitleList

DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
EISSN 1879-2537
ExternalDocumentID oai:HAL:hal-04280264v2
10_1016_j_rcim_2024_102857
S0736584524001443
GroupedDBID --K
--M
-~X
.DC
.~1
0R~
123
1B1
1~.
1~5
29P
4.4
457
4G.
5VS
6I.
7-5
71M
8P~
9JN
AABNK
AACTN
AAEDT
AAEDW
AAFTH
AAIKC
AAIKJ
AAKOC
AALRI
AAMNW
AAOAW
AAQFI
AAQXK
AAXKI
AAXUO
AAYFN
ABBOA
ABFSI
ABJNI
ABMAC
ABXDB
ACDAQ
ACGFS
ACIWK
ACNNM
ACRLP
ADBBV
ADEZE
ADMUD
ADTZH
AEBSH
AECPX
AEKER
AENEX
AFFNX
AFKWA
AFTJW
AGHFR
AGUBO
AGYEJ
AHHHB
AHJVU
AIALX
AIEXJ
AIKHN
AITUG
AJOXV
AKRWK
ALMA_UNASSIGNED_HOLDINGS
AMFUW
AMRAJ
AOUOD
ASPBG
AVWKF
AXJTR
AZFZN
BJAXD
BKOJK
BLXMC
CS3
DU5
E.L
EBS
EFJIC
EJD
EO8
EO9
EP2
EP3
F5P
FDB
FEDTE
FGOYB
FIRID
FNPLU
FYGXN
G-2
G-Q
GBLVA
HLZ
HVGLF
HZ~
IHE
J1W
JJJVA
KOM
LG9
LY7
M41
MO0
MS~
N9A
O-L
O9-
OAUVE
OZT
P-8
P-9
P2P
PC.
PZZ
Q38
R2-
RIG
RNS
ROL
RPZ
SBC
SDF
SDG
SDP
SES
SET
SEW
SPC
SPCBC
SST
SSV
SSZ
T5K
UHS
WUQ
XPP
ZMT
~G-
9DU
AATTM
AAYWO
AAYXX
ABWVN
ACLOT
ACRPL
ACVFH
ADCNI
ADNMO
AEIPS
AEUPX
AFJKZ
AFPUW
AGQPQ
AIGII
AIIUN
AKBMS
AKYEP
ANKPU
APXCP
CITATION
EFKBS
EFLBG
~HD
1XC
VOOES
ID FETCH-LOGICAL-c329t-dd1d40223038bc51e07c6971e5b852a40b1fd97472a3fc9dfabeb47d3a3a0bc83
ISICitedReferencesCount 0
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001309641100001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0736-5845
IngestDate Wed Nov 05 07:38:27 EST 2025
Sat Nov 29 03:56:53 EST 2025
Sat Sep 07 15:51:15 EDT 2024
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords Machine learning for robot control
Optimization and optimal control
Reinforcement learning
Language English
License This is an open access article under the CC BY license.
licence_http://creativecommons.org/publicdomain/zero/: http://creativecommons.org/publicdomain/zero
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c329t-dd1d40223038bc51e07c6971e5b852a40b1fd97472a3fc9dfabeb47d3a3a0bc83
ORCID 0000-0003-2063-7490
0000-0002-1261-737X
0000-0002-1819-8276
0000-0001-9364-307X
0000-0002-2138-5427
0000-0003-1481-1106
OpenAccessLink https://hal.science/hal-04280264
ParticipantIDs hal_primary_oai_HAL_hal_04280264v2
crossref_primary_10_1016_j_rcim_2024_102857
elsevier_sciencedirect_doi_10_1016_j_rcim_2024_102857
PublicationCentury 2000
PublicationDate February 2025
2025-02-00
2025-02
PublicationDateYYYYMMDD 2025-02-01
PublicationDate_xml – month: 02
  year: 2025
  text: February 2025
PublicationDecade 2020
PublicationTitle Robotics and computer-integrated manufacturing
PublicationYear 2025
Publisher Elsevier Ltd
Elsevier
Publisher_xml – name: Elsevier Ltd
– name: Elsevier
References Nakamura, Hanafusa, Yoshikawa (b13) 1987; 6
Chung, Fu, Hsu (b23) 2008
Penco, Hoffman, Modugno, Gomes, Mouret, Ivaldi (b7) 2020; 5
Nottensteiner, Stulp, Albu-Schäffer (b35) 2020
Liu, Tan, Padois (b5) 2016; 40
Tieleman, Hinton (b30) 2012; 4
Raiola, Cardenas, Tadele, De Vries, Stramigioli (b3) 2018; 3
Caron, Pham, Nakamura (b24) 2015
E. Mingo Hoffman, A. Laurenzi, L. Muratore, N.G. Tsagarakis, D.G. Caldwell, Multi-Priority Cartesian Impedance Control based on Quadratic Programming Optimization, in: IEEE International Conference on Robotics and Automation, ICRA, Brisbane, Australia, (ISSN: 2577-087X) 2018, pp. 309–315
Dehio, Steil (b15) 2019
Ajoudani, Zanchettin, Ivaldi, Albu-Schäffer, Kosuge, Khatib (b12) 2018; 42
Mnih, Badia, Mirza, Graves, Lillicrap, Harley, Silver, Kavukcuoglu (b11) 2016
Grondman, Busoniu, Lopes, Babuska (b19) 2012; 42
Del Prete (b25) 2017; 3
Modugno, Neumann, Rueckert, Oriolo, Peters, Ivaldi (b6) 2016
Salini, Padois, Bidaud (b4) 2011
Nambiar, Wiberg, Tarkian (b18) 2023; 3
Song, Chen, Li (b34) 2021; 67
Kuhn, Tucker (b36) 2013
Aszemi, Dominic (b31) 2019; 10
Degris, Pilarski, Sutton (b20) 2012
Silvério, Calinon, Rozo, Caldwell (b8) 2018; 35
Alibrahim, Ludwig (b32) 2021
Siciliano, Khatib, Kröger (b37) 2008
Tassi, Ajoudani (b16) 2023
Y. Abe, M. Da Silva, J. Popović, Multiobjective control with frictional contacts, in: Proceedings of the 2007 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2007, pp. 249–258.
Testa, Raiano, Laghi, Ajoudani, Mingo Hoffman (b26) 2023
Lober, Padois, Sigaud (b17) 2015
Escande, Mansard, Wieber (b1) 2014; 33
Roveda, Testa, Shahid, Braghin, Piga (b9) 2022; 312
Dehio, Reinhart, Steil (b10) 2015
Sola, Deray, Atchuthan (b27) 2018
Sutton, Barto (b29) 2018
Wang, Kurth-Nelson, Tirumala, Soyer, Leibo, Munos, Blundell, Kumaran, Botvinick (b21) 2016
Khatib (b28) 1987; 3
Siciliano, Sciavicco, Villani, Oriolo (b22) 2009
Tang, Lin, Zhao, Chen, Tomizuka (b33) 2016
Tassi (10.1016/j.rcim.2024.102857_b16) 2023
Aszemi (10.1016/j.rcim.2024.102857_b31) 2019; 10
Grondman (10.1016/j.rcim.2024.102857_b19) 2012; 42
Kuhn (10.1016/j.rcim.2024.102857_b36) 2013
Modugno (10.1016/j.rcim.2024.102857_b6) 2016
Escande (10.1016/j.rcim.2024.102857_b1) 2014; 33
Chung (10.1016/j.rcim.2024.102857_b23) 2008
Roveda (10.1016/j.rcim.2024.102857_b9) 2022; 312
Song (10.1016/j.rcim.2024.102857_b34) 2021; 67
10.1016/j.rcim.2024.102857_b14
Ajoudani (10.1016/j.rcim.2024.102857_b12) 2018; 42
Sutton (10.1016/j.rcim.2024.102857_b29) 2018
Liu (10.1016/j.rcim.2024.102857_b5) 2016; 40
Nakamura (10.1016/j.rcim.2024.102857_b13) 1987; 6
Wang (10.1016/j.rcim.2024.102857_b21) 2016
Tieleman (10.1016/j.rcim.2024.102857_b30) 2012; 4
Dehio (10.1016/j.rcim.2024.102857_b15) 2019
Del Prete (10.1016/j.rcim.2024.102857_b25) 2017; 3
Dehio (10.1016/j.rcim.2024.102857_b10) 2015
Caron (10.1016/j.rcim.2024.102857_b24) 2015
Alibrahim (10.1016/j.rcim.2024.102857_b32) 2021
Tang (10.1016/j.rcim.2024.102857_b33) 2016
Penco (10.1016/j.rcim.2024.102857_b7) 2020; 5
Raiola (10.1016/j.rcim.2024.102857_b3) 2018; 3
Siciliano (10.1016/j.rcim.2024.102857_b22) 2009
Degris (10.1016/j.rcim.2024.102857_b20) 2012
Nottensteiner (10.1016/j.rcim.2024.102857_b35) 2020
Nambiar (10.1016/j.rcim.2024.102857_b18) 2023; 3
Khatib (10.1016/j.rcim.2024.102857_b28) 1987; 3
Sola (10.1016/j.rcim.2024.102857_b27) 2018
Mnih (10.1016/j.rcim.2024.102857_b11) 2016
Salini (10.1016/j.rcim.2024.102857_b4) 2011
10.1016/j.rcim.2024.102857_b2
Lober (10.1016/j.rcim.2024.102857_b17) 2015
Siciliano (10.1016/j.rcim.2024.102857_b37) 2008
Testa (10.1016/j.rcim.2024.102857_b26) 2023
Silvério (10.1016/j.rcim.2024.102857_b8) 2018; 35
References_xml – volume: 3
  start-page: 1237
  year: 2018
  end-page: 1244
  ident: b3
  article-title: Development of a safety-and energy-aware impedance controller for collaborative robots
  publication-title: IEEE Robot. Autom. Lett.
– volume: 5
  start-page: 2626
  year: 2020
  end-page: 2633
  ident: b7
  article-title: Learning robust task priorities and gains for control of redundant robots
  publication-title: IEEE Robot. Autom. Lett.
– start-page: 1551
  year: 2021
  end-page: 1559
  ident: b32
  article-title: Hyperparameter optimization: Comparing genetic algorithm against grid search and bayesian optimization
  publication-title: 2021 IEEE Congress on Evolutionary Computation
– volume: 42
  start-page: 957
  year: 2018
  end-page: 975
  ident: b12
  article-title: Progress and prospects of the human–robot collaboration
  publication-title: Auton. Robots
– start-page: 5771
  year: 2020
  end-page: 5777
  ident: b35
  article-title: Robust, locally guided peg-in-hole using impedance-controlled robots
  publication-title: 2020 IEEE International Conference on Robotics and Automation
– volume: 33
  start-page: 1006
  year: 2014
  end-page: 1028
  ident: b1
  article-title: Hierarchical quadratic programming: Fast online humanoid-robot motion generation
  publication-title: Int. J. Robot. Res.
– start-page: 3944
  year: 2015
  end-page: 3949
  ident: b17
  article-title: Variance modulated task prioritization in whole-body control
  publication-title: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems
– volume: 10
  year: 2019
  ident: b31
  article-title: Hyperparameter optimization in convolutional neural network using genetic algorithms
  publication-title: Int. J. Adv. Comput. Sci. Appl.
– volume: 3
  year: 2023
  ident: b18
  article-title: Automation of unstructured production environment by applying reinforcement learning
  publication-title: Front. Manuf. Technol.
– reference: Y. Abe, M. Da Silva, J. Popović, Multiobjective control with frictional contacts, in: Proceedings of the 2007 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2007, pp. 249–258.
– start-page: 1928
  year: 2016
  end-page: 1937
  ident: b11
  article-title: Asynchronous methods for deep reinforcement learning
  publication-title: International Conference on Machine Learning
– start-page: 123
  year: 2023
  end-page: 136
  ident: b26
  article-title: Joint position bounds in resolved-acceleration control: a comparison
  publication-title: International Workshop on Human-Friendly Robotics
– start-page: 6416
  year: 2015
  end-page: 6421
  ident: b10
  article-title: Multiple task optimization with a mixture of controllers for motion generation
  publication-title: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems
– year: 2018
  ident: b27
  article-title: A micro Lie theory for state estimation in robotics
– year: 2018
  ident: b29
  article-title: Reinforcement Learning: An Introduction
– volume: 6
  start-page: 3
  year: 1987
  end-page: 15
  ident: b13
  article-title: Task-priority based redundancy control of robot manipulators
  publication-title: Int. J. Robot. Res.
– year: 2016
  ident: b21
  article-title: Learning to reinforcement learn
– volume: 67
  year: 2021
  ident: b34
  article-title: A peg-in-hole robot assembly system based on Gauss mixture model
  publication-title: Robot. Comput.-Integr. Manuf.
– start-page: 162
  year: 2016
  end-page: 167
  ident: b33
  article-title: Autonomous alignment of peg and hole by force/torque measurement for robotic assembly
  publication-title: 2016 IEEE International Conference on Automation Science and Engineering
– start-page: 247
  year: 2013
  end-page: 258
  ident: b36
  article-title: Nonlinear programming
  publication-title: Traces and Emergence of Nonlinear Programming
– reference: E. Mingo Hoffman, A. Laurenzi, L. Muratore, N.G. Tsagarakis, D.G. Caldwell, Multi-Priority Cartesian Impedance Control based on Quadratic Programming Optimization, in: IEEE International Conference on Robotics and Automation, ICRA, Brisbane, Australia, (ISSN: 2577-087X) 2018, pp. 309–315.
– year: 2008
  ident: b37
  article-title: Springer Handbook of Robotics
– start-page: 1141
  year: 2019
  end-page: 1147
  ident: b15
  article-title: Dynamically-consistent generalized hierarchical control
  publication-title: 2019 International Conference on Robotics and Automation
– volume: 35
  start-page: 78
  year: 2018
  end-page: 94
  ident: b8
  article-title: Learning task priorities from demonstrations
  publication-title: IEEE Trans. Robot.
– start-page: 133
  year: 2008
  end-page: 159
  ident: b23
  article-title: Motion control
  publication-title: Springer Handbook of Robotics
– start-page: 2177
  year: 2012
  end-page: 2182
  ident: b20
  article-title: Model-free reinforcement learning with continuous action in practice
  publication-title: 2012 American Control Conference
– volume: 4
  start-page: 26
  year: 2012
  end-page: 31
  ident: b30
  article-title: Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
  publication-title: COURSERA: Neural Netw. Mach. Learn.
– start-page: 1283
  year: 2011
  end-page: 1290
  ident: b4
  article-title: Synthesis of complex humanoid whole-body behavior: A focus on sequencing and tasks transitions
  publication-title: 2011 IEEE International Conference on Robotics and Automation
– year: 2023
  ident: b16
  article-title: Multi-modal and adaptive control of human-robot interaction through hierarchical quadratic programming
– volume: 42
  start-page: 1291
  year: 2012
  end-page: 1307
  ident: b19
  article-title: A survey of actor-critic reinforcement learning: Standard and natural policy gradients
  publication-title: IEEE Trans. Syst. Man Cybern. C
– volume: 3
  start-page: 43
  year: 1987
  end-page: 53
  ident: b28
  article-title: A unified approach for motion and force control of robot manipulators: The operational space formulation
  publication-title: IEEE J. Robot. Autom.
– start-page: 221
  year: 2016
  end-page: 226
  ident: b6
  article-title: Learning soft task priorities for control of redundant robots
  publication-title: 2016 IEEE International Conference on Robotics and Automation
– volume: 40
  start-page: 17
  year: 2016
  end-page: 31
  ident: b5
  article-title: Generalized hierarchical control
  publication-title: Auton. Robots
– start-page: 5107
  year: 2015
  end-page: 5112
  ident: b24
  article-title: Stability of surface contacts for humanoid robots: Closed-form formulae of the contact wrench cone for rectangular support areas
  publication-title: 2015 IEEE International Conference on Robotics and Automation
– start-page: 105
  year: 2009
  end-page: 160
  ident: b22
  article-title: Differential kinematics and statics
  publication-title: Robotics: Modelling, Planning and Control
– volume: 312
  year: 2022
  ident: b9
  article-title: Q-learning-based model predictive variable impedance control for physical human-robot collaboration
  publication-title: Artificial Intelligence
– volume: 3
  start-page: 281
  year: 2017
  end-page: 288
  ident: b25
  article-title: Joint position and velocity bounds in discrete-time acceleration/torque control of robot manipulators
  publication-title: IEEE Robot. Autom. Lett.
– start-page: 1551
  year: 2021
  ident: 10.1016/j.rcim.2024.102857_b32
  article-title: Hyperparameter optimization: Comparing genetic algorithm against grid search and bayesian optimization
– start-page: 133
  year: 2008
  ident: 10.1016/j.rcim.2024.102857_b23
  article-title: Motion control
– ident: 10.1016/j.rcim.2024.102857_b14
  doi: 10.1109/ICRA.2018.8462877
– volume: 3
  start-page: 281
  issue: 1
  year: 2017
  ident: 10.1016/j.rcim.2024.102857_b25
  article-title: Joint position and velocity bounds in discrete-time acceleration/torque control of robot manipulators
  publication-title: IEEE Robot. Autom. Lett.
  doi: 10.1109/LRA.2017.2738321
– volume: 312
  year: 2022
  ident: 10.1016/j.rcim.2024.102857_b9
  article-title: Q-learning-based model predictive variable impedance control for physical human-robot collaboration
  publication-title: Artificial Intelligence
  doi: 10.1016/j.artint.2022.103771
– volume: 35
  start-page: 78
  issue: 1
  year: 2018
  ident: 10.1016/j.rcim.2024.102857_b8
  article-title: Learning task priorities from demonstrations
  publication-title: IEEE Trans. Robot.
  doi: 10.1109/TRO.2018.2878355
– volume: 42
  start-page: 957
  year: 2018
  ident: 10.1016/j.rcim.2024.102857_b12
  article-title: Progress and prospects of the human–robot collaboration
  publication-title: Auton. Robots
  doi: 10.1007/s10514-017-9677-2
– year: 2018
  ident: 10.1016/j.rcim.2024.102857_b27
– volume: 42
  start-page: 1291
  issue: 6
  year: 2012
  ident: 10.1016/j.rcim.2024.102857_b19
  article-title: A survey of actor-critic reinforcement learning: Standard and natural policy gradients
  publication-title: IEEE Trans. Syst. Man Cybern. C
  doi: 10.1109/TSMCC.2012.2218595
– start-page: 5107
  year: 2015
  ident: 10.1016/j.rcim.2024.102857_b24
  article-title: Stability of surface contacts for humanoid robots: Closed-form formulae of the contact wrench cone for rectangular support areas
– volume: 5
  start-page: 2626
  issue: 2
  year: 2020
  ident: 10.1016/j.rcim.2024.102857_b7
  article-title: Learning robust task priorities and gains for control of redundant robots
  publication-title: IEEE Robot. Autom. Lett.
  doi: 10.1109/LRA.2020.2972847
– year: 2008
  ident: 10.1016/j.rcim.2024.102857_b37
– start-page: 221
  year: 2016
  ident: 10.1016/j.rcim.2024.102857_b6
  article-title: Learning soft task priorities for control of redundant robots
– ident: 10.1016/j.rcim.2024.102857_b2
– start-page: 1928
  year: 2016
  ident: 10.1016/j.rcim.2024.102857_b11
  article-title: Asynchronous methods for deep reinforcement learning
– start-page: 247
  year: 2013
  ident: 10.1016/j.rcim.2024.102857_b36
  article-title: Nonlinear programming
– volume: 3
  year: 2023
  ident: 10.1016/j.rcim.2024.102857_b18
  article-title: Automation of unstructured production environment by applying reinforcement learning
  publication-title: Front. Manuf. Technol.
  doi: 10.3389/fmtec.2023.1154263
– volume: 3
  start-page: 43
  issue: 1
  year: 1987
  ident: 10.1016/j.rcim.2024.102857_b28
  article-title: A unified approach for motion and force control of robot manipulators: The operational space formulation
  publication-title: IEEE J. Robot. Autom.
  doi: 10.1109/JRA.1987.1087068
– start-page: 1283
  year: 2011
  ident: 10.1016/j.rcim.2024.102857_b4
  article-title: Synthesis of complex humanoid whole-body behavior: A focus on sequencing and tasks transitions
– start-page: 123
  year: 2023
  ident: 10.1016/j.rcim.2024.102857_b26
  article-title: Joint position bounds in resolved-acceleration control: a comparison
– volume: 4
  start-page: 26
  issue: 2
  year: 2012
  ident: 10.1016/j.rcim.2024.102857_b30
  article-title: Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
  publication-title: COURSERA: Neural Netw. Mach. Learn.
– volume: 67
  year: 2021
  ident: 10.1016/j.rcim.2024.102857_b34
  article-title: A peg-in-hole robot assembly system based on Gauss mixture model
  publication-title: Robot. Comput.-Integr. Manuf.
  doi: 10.1016/j.rcim.2020.101996
– volume: 3
  start-page: 1237
  issue: 2
  year: 2018
  ident: 10.1016/j.rcim.2024.102857_b3
  article-title: Development of a safety-and energy-aware impedance controller for collaborative robots
  publication-title: IEEE Robot. Autom. Lett.
  doi: 10.1109/LRA.2018.2795639
– volume: 10
  issue: 6
  year: 2019
  ident: 10.1016/j.rcim.2024.102857_b31
  article-title: Hyperparameter optimization in convolutional neural network using genetic algorithms
  publication-title: Int. J. Adv. Comput. Sci. Appl.
– volume: 6
  start-page: 3
  issue: 2
  year: 1987
  ident: 10.1016/j.rcim.2024.102857_b13
  article-title: Task-priority based redundancy control of robot manipulators
  publication-title: Int. J. Robot. Res.
  doi: 10.1177/027836498700600201
– start-page: 105
  year: 2009
  ident: 10.1016/j.rcim.2024.102857_b22
  article-title: Differential kinematics and statics
– start-page: 2177
  year: 2012
  ident: 10.1016/j.rcim.2024.102857_b20
  article-title: Model-free reinforcement learning with continuous action in practice
– year: 2023
  ident: 10.1016/j.rcim.2024.102857_b16
– year: 2016
  ident: 10.1016/j.rcim.2024.102857_b21
– start-page: 1141
  year: 2019
  ident: 10.1016/j.rcim.2024.102857_b15
  article-title: Dynamically-consistent generalized hierarchical control
– year: 2018
  ident: 10.1016/j.rcim.2024.102857_b29
– start-page: 5771
  year: 2020
  ident: 10.1016/j.rcim.2024.102857_b35
  article-title: Robust, locally guided peg-in-hole using impedance-controlled robots
– start-page: 3944
  year: 2015
  ident: 10.1016/j.rcim.2024.102857_b17
  article-title: Variance modulated task prioritization in whole-body control
– start-page: 162
  year: 2016
  ident: 10.1016/j.rcim.2024.102857_b33
  article-title: Autonomous alignment of peg and hole by force/torque measurement for robotic assembly
– volume: 33
  start-page: 1006
  issue: 7
  year: 2014
  ident: 10.1016/j.rcim.2024.102857_b1
  article-title: Hierarchical quadratic programming: Fast online humanoid-robot motion generation
  publication-title: Int. J. Robot. Res.
  doi: 10.1177/0278364914521306
– volume: 40
  start-page: 17
  year: 2016
  ident: 10.1016/j.rcim.2024.102857_b5
  article-title: Generalized hierarchical control
  publication-title: Auton. Robots
  doi: 10.1007/s10514-015-9436-1
– start-page: 6416
  year: 2015
  ident: 10.1016/j.rcim.2024.102857_b10
  article-title: Multiple task optimization with a mixture of controllers for motion generation
SSID ssj0002453
Score 2.4082274
Snippet In emerging manufacturing facilities, robots must enhance their flexibility. They are expected to perform complex jobs, showing different behaviors on the...
SourceID hal
crossref
elsevier
SourceType Open Access Repository
Index Database
Publisher
StartPage 102857
SubjectTerms Computer Science
Machine Learning
Machine learning for robot control
Optimization and optimal control
Reinforcement learning
Robotics
Title A stable method for task priority adaptation in quadratic programming via reinforcement learning
URI https://dx.doi.org/10.1016/j.rcim.2024.102857
https://hal.science/hal-04280264
Volume 91
WOSCitedRecordID wos001309641100001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVESC
  databaseName: Elsevier SD Freedom Collection Journals 2021
  customDbUrl:
  eissn: 1879-2537
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0002453
  issn: 0736-5845
  databaseCode: AIEXJ
  dateStart: 19960301
  isFulltext: true
  titleUrlDefault: https://www.sciencedirect.com
  providerName: Elsevier
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwtV1Lb9QwELa2LQd64FFALS9ZiNsqVeIka_u4QosKqipUFWlvwa9ACiSrdLvqT-HnMo4fuy0qogcuUWQljpX5NB5_mvkGobe5MUorTRNhWJkUkvGEc8MSpkhJdJpqXadDswl6csLmc_5pNPoVamFWP2jbsqsrvvivpoYxMLYtnb2DueOkMAD3YHS4gtnh-k-Gn1p6wNZDuebQLo9QXHwfL_qms63qxkKLhU8ybFpbVqn7QbfV52r9tOzBqhHj3gyyqmpgEEN_ia-b4expJ7uo86x8g4gkSlBomxx7aWsnhmLIyBIYH7MO2ZRxXzgWrsOwrR9S3Zq9B1wOfO5MdwDnDlxkzAo5FQ0czR2337ai7zZJDFKGvOfo62g-SSAWKjcds2vj5T2rDYSclPUfTt_xD-eHvWqstgApDtcPX1fYvrHzxXzEkOp2Xtk5KjtH5ebYQjuElhz85c70w2z-Me7ypHAKp2HhviDL5Q7eXMltQc_Wt0DfD-HM2SP0wJ9D8NTh5zEamXYPPQw9PrB3-Xtod0Ow8gn6MsUOXNiBCwM8sAUXDuDCa3DhpsURXHgDXBjAha-BCwdwPUWf38_O3h0lvkdHonLCl4nWmS4gDoRIiElVZialasJpZkrJSiKKVGa1tmdWIvJacV0LaWRBdS5ykUrF8mdou-1as48wk7VkWU0VJ7RIueQ5Z6mimciYEJNaHqBx-InVwkmxVLcb7gCV4T9XPph0QWIFsPnre2_AKPEDVn39aHpc2TFLL6RwgFiR53daygt0f435l2h72V-aV-ieWi2bi_61B9Zv_FCm4A
linkProvider Elsevier
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+stable+method+for+task+priority+adaptation+in+quadratic+programming+via+reinforcement+learning&rft.jtitle=Robotics+and+computer-integrated+manufacturing&rft.au=Testa%2C+Andrea&rft.au=Laghi%2C+Marco&rft.au=Bianco%2C+Edoardo+Del&rft.au=Raiola%2C+Gennaro&rft.date=2025-02-01&rft.issn=0736-5845&rft.volume=91&rft.spage=102857&rft_id=info:doi/10.1016%2Fj.rcim.2024.102857&rft.externalDBID=n%2Fa&rft.externalDocID=10_1016_j_rcim_2024_102857
thumbnail_l http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=0736-5845&client=summon
thumbnail_m http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=0736-5845&client=summon
thumbnail_s http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=0736-5845&client=summon