A stable method for task priority adaptation in quadratic programming via reinforcement learning
In emerging manufacturing facilities, robots must become more flexible. They are expected to perform complex jobs, exhibiting different behaviors as needed, all within unstructured environments and without requiring reprogramming or setup adjustments. To address this challenge, we introduce the...
Saved in:
| Published in: | Robotics and Computer-Integrated Manufacturing, Vol. 91, Article 102857 |
|---|---|
| Main authors: | Testa, Andrea; Laghi, Marco; Bianco, Edoardo Del; Raiola, Gennaro; Hoffman, Enrico Mingo; Ajoudani, Arash |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 01.02.2025 |
| Keywords: | Reinforcement learning; Machine learning for robot control; Optimization and optimal control; Robotics; Computer Science |
| ISSN: | 0736-5845, 1879-2537 |
| Online access: | Full text |
| Abstract | In emerging manufacturing facilities, robots must become more flexible. They are expected to perform complex jobs, exhibiting different behaviors as needed, all within unstructured environments and without requiring reprogramming or setup adjustments. To address this challenge, we introduce the A3CQP, a non-strict hierarchical Quadratic Programming (QP) controller. It seamlessly combines both motion and interaction functionalities, with priorities dynamically and autonomously adapted through a Reinforcement Learning-based adaptation module. This module utilizes the Asynchronous Advantage Actor–Critic (A3C) algorithm to ensure rapid convergence and stable training within continuous action and observation spaces. The experimental validation, involving a collaborative peg-in-hole assembly and the polishing of a wooden plate, demonstrates the effectiveness of the proposed solution in terms of its automatic adaptability, responsiveness, flexibility, and safety.
• Attainment of multiple tasks in robot control using Quadratic Programming. • Usage of a Reinforcement Learning strategy for online adaptation of task priorities. • Implementation of the Asynchronous Advantage Actor–Critic algorithm. • Demonstration of the stability of the developed controller. • Validation on a Franka robot through collaborative peg-in-hole and polishing tasks. (An illustrative sketch of the weighted soft-priority QP idea follows the record fields below.) |
|---|---|
| Article number: | 102857 |
| Authors: | 1. Andrea Testa (Leonardo S.p.A., Via Raffaele Pieragostini 80, 16149 Genova, Italy; andrea.testa01.ext@leonardo.com; ORCID 0000-0001-9364-307X) 2. Marco Laghi (Istituto Italiano di Tecnologia (IIT), Via San Quirico 19D, 16163 Genova, Italy; marco.laghi@iit.it; ORCID 0000-0002-1819-8276) 3. Edoardo Del Bianco (Leonardo S.p.A., Genova, Italy; edoardo.delbianco@leonardo.com; ORCID 0000-0002-2138-5427) 4. Gennaro Raiola (Leonardo S.p.A., Genova, Italy; gennaro.raiola@leonardo.com; ORCID 0000-0003-1481-1106) 5. Enrico Mingo Hoffman (Istituto Italiano di Tecnologia (IIT), Genova, Italy; enrico.mingo-hoffman@inria.fr; ORCID 0000-0003-2063-7490) 6. Arash Ajoudani (Istituto Italiano di Tecnologia (IIT), Genova, Italy; arash.ajoudani@iit.it; ORCID 0000-0002-1261-737X) |
| Copyright: | 2024 The Author(s) |
| DOI: | 10.1016/j.rcim.2024.102857 |
| License: | Open access article under the CC BY license |
| Open access: | https://hal.science/hal-04280264 |
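The abstract above describes two ingredients of the A3CQP controller: a non-strict (soft-priority) QP that blends motion and interaction tasks through weights, and a Reinforcement Learning module whose continuous actions adapt those weights online. The snippet below is a minimal, self-contained sketch of that general idea only, not the authors' implementation: the function names, the toy Jacobians, and the stub policy standing in for a trained A3C actor are all illustrative assumptions.

```python
import numpy as np

def solve_soft_priority_qp(jacobians, targets, weights, damping=1e-3):
    """Soft-priority resolution of several differential-kinematics tasks.

    Minimizes  sum_i w_i * ||J_i qdot - xdot_i||^2 + damping * ||qdot||^2
    over the joint velocities qdot: an unconstrained QP with a closed-form
    solution. A larger w_i lets task i dominate the trade-off, which is the
    non-strict-hierarchy idea, in contrast to strict null-space projections.
    """
    n = jacobians[0].shape[1]
    H = damping * np.eye(n)          # quadratic term of the QP
    g = np.zeros(n)                  # linear term of the QP
    for J, xd, w in zip(jacobians, targets, weights):
        H += w * J.T @ J
        g += w * J.T @ xd
    return np.linalg.solve(H, g)

def placeholder_policy(observation, n_tasks=2):
    """Stand-in for the learned actor: maps an observation to task weights.

    A trained A3C actor would produce continuous actions here; this stub just
    returns a valid weight vector (positive, summing to one) via a softmax of
    a fixed random projection, purely for illustration.
    """
    rng = np.random.default_rng(0)
    W = rng.standard_normal((n_tasks, observation.size))
    logits = W @ observation
    e = np.exp(logits - logits.max())
    return e / e.sum()

if __name__ == "__main__":
    # Toy 6-DoF arm: one 3D motion task and one 3D interaction-like task,
    # both with randomly generated Jacobians (illustrative numbers only).
    rng = np.random.default_rng(42)
    J_motion = rng.standard_normal((3, 6))
    J_interaction = rng.standard_normal((3, 6))
    xd_motion = np.array([0.05, 0.0, 0.02])        # desired task velocities
    xd_interaction = np.array([0.0, -0.01, 0.0])

    obs = np.concatenate([xd_motion, xd_interaction])   # toy observation
    w = placeholder_policy(obs)                          # adapted priorities
    qdot = solve_soft_priority_qp(
        [J_motion, J_interaction], [xd_motion, xd_interaction], w)
    print("task weights:", np.round(w, 3))
    print("joint velocities:", np.round(qdot, 4))
```

In the paper itself, the weights would come from an A3C actor trained over continuous observation and action spaces, and the controller additionally handles interaction (force) objectives and robot constraints; the sketch only shows how externally adapted weights enter a weighted least-squares task resolution.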