Enabling End Users to Program Robots Using Reinforcement Learning

Bibliographic Details
Published in:	2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 767-777
Main authors:	Ayalew, Tewodros W., Wang, Jennifer, Littman, Michael L., Ur, Blase, Sebo, Sarah
Format:	Conference proceeding
Language:	English
Published:	IEEE, 4 March 2025
Subjects:	End-User Robot Programming; Human-robot interaction; Reinforcement learning; Robot programming
Online access:	Full text

Description
Abstract:	Reinforcement learning (RL) is a powerful learning technique in robotics, where people can specify rewards that robots learn how to maximize through a process of trial and error. Despite the numerous advantages of RL for robot programming, no approaches to our knowledge have sought to enable non-technical users to specify RL programs for robots. In this work, we designed two novel RL-based robot programming paradigms for non-technical users: Full MDP Programming (Full-MDP) and Goal-Only MDP Programming (Goal-MDP). To evaluate the efficacy of these two approaches, we ran a between-subjects online user study (N = 409) where participants were asked to program a simulated robot to complete example household tasks (e.g., delivering coffee) using one of our RL programming paradigms or one of two commonly used baselines: Sequential Programming (Seq) or Trigger-Action Programming (TAP). While users neither performed well nor reported positive experiences with the Full-MDP interface, user performance and experience with Goal-MDP were similar to the baselines (Seq and TAP) with significantly shorter programs. These results demonstrate that RL-based paradigms like Goal-MDP are a viable alternative to more traditional approaches and provide a starting point for robot programming interfaces that allow end users to leverage the myriad benefits of RL for programming robots.
DOI:	10.1109/HRI61500.2025.10974035