Learning from Physical Human Corrections, One Feature at a Time

Bibliographic Details
Published in: 2018 13th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 141-149
Authors: Bajcsy, Andrea; Losey, Dylan P.; O'Malley, Marcia K.; Dragan, Anca D.
Format: Conference proceedings
Language: English
Published: New York, NY, USA: ACM, 26.02.2018
Series: ACM Conferences
ISBN: 9781450349536, 1450349536
ISSN: 2167-2148
Online access: Full text
Description
Abstract: We focus on learning robot objective functions from human guidance: specifically, from physical corrections provided by the person while the robot is acting. Objective functions are typically parametrized in terms of features, which capture aspects of the task that might be important. When the person intervenes to correct the robot's behavior, the robot should update its understanding of which features matter, how much, and in what way. Unfortunately, real users do not provide optimal corrections that isolate exactly what the robot was doing wrong. Thus, when receiving a correction, it is difficult for the robot to determine which features the person meant to correct, and which features were changed unintentionally. In this paper, we propose to improve the efficiency of robot learning during physical interactions by reducing unintended learning. Our approach allows the human-robot team to focus on learning one feature at a time, unlike state-of-the-art techniques that update all features at once. We derive an online method for identifying the single feature which the human is trying to change during physical interaction, and experimentally compare this one-at-a-time approach to the all-at-once baseline in a user study. Our results suggest that users teaching one-at-a-time perform better, especially in tasks that require changing multiple features.
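As a rough illustration of the contrast the abstract describes, the sketch below updates a vector of feature weights from a single correction in two ways: the all-at-once baseline shifts every weight by the observed feature change, while a one-at-a-time variant modifies only a single weight. The feature values, the learning rate alpha, and the argmax selection rule are illustrative assumptions, not the paper's actual online method for inferring which feature the person intended to change.

```python
import numpy as np

# Illustrative sketch only: the feature values, learning rate, and the
# argmax selection rule below are assumptions for demonstration, not the
# paper's derived online inference of the intended feature.

def all_at_once_update(theta, phi_robot, phi_human, alpha=0.1):
    """Baseline: shift every feature weight by the observed feature change."""
    return theta - alpha * (phi_human - phi_robot)

def one_at_a_time_update(theta, phi_robot, phi_human, alpha=0.1):
    """Update only one feature weight: here, the feature whose value changed
    the most under the correction (a stand-in heuristic for identifying the
    feature the person meant to correct)."""
    delta = phi_human - phi_robot
    k = int(np.argmax(np.abs(delta)))  # hypothetical selection rule
    theta_new = theta.copy()
    theta_new[k] -= alpha * delta[k]
    return theta_new

# Hypothetical task with three features, e.g. distance to the table,
# distance to the person, end-effector orientation.
theta = np.array([1.0, 1.0, 1.0])        # current feature weights
phi_robot = np.array([0.5, 0.2, 0.90])   # features of the robot's planned trajectory
phi_human = np.array([0.5, 0.8, 0.85])   # features after the human's physical correction

print(all_at_once_update(theta, phi_robot, phi_human))    # every weight moves
print(one_at_a_time_update(theta, phi_robot, phi_human))  # only one weight moves
```

In this toy example the one-at-a-time rule changes only the weight of the feature with the largest observed change, leaving the small, likely unintended change in the third feature out of the update.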
DOI: 10.1145/3171221.3171267