ImageInThat: Manipulating Images to Convey User Instructions to Robots

Foundation models are rapidly improving the capability of robots in performing everyday tasks autonomously such as meal preparation, yet robots will still need to be instructed by humans due to model performance, the difficulty of capturing user preferences, and the need for user agency. Robots can...

Bibliographic Details
Published in:	2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 757-766
Main Authors:	Mahadevan, Karthik; Lewis, Blaine; Li, Jiannan; Mutlu, Bilge; Tang, Anthony; Grossman, Tovi
Format:	Conference Proceeding
Language:	English
Published:	IEEE 04.03.2025
Subjects:	Codes; direct manipulation; end-user robot programming; Faces; Foundation models; Human-robot interaction; Natural languages; Prototypes; robot instruction following; Robot programming; Robots