MARCER: Multimodal Augmented Reality for Composing and Executing Robot Tasks

Bibliographic Details
Published in: 2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 529-539
Main Authors: Ikeda, Bryce; Gramopadhye, Maitrey; Nekervis, LillyAnn; Szafir, Daniel
Format: Conference Proceeding
Language: English
Published: IEEE, 04.03.2025
Description
Summary: In this work, we combine the strengths of humans and robots by developing MARCER, a novel interactive and multimodal end-user robot programming system. MARCER uses a Large Language Model to translate users' natural language task descriptions and environmental context into Action Plans for robot execution, based on a trigger-action programming paradigm that facilitates authoring reactive robot behaviors. MARCER also affords interaction via augmented reality, helping users parameterize and validate robot programs and providing real-time visual previews and feedback directly in the context of the robot's operating environment. We present the design, implementation, and evaluation of MARCER to explore the usability of such systems and to demonstrate how trigger-action programming, Large Language Models, and augmented reality hold deep-seated synergies that, when combined, empower users to program general-purpose robots to perform everyday tasks.
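
To make the trigger-action paradigm named in the summary concrete, the following is a minimal, hypothetical Python sketch of how an LLM-generated Action Plan could be represented as a set of trigger-action rules. The names (Rule, ActionPlan, step) and the state and action representations are illustrative assumptions, not MARCER's published data model or API.

    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class Rule:
        # Trigger: a predicate over the perceived environment state.
        trigger: Callable[[Dict], bool]
        # Actions: robot actions to execute, in order, when the trigger fires.
        actions: List[Callable[[], None]]

    @dataclass
    class ActionPlan:
        rules: List[Rule] = field(default_factory=list)

        def step(self, state: Dict) -> None:
            # Reactive loop body: evaluate every trigger against the current
            # state and run the action sequence of each rule that holds.
            for rule in self.rules:
                if rule.trigger(state):
                    for action in rule.actions:
                        action()

    # Example rule: "when a cup appears on the table, pick it up and put it in the sink."
    plan = ActionPlan(rules=[
        Rule(
            trigger=lambda s: "cup" in s.get("objects_on_table", []),
            actions=[lambda: print("pick(cup)"), lambda: print("place(sink)")],
        )
    ])

    plan.step({"objects_on_table": ["cup", "plate"]})  # prints pick(cup), then place(sink)

In a deployed system, step would presumably be driven by the robot's perception loop, with the augmented reality interface visualizing each rule's trigger and previewing its actions before execution, as the summary describes.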
DOI: 10.1109/HRI61500.2025.10974232