LVDiffusor: Distilling Functional Rearrangement Priors From Large Models Into Diffusor


Detailed description

Bibliographic details
Published in: IEEE Robotics and Automation Letters, Vol. 9, No. 10, pp. 8258-8265
Authors: Zeng, Yiming; Wu, Mingdong; Yang, Long; Zhang, Jiyao; Ding, Hao; Cheng, Hui; Dong, Hao
Format: Journal Article
Language: English
Published: IEEE, 01.10.2024
ISSN: 2377-3766
Online access: Full text
Description
Abstract: Object rearrangement, a fundamental challenge in robotics, demands versatile strategies to handle diverse objects, configurations, and functional needs. To achieve this, the AI robot needs to learn functional rearrangement priors to specify precise goals that meet the functional requirements. Previous methods typically learn such priors from either laborious human annotations or manually designed heuristics, which limits scalability and generalization. In this letter, we propose a novel approach that leverages large models to distill functional rearrangement priors. Specifically, our approach collects diverse arrangement examples using both LLMs and VLMs and then distills the examples into a diffusion model. During test time, the learned diffusion model is conditioned on the initial configuration and guides the positioning of objects to meet functional requirements. In this way, we balance zero-shot generalization with time efficiency. Extensive experiments in multiple domains, including real-world scenarios, demonstrate the effectiveness of our approach in generating compatible goals for object rearrangement tasks, significantly outperforming baseline methods.
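The test-time procedure described in the abstract (a diffusion model, conditioned on the initial configuration, that iteratively guides object positions toward a functional arrangement) can be illustrated with a toy sketch. Everything below is hypothetical: `toy_score` is a hand-crafted stand-in for the learned score network that the paper distills from LLM/VLM-generated arrangement examples, and the annealed Langevin-style loop is a simplified stand-in for the paper's actual sampler.

```python
import numpy as np

def toy_score(positions, target):
    """Stand-in for the learned score network: gradient of the
    log-density of a Gaussian centred on a target arrangement.
    (In the paper this would be a trained diffusion model.)"""
    return target - positions

def reverse_diffusion(initial, target, steps=50, noise_scale=0.05, seed=0):
    """Simplified annealed Langevin refinement: start from the initial
    object configuration and denoise toward the goal arrangement."""
    rng = np.random.default_rng(seed)
    x = initial.copy()
    for t in range(steps, 0, -1):
        step = 0.1 * t / steps                      # annealed step size
        x = x + step * toy_score(x, target)         # deterministic drift
        x = x + noise_scale * np.sqrt(step) * rng.standard_normal(x.shape)
    return x

# Two objects on a unit tabletop, as (x, y) positions.
initial = np.array([[0.9, 0.1], [0.2, 0.8]])   # scattered configuration
target = np.array([[0.3, 0.3], [0.7, 0.3]])    # hypothetical functional layout
goal = reverse_diffusion(initial, target)
```

The sketch only shows the shape of the inference loop; the substance of the method lies in replacing `toy_score` with a network trained on large-model-generated arrangements.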
DOI: 10.1109/LRA.2024.3438036