Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks

Detailed bibliography
Published in: IEEE Transactions on Robotics, Volume 36, Issue 3, pp. 582-596
Main authors: Lee, Michelle A., Zhu, Yuke, Zachares, Peter, Tan, Matthew, Srinivasan, Krishnan, Savarese, Silvio, Fei-Fei, Li, Garg, Animesh, Bohg, Jeannette
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 June 2020
ISSN: 1552-3098, 1941-0468
Description
Summary: Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. It is nontrivial to manually design a robot controller that combines these modalities, which have very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to train directly on real robots due to sample complexity. In this article, we use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. Evaluating our method on a peg insertion task, we show that it generalizes over varying geometries, configurations, and clearances, while being robust to external perturbations. We also systematically study different self-supervised learning objectives and representation learning architectures. Results are presented in simulation and on a physical robot.
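
Illustrative note: the summary describes fusing camera images with haptic (force/torque) and proprioceptive signals into a compact latent representation consumed by a control policy. The Python/PyTorch snippet below is only a minimal, hypothetical sketch of such a multimodal encoder; the branch sizes, input shapes, and module names are assumptions for illustration and do not reproduce the architecture or the self-supervised objectives from the paper.

# Hypothetical sketch of a multimodal encoder: vision + force/torque +
# proprioception fused into one compact latent vector. Not the authors' model.
import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    def __init__(self, latent_dim=128):
        super().__init__()
        # Vision branch: small CNN over an RGB image (assumed 128x128 input).
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )
        # Haptic branch: MLP over a short window of 6-axis force/torque readings
        # (assumed 32 time steps).
        self.haptics = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 6, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        # Proprioception branch: end-effector state (assumed 7-dimensional).
        self.proprio = nn.Sequential(
            nn.Linear(7, 64), nn.ReLU(), nn.Linear(64, latent_dim),
        )
        # Fusion: concatenate per-modality features and project to the compact
        # multimodal representation that a policy would take as input.
        self.fusion = nn.Sequential(
            nn.Linear(3 * latent_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, image, force_torque, proprio):
        z = torch.cat([
            self.vision(image),
            self.haptics(force_torque),
            self.proprio(proprio),
        ], dim=-1)
        return self.fusion(z)

# Example usage with random tensors standing in for sensor data.
enc = MultimodalEncoder()
z = enc(torch.randn(1, 3, 128, 128),   # RGB image
        torch.randn(1, 32, 6),          # force/torque window
        torch.randn(1, 7))              # proprioceptive state
print(z.shape)  # torch.Size([1, 128])
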
DOI: 10.1109/TRO.2019.2959445