Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks

Detailed bibliography
Published in: IEEE Transactions on Robotics, Volume 36, Issue 3, pp. 582-596
Main authors: Lee, Michelle A., Zhu, Yuke, Zachares, Peter, Tan, Matthew, Srinivasan, Krishnan, Savarese, Silvio, Fei-Fei, Li, Garg, Animesh, Bohg, Jeannette
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 June 2020
ISSN: 1552-3098, 1941-0468
Description
Summary: Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. It is nontrivial to manually design a robot controller that combines these modalities, which have very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to train directly on real robots due to sample complexity. In this article, we use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. Evaluating our method on a peg insertion task, we show that it generalizes over varying geometries, configurations, and clearances, while being robust to external perturbations. We also systematically study different self-supervised learning objectives and representation learning architectures. Results are presented in simulation and on a physical robot.
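
Illustrative note: the summary describes fusing camera images with haptic (force/torque) and proprioceptive signals into a compact latent representation consumed by a control policy. The Python/PyTorch snippet below is only a minimal, hypothetical sketch of such a multimodal encoder; the branch sizes, input shapes, and module names are assumptions for illustration and do not reproduce the architecture or the self-supervised objectives from the paper.

# Hypothetical sketch of a multimodal encoder: vision + force/torque +
# proprioception fused into one compact latent vector. Not the authors' model.
import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    def __init__(self, latent_dim=128):
        super().__init__()
        # Vision branch: small CNN over an RGB image (assumed 128x128 input).
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )
        # Haptic branch: MLP over a short window of 6-axis force/torque readings
        # (assumed 32 time steps).
        self.haptics = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 6, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        # Proprioception branch: end-effector state (assumed 7-dimensional).
        self.proprio = nn.Sequential(
            nn.Linear(7, 64), nn.ReLU(), nn.Linear(64, latent_dim),
        )
        # Fusion: concatenate per-modality features and project to the compact
        # multimodal representation that a policy would take as input.
        self.fusion = nn.Sequential(
            nn.Linear(3 * latent_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, image, force_torque, proprio):
        z = torch.cat([
            self.vision(image),
            self.haptics(force_torque),
            self.proprio(proprio),
        ], dim=-1)
        return self.fusion(z)

# Example usage with random tensors standing in for sensor data.
enc = MultimodalEncoder()
z = enc(torch.randn(1, 3, 128, 128),   # RGB image
        torch.randn(1, 32, 6),          # force/torque window
        torch.randn(1, 7))              # proprioceptive state
print(z.shape)  # torch.Size([1, 128])
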
DOI: 10.1109/TRO.2019.2959445