Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks
| Published in: | IEEE Transactions on Robotics, Vol. 36, No. 3, pp. 582–596 |
|---|---|
| Main authors: | Michelle A. Lee, Yuke Zhu, Peter Zachares, Matthew Tan, Krishnan Srinivasan, Silvio Savarese, Li Fei-Fei, Animesh Garg, Jeannette Bohg |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 1 June 2020 |
| Subjects: | Algorithms; Clearances; Computer simulation; Control systems design; Deep learning in robotics and automation; Haptic interfaces; Machine learning; Perception for grasping and manipulation; Reinforcement learning; Representations; Robot sensing systems; Robots; Sensor fusion; Sensor-based control; Solid modeling; Task analysis; Visualization |
| ISSN: | 1552-3098, 1941-0468 |
| Online access: | Get full text: https://ieeexplore.ieee.org/document/9043710 · https://www.proquest.com/docview/2410517290 |
| Abstract | Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. It is nontrivial to manually design a robot controller that combines these modalities, which have very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to train directly on real robots due to sample complexity. In this article, we use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. Evaluating our method on a peg insertion task, we show that it generalizes over varying geometries, configurations, and clearances, while being robust to external perturbations. We also systematically study different self-supervised learning objectives and representation learning architectures. Results are presented in simulation and on a physical robot. |
|---|---|
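To make the pipeline in the abstract concrete, below is a minimal sketch of the core idea: encode camera images and force/torque readings into a single compact latent vector, trained with self-supervised heads (e.g., predicting contact and whether the two modality streams are time-aligned) so that a downstream policy sees only a low-dimensional state. This is not the authors' released implementation; the architecture, layer sizes, input shapes, and choice of loss heads are illustrative assumptions.

```python
# Illustrative sketch only -- layer sizes, input shapes, and the choice of
# self-supervised heads are assumptions, not the paper's exact architecture.
import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    """Fuse vision and haptics into one compact latent (hypothetical sizes)."""
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        # Vision branch: small CNN over 128x128 RGB frames.
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )
        # Haptic branch: MLP over a window of 32 six-axis F/T readings.
        self.haptic = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 6, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        # Fusion of the two modality embeddings into one latent state z.
        self.fuse = nn.Sequential(
            nn.Linear(2 * latent_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )
        # Self-supervised heads: will the next step make contact, and do
        # the two modality streams come from the same time step?
        self.contact_head = nn.Linear(latent_dim, 1)
        self.align_head = nn.Linear(latent_dim, 1)

    def forward(self, rgb: torch.Tensor, ft: torch.Tensor):
        z = self.fuse(torch.cat([self.vision(rgb), self.haptic(ft)], dim=-1))
        return z, self.contact_head(z), self.align_head(z)

# Self-supervised labels come for free from the robot's own data stream.
enc = MultimodalEncoder()
rgb = torch.randn(8, 3, 128, 128)   # batch of camera frames
ft = torch.randn(8, 32, 6)          # batch of force/torque windows
z, contact_logit, align_logit = enc(rgb, ft)
contact_label = torch.randint(0, 2, (8, 1)).float()  # placeholder labels
loss = nn.functional.binary_cross_entropy_with_logits(contact_logit, contact_label)
loss.backward()
print(z.shape)  # torch.Size([8, 128]): compact input for policy learning
```

The key design point is that such labels (contact events, temporal alignment of the modality streams) can be harvested automatically from the robot's own sensor logs, so the representation can be trained without manual annotation before the reinforcement-learning stage, which is what makes the downstream policy learning sample-efficient.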
| Author | Affiliation / contact |
|---|---|
| Michelle A. Lee (ORCID 0000-0002-9893-3591) | Department of Computer Science, Stanford University, Stanford, CA, USA; mishlee@stanford.edu |
| Yuke Zhu | Department of Computer Science, Stanford University, Stanford, CA, USA; yukez@cs.stanford.edu |
| Peter Zachares | Department of Computer Science, Stanford University, Stanford, CA, USA; zachares@stanford.edu |
| Matthew Tan | Department of Computer Science, Stanford University, Stanford, CA, USA; mratan@stanford.edu |
| Krishnan Srinivasan | Department of Computer Science, Stanford University, Stanford, CA, USA; krshna@stanford.edu |
| Silvio Savarese | Department of Computer Science, Stanford University, Stanford, CA, USA; ssilvio@stanford.edu |
| Li Fei-Fei | Department of Computer Science, Stanford University, Stanford, CA, USA; feifeili@stanford.edu |
| Animesh Garg (ORCID 0000-0003-0482-4296) | Nvidia Research, Santa Clara, CA, USA; garg@cs.toronto.edu |
| Jeannette Bohg (ORCID 0000-0002-4921-7193) | Department of Computer Science, Stanford University, Stanford, CA, USA; bohg@stanford.edu |
| CODEN | ITREAE |
| Copyright | © The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2020 |
| DOI | 10.1109/TRO.2019.2959445 |
| Discipline | Engineering |
| EISSN | 1941-0468 |
| Genre | Original research |
| Funding | Toyota Research Institute (10.13039/100015599); American Technologies Corporation |
| Cited in Web of Science | 151 times |
| ISSN | 1552-3098 |
| Peer reviewed | Yes |
| Scholarly | Yes |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html; https://doi.org/10.15223/policy-029; https://doi.org/10.15223/policy-037 |