Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks



Detailed bibliography
Published in: IEEE Transactions on Robotics, Volume 36, Issue 3, pp. 582-596
Main authors: Lee, Michelle A., Zhu, Yuke, Zachares, Peter, Tan, Matthew, Srinivasan, Krishnan, Savarese, Silvio, Fei-Fei, Li, Garg, Animesh, Bohg, Jeannette
Format: Journal Article
Language: English
Publication details: New York: IEEE, 01.06.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN:1552-3098, 1941-0468
Online access: Get full text
Abstract Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. It is nontrivial to manually design a robot controller that combines these modalities, which have very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to train directly on real robots due to sample complexity. In this article, we use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. Evaluating our method on a peg insertion task, we show that it generalizes over varying geometries, configurations, and clearances, while being robust to external perturbations. We also systematically study different self-supervised learning objectives and representation learning architectures. Results are presented in simulation and on a physical robot.
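
The abstract describes fusing camera images and haptic (force/torque) readings into a compact multimodal latent state, trained with self-supervised objectives and then consumed by a reinforcement-learning policy. Below is a minimal sketch of such a fusion encoder, assuming PyTorch; the layer sizes, the 128x128 image input, the 6-D wrench input, and the contact/pairing prediction heads are illustrative assumptions, simplified stand-ins for the self-supervised objectives and architecture studied in the article, not the authors' reference implementation.

# Minimal sketch of a vision + haptics fusion encoder with self-supervised heads.
# All dimensions and objective heads below are assumptions for illustration only.
import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    def __init__(self, latent_dim=128, ft_dim=6):
        super().__init__()
        # Vision branch: small CNN over 128x128 RGB frames (input size is an assumption).
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )
        # Haptics branch: MLP over a force/torque (wrench) reading.
        self.haptics = nn.Sequential(
            nn.Linear(ft_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        # Fusion into a compact multimodal latent that a policy could consume.
        self.fuse = nn.Sequential(
            nn.Linear(2 * latent_dim, latent_dim), nn.ReLU(),
        )
        # Illustrative self-supervised heads: predict whether contact occurs,
        # and whether the two modalities come from the same time step ("pairing").
        self.contact_head = nn.Linear(latent_dim, 1)
        self.pairing_head = nn.Linear(latent_dim, 1)

    def forward(self, image, wrench):
        z = self.fuse(torch.cat([self.vision(image), self.haptics(wrench)], dim=-1))
        return z, self.contact_head(z), self.pairing_head(z)

if __name__ == "__main__":
    enc = MultimodalEncoder()
    img = torch.randn(4, 3, 128, 128)   # batch of RGB frames
    ft = torch.randn(4, 6)              # batch of force/torque readings
    z, contact_logit, pairing_logit = enc(img, ft)
    print(z.shape, contact_logit.shape) # torch.Size([4, 128]) torch.Size([4, 1])

The self-supervised heads would be trained on labels derivable from the robot's own data (e.g., measured contact events, shuffled versus aligned modality pairs), so that the latent z can be reused to make downstream policy learning more sample efficient.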
Author Srinivasan, Krishnan
Savarese, Silvio
Zhu, Yuke
Zachares, Peter
Lee, Michelle A.
Tan, Matthew
Fei-Fei, Li
Bohg, Jeannette
Garg, Animesh
Author_xml – sequence: 1
  givenname: Michelle A.
  orcidid: 0000-0002-9893-3591
  surname: Lee
  fullname: Lee, Michelle A.
  email: mishlee@stanford.edu
  organization: Department of Computer Science, Stanford University, Stanford, CA, USA
– sequence: 2
  givenname: Yuke
  surname: Zhu
  fullname: Zhu, Yuke
  email: yukez@cs.stanford.edu
  organization: Department of Computer Science, Stanford University, Stanford, CA, USA
– sequence: 3
  givenname: Peter
  surname: Zachares
  fullname: Zachares, Peter
  email: zachares@stanford.edu
  organization: Department of Computer Science, Stanford University, Stanford, CA, USA
– sequence: 4
  givenname: Matthew
  surname: Tan
  fullname: Tan, Matthew
  email: mratan@stanford.edu
  organization: Department of Computer Science, Stanford University, Stanford, CA, USA
– sequence: 5
  givenname: Krishnan
  surname: Srinivasan
  fullname: Srinivasan, Krishnan
  email: krshna@stanford.edu
  organization: Department of Computer Science, Stanford University, Stanford, CA, USA
– sequence: 6
  givenname: Silvio
  surname: Savarese
  fullname: Savarese, Silvio
  email: ssilvio@stanford.edu
  organization: Department of Computer Science, Stanford University, Stanford, CA, USA
– sequence: 7
  givenname: Li
  surname: Fei-Fei
  fullname: Fei-Fei, Li
  email: feifeili@stanford.edu
  organization: Department of Computer Science, Stanford University, Stanford, CA, USA
– sequence: 8
  givenname: Animesh
  orcidid: 0000-0003-0482-4296
  surname: Garg
  fullname: Garg, Animesh
  email: garg@cs.toronto.edu
  organization: Nvidia Research, Santa Clara, CA, USA
– sequence: 9
  givenname: Jeannette
  orcidid: 0000-0002-4921-7193
  surname: Bohg
  fullname: Bohg, Jeannette
  email: bohg@stanford.edu
  organization: Department of Computer Science, Stanford University, Stanford, CA, USA
CODEN ITREAE
CitedBy_id crossref_primary_10_1109_TASE_2023_3312657
crossref_primary_10_1109_JSYST_2023_3317328
crossref_primary_10_3390_machines13070605
crossref_primary_10_1109_TMECH_2023_3336520
crossref_primary_10_1007_s42235_022_00320_y
crossref_primary_10_1109_TRO_2022_3168731
crossref_primary_10_1016_j_aei_2024_102596
crossref_primary_10_1016_j_jmsy_2025_06_014
crossref_primary_10_1109_TMECH_2023_3336316
crossref_primary_10_1109_TIE_2021_3084168
crossref_primary_10_1016_j_rcim_2024_102842
crossref_primary_10_1109_TCDS_2023_3237734
crossref_primary_10_1007_s10846_023_01815_4
crossref_primary_10_1177_02783649251344638
crossref_primary_10_1051_e3sconf_202339904044
crossref_primary_10_3390_s24134378
crossref_primary_10_1007_s00500_021_06190_6
crossref_primary_10_1109_TCSI_2024_3521547
crossref_primary_10_1109_TCYB_2023_3310505
crossref_primary_10_1016_j_jmsy_2023_11_008
crossref_primary_10_1111_exsy_13502
crossref_primary_10_1109_LRA_2024_3487490
crossref_primary_10_1109_TMECH_2023_3264650
crossref_primary_10_1126_scirobotics_adi8808
crossref_primary_10_1109_TRO_2025_3531816
crossref_primary_10_1177_02783649241273565
crossref_primary_10_1109_LRA_2023_3261759
crossref_primary_10_1109_TIE_2024_3357894
crossref_primary_10_1109_TRO_2022_3226149
crossref_primary_10_1109_LRA_2022_3146589
crossref_primary_10_1109_LRA_2025_3583608
crossref_primary_10_1109_TOH_2024_3384482
crossref_primary_10_1016_j_neunet_2024_106347
crossref_primary_10_1016_j_robot_2024_104832
crossref_primary_10_1016_j_birob_2025_100217
crossref_primary_10_1016_j_cej_2025_167677
crossref_primary_10_1109_LRA_2024_3398428
crossref_primary_10_1007_s10489_024_05417_x
crossref_primary_10_1016_j_apm_2025_116003
crossref_primary_10_1109_LRA_2024_3397531
crossref_primary_10_1007_s10846_023_01984_2
crossref_primary_10_1109_LRA_2020_3038377
crossref_primary_10_1109_LRA_2024_3396368
crossref_primary_10_1109_TIT_2025_3532280
crossref_primary_10_3390_biomimetics8010086
crossref_primary_10_1016_j_procir_2021_11_035
crossref_primary_10_1108_TQM_09_2024_0342
crossref_primary_10_1016_j_engappai_2024_108603
crossref_primary_10_1016_j_eswa_2022_118441
crossref_primary_10_1109_ACCESS_2022_3183609
crossref_primary_10_1145_3626954
crossref_primary_10_1108_IR_03_2025_0089
crossref_primary_10_1109_LRA_2022_3150511
crossref_primary_10_3389_fnbot_2023_1280773
crossref_primary_10_1109_TIM_2024_3470030
crossref_primary_10_1109_ACCESS_2020_3028740
crossref_primary_10_3390_s23115344
crossref_primary_10_1109_LRA_2024_3359542
crossref_primary_10_1016_j_neunet_2025_107202
crossref_primary_10_1109_TOH_2023_3269086
crossref_primary_10_1109_LRA_2022_3159163
crossref_primary_10_1007_s11633_025_1542_8
crossref_primary_10_1109_LRA_2021_3061982
crossref_primary_10_1109_TII_2022_3224966
crossref_primary_10_1109_LRA_2024_3366023
crossref_primary_10_3390_s21113818
crossref_primary_10_1109_TMECH_2024_3452509
crossref_primary_10_3389_fnbot_2023_1320251
crossref_primary_10_1016_j_actaastro_2025_01_017
crossref_primary_10_1016_j_asr_2024_08_028
crossref_primary_10_1109_TII_2024_3353797
crossref_primary_10_1109_TCDS_2023_3277288
crossref_primary_10_1109_TIE_2023_3269464
crossref_primary_10_7717_peerj_cs_383
crossref_primary_10_3389_fnbot_2024_1478181
crossref_primary_10_1109_JIOT_2024_3396401
crossref_primary_10_1109_LRA_2021_3100269
crossref_primary_10_1109_ACCESS_2024_3493755
crossref_primary_10_1109_LRA_2024_3379803
crossref_primary_10_1016_j_rcim_2022_102517
Cites_doi 10.2991/978-94-6239-133-8_25
10.1109/ICRA.2019.8794048
10.1177/027836498700600101
10.1109/IROS.2010.5652967
10.1109/ICRA.2017.7989384
10.1109/IROS.2015.7354090
10.1177/027836499501400103
10.1109/HUMANOIDS.2016.7803371
10.1007/978-3-030-01231-1_39
10.1109/TRO.2011.2162271
10.1109/CVPR.2016.264
10.1177/0278364914559753
10.1109/IROS.2016.7759578
10.1109/TPAMI.2018.2889774
10.1109/IROS.2011.6095096
10.1109/ICRA.2019.8793763
10.1109/TRO.2018.2819658
10.1109/IROS.2011.6094878
10.1177/027836499000900603
10.1007/978-3-030-33950-0_37
10.1109/ROBOT.1989.100031
10.1109/LRA.2018.2852779
10.1109/IROS40897.2019.8968201
10.15607/RSS.2018.XIV.009
10.1109/ICRA.2019.8794285
10.1109/ICRA.2016.7487176
10.1109/TRO.2017.2721939
10.1177/0278364919887447
10.1109/IROS.2014.6943202
10.1109/IROS.2018.8594077
10.1007/s10514-013-9365-9
10.15607/RSS.2015.XI.044
10.1111/j.0956-7976.2004.00691.x
10.1109/ICRA.2014.6907696
10.1016/j.neunet.2018.07.006
10.1109/HUMANOIDS.2015.7363558
10.1007/978-3-319-28872-7_16
10.1109/ICRA.2018.8460875
10.1109/ICRA.2018.8460528
10.1109/ICRA.2019.8794233
10.1177/0278364913506757
10.1109/ICRA.2019.8794219
10.1115/1.3149634
10.1126/scirobotics.aav3123
10.1109/IROS.2016.7759592
10.1038/nature14539
10.1109/ICRA.2019.8793485
10.1109/LRA.2016.2645124
10.1109/CVPR.2017.538
10.1109/LRA.2018.2800101
10.1109/IROS.2017.8206165
10.1007/978-3-030-28619-4_41
10.1109/CVPR.2019.01086
10.1109/ICRA.2017.7989326
10.1109/HUMANOIDS.2015.7363524
10.1109/ICRA.2017.7989324
10.1109/ICRA.2013.6630999
10.1007/s10514-015-9435-2
10.1109/IROS.2018.8593430
10.1109/ICRA.2019.8793520
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DBID 97E
RIA
RIE
AAYXX
CITATION
7SC
7SP
7TB
8FD
FR3
JQ2
L7M
L~C
L~D
DOI 10.1109/TRO.2019.2959445
DatabaseName IEEE Xplore (IEEE)
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Mechanical & Transportation Engineering Abstracts
Technology Research Database
Engineering Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Mechanical & Transportation Engineering Abstracts
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Engineering Research Database
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList Technology Research Database

Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 1941-0468
EndPage 596
ExternalDocumentID 10_1109_TRO_2019_2959445
9043710
Genre orig-research
GrantInformation_xml – fundername: Toyota Research Institute
  funderid: 10.13039/100015599
– fundername: American Technologies Corporation
GroupedDBID .DC
0R~
29I
4.4
5GY
5VS
6IK
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACGFO
ACIWK
AENEX
AETIX
AGQYO
AGSQL
AHBIQ
AIBXA
AKJIK
AKQYR
ALMA_UNASSIGNED_HOLDINGS
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CS3
DU5
EBS
EJD
F5P
HZ~
H~9
IFIPE
IPLJI
JAVBF
LAI
M43
MS~
O9-
OCL
P2P
PQQKQ
RIA
RIE
RNS
VJK
AAYXX
CITATION
7SC
7SP
7TB
8FD
FR3
JQ2
L7M
L~C
L~D
ID FETCH-LOGICAL-c357t-75db2f3ae449afaaeff0de1febf1b0140804485168004949e6095accc13fff483
IEDL.DBID RIE
ISICitedReferencesCount 151
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000543027200001
ISSN 1552-3098
IngestDate Sun Nov 09 05:39:02 EST 2025
Tue Nov 18 22:34:46 EST 2025
Sat Nov 29 01:47:26 EST 2025
Wed Aug 27 02:41:18 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 3
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c357t-75db2f3ae449afaaeff0de1febf1b0140804485168004949e6095accc13fff483
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0002-9893-3591
0000-0003-0482-4296
0000-0002-4921-7193
PQID 2410517290
PQPubID 27625
PageCount 15
ParticipantIDs proquest_journals_2410517290
ieee_primary_9043710
crossref_primary_10_1109_TRO_2019_2959445
crossref_citationtrail_10_1109_TRO_2019_2959445
PublicationCentury 2000
PublicationDate 2020-June
2020-6-00
20200601
PublicationDateYYYYMMDD 2020-06-01
PublicationDate_xml – month: 06
  year: 2020
  text: 2020-June
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on robotics
PublicationTitleAbbrev TRO
PublicationYear 2020
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References wu (ref60) 0
ref57
oh (ref40) 0
ref13
ref12
ref58
ref14
ref53
ref55
ref11
agrawal (ref38) 0
ref54
ref10
oord (ref67) 2016
fischer (ref65) 0
ref17
ref16
ref19
ref18
ref50
kostrikov (ref86) 2018
liu (ref52) 0
kingma (ref85) 0
ref46
ref45
ref48
ref47
ref42
ref44
ref43
levine (ref7) 2016; 17
ref49
ref8
ref9
ref4
ref3
ref6
ref5
ref81
ref84
wüthrich (ref83) 0
suzuki (ref61) 2016
ghosh (ref82) 2019
ref79
ref35
ref78
ref34
ref37
ref36
kingma (ref63) 0
ngiam (ref41) 0
haarnoja (ref75) 0
ref31
ref74
ref30
ref77
ref33
ref32
srivastava (ref59) 0
ref2
lecun (ref62) 2015; 521
ref1
bohg (ref51) 2011
ref73
ref72
conti (ref80) 0
ref24
ref23
ref26
simonyan (ref66) 0
ref69
ref25
ref64
ref20
schulman (ref76) 2017
calandra (ref15) 0
schulman (ref71) 0
ref22
ref21
ref28
ref27
ref29
lillicrap (ref70) 0
babaeizadeh (ref39) 0
edelman (ref56) 1987
thrun (ref68) 2005
References_xml – year: 0
  ident: ref39
  article-title: Stochastic variational video prediction
  publication-title: Proc Intl Conf on Learning Representations
– ident: ref57
  doi: 10.2991/978-94-6239-133-8_25
– year: 2018
  ident: ref86
  article-title: Pytorch implementations of reinforcement learning algorithms
– ident: ref47
  doi: 10.1109/ICRA.2019.8794048
– ident: ref12
  doi: 10.1177/027836498700600101
– ident: ref50
  doi: 10.1109/IROS.2010.5652967
– ident: ref8
  doi: 10.1109/ICRA.2017.7989384
– ident: ref4
  doi: 10.1109/IROS.2015.7354090
– ident: ref79
  doi: 10.1177/027836499501400103
– ident: ref73
  doi: 10.1109/HUMANOIDS.2016.7803371
– ident: ref42
  doi: 10.1007/978-3-030-01231-1_39
– start-page: 5575
  year: 0
  ident: ref60
  article-title: Multimodal generative models for scalable weakly-supervised learning
  publication-title: Proc Adv Neural Inf Process Syst
– start-page: 496
  year: 0
  ident: ref80
  article-title: The chai libraries
  publication-title: Proc Eurohaptics
– ident: ref3
  doi: 10.1109/TRO.2011.2162271
– ident: ref43
  doi: 10.1109/CVPR.2016.264
– ident: ref78
  doi: 10.1177/0278364914559753
– start-page: 5074
  year: 0
  ident: ref38
  article-title: Learning to poke by poking: Experiential learning of intuitive physics
  publication-title: Proc Adv Neural Inf Process Syst
– year: 2005
  ident: ref68
  publication-title: Probabilistic Robotics
– start-page: 314
  year: 0
  ident: ref15
  article-title: The feeling of success: Does touch sensing help predict grasp outcomes?
  publication-title: Proc Conf Robot Learn
– ident: ref20
  doi: 10.1109/IROS.2016.7759578
– ident: ref64
  doi: 10.1109/TPAMI.2018.2889774
– ident: ref18
  doi: 10.1109/IROS.2011.6095096
– ident: ref53
  doi: 10.1109/ICRA.2019.8793763
– year: 1987
  ident: ref56
  publication-title: Neural Darwinism: The Theory of Neuronal Group Selection
– ident: ref74
  doi: 10.1109/TRO.2018.2819658
– ident: ref24
  doi: 10.1109/IROS.2011.6094878
– ident: ref2
  doi: 10.1177/027836499000900603
– year: 0
  ident: ref63
  article-title: Auto-encoding variational Bayes
  publication-title: Proc Intl Conf on Learning Representations
– ident: ref33
  doi: 10.1007/978-3-030-33950-0_37
– year: 2017
  ident: ref76
  article-title: Proximal policy optimization algorithms
  publication-title: CoRR
– start-page: 1856
  year: 0
  ident: ref75
  article-title: Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor
  publication-title: Proc Int Conf Mach Learn
– ident: ref14
  doi: 10.1109/ROBOT.1989.100031
– volume: 17
  start-page: 1334
  year: 2016
  ident: ref7
  article-title: End-to-end training of deep visuomotor policies
  publication-title: J Mach Learn Res
– ident: ref25
  doi: 10.1109/LRA.2018.2852779
– ident: ref84
  doi: 10.1109/IROS40897.2019.8968201
– ident: ref10
  doi: 10.15607/RSS.2018.XIV.009
– year: 0
  ident: ref85
  article-title: Adam: A method for stochastic optimization
  publication-title: Proc 3rd Int Conf Learn Representations
– ident: ref55
  doi: 10.1109/ICRA.2019.8794285
– ident: ref44
  doi: 10.1109/ICRA.2016.7487176
– year: 0
  ident: ref70
  article-title: Continuous control with deep reinforcement learning
  publication-title: Proc Intl Conf on Learning Representations
– ident: ref58
  doi: 10.1109/TRO.2017.2721939
– ident: ref29
  doi: 10.1177/0278364919887447
– ident: ref5
  doi: 10.1109/IROS.2014.6943202
– year: 0
  ident: ref66
  article-title: Very deep convolutional networks for large-scale image recognition
  publication-title: Proc 3rd Int Conf Learn Representations
– ident: ref48
  doi: 10.1109/IROS.2018.8594077
– start-page: 2758
  year: 0
  ident: ref65
  article-title: Flownet: Learning optical flow with convolutional networks
  publication-title: Proc IEEE Int Conf Comput Vision
– ident: ref77
  doi: 10.1007/s10514-013-9365-9
– ident: ref27
  doi: 10.15607/RSS.2015.XI.044
– year: 2016
  ident: ref67
  article-title: Wavenet: A generative model for raw audio
  publication-title: Proc 9th Speech Synthesis Workshop
– ident: ref1
  doi: 10.1111/j.0956-7976.2004.00691.x
– ident: ref45
  doi: 10.1109/ICRA.2014.6907696
– ident: ref35
  doi: 10.1016/j.neunet.2018.07.006
– ident: ref17
  doi: 10.1109/HUMANOIDS.2015.7363558
– ident: ref81
  doi: 10.1007/978-3-319-28872-7_16
– year: 2019
  ident: ref82
  article-title: From variational to deterministic autoencoders
– ident: ref31
  doi: 10.1109/ICRA.2018.8460875
– ident: ref30
  doi: 10.1109/ICRA.2018.8460528
– ident: ref26
  doi: 10.1109/ICRA.2019.8794233
– ident: ref72
  doi: 10.1177/0278364913506757
– ident: ref23
  doi: 10.1109/ICRA.2019.8794219
– ident: ref13
  doi: 10.1115/1.3149634
– ident: ref6
  doi: 10.1126/scirobotics.aav3123
– ident: ref32
  doi: 10.1109/IROS.2016.7759592
– start-page: 689
  year: 0
  ident: ref41
  article-title: Multimodal deep learning
  publication-title: Proc 28th Int Conf Mach Learn
– volume: 521
  start-page: 436
  year: 2015
  ident: ref62
  article-title: Deep learning
  publication-title: Nature
  doi: 10.1038/nature14539
– ident: ref11
  doi: 10.1109/ICRA.2019.8793485
– ident: ref69
  doi: 10.1109/LRA.2016.2645124
– ident: ref37
  doi: 10.1109/CVPR.2017.538
– ident: ref36
  doi: 10.1109/LRA.2018.2800101
– ident: ref46
  doi: 10.1109/IROS.2017.8206165
– year: 2011
  ident: ref51
  article-title: Multi-modal scene understanding for robotic grasping
– ident: ref34
  doi: 10.1007/978-3-030-28619-4_41
– start-page: 3195
  year: 0
  ident: ref83
  article-title: Probabilistic object tracking using a range camera
  publication-title: Proc IEEE/RSJ Int Conf Intell Robots Syst
– start-page: 249
  year: 0
  ident: ref52
  article-title: Learning end-to-end multimodal sensor policies for autonomous navigation
  publication-title: Proc Conf Robot Learn
– ident: ref54
  doi: 10.1109/CVPR.2019.01086
– ident: ref19
  doi: 10.1109/ICRA.2017.7989326
– ident: ref21
  doi: 10.1109/HUMANOIDS.2015.7363524
– year: 2016
  ident: ref61
  article-title: Joint multimodal learning with deep generative models
– start-page: 2863
  year: 0
  ident: ref40
  article-title: Action-conditional video prediction using deep networks in atari games
  publication-title: Proc Adv Neural Inf Process Syst
– ident: ref9
  doi: 10.1109/ICRA.2017.7989324
– start-page: 2222
  year: 0
  ident: ref59
  article-title: Multimodal learning with deep boltzmann machines
  publication-title: Proc Adv Neural Inf Process Syst
– ident: ref16
  doi: 10.1109/ICRA.2013.6630999
– ident: ref28
  doi: 10.1007/s10514-015-9435-2
– ident: ref49
  doi: 10.1109/IROS.2018.8593430
– ident: ref22
  doi: 10.1109/ICRA.2019.8793520
– start-page: 1889
  year: 0
  ident: ref71
  article-title: Trust region policy optimization
  publication-title: Proc Int Conf Mach Learn
SSID ssj0024903
Score 2.6731412
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 582
SubjectTerms Algorithms
Clearances
Computer simulation
Control systems design
Deep learning in robotics and automation
Haptic interfaces
Machine learning
perception for grasping and manipulation
Reinforcement learning
Representations
Robot sensing systems
Robots
sensor fusion
sensor-based control
Solid modeling
Task analysis
Visualization
Title Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks
URI https://ieeexplore.ieee.org/document/9043710
https://www.proquest.com/docview/2410517290
Volume 36
WOSCitedRecordID wos000543027200001
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 1941-0468
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0024903
  issn: 1552-3098
  databaseCode: RIE
  dateStart: 20040101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE
linkProvider IEEE