Anytime 3D Object Reconstruction Using Multi-Modal Variational Autoencoder


Bibliographic Details
Published in: IEEE Robotics and Automation Letters, Vol. 7, No. 2, pp. 2162–2169
Main Authors: Yu, Hyeonwoo; Oh, Jean
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 01.04.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN: 2377-3766
Abstract For effective human-robot teaming, it is important for robots to be able to share their visual perception with human operators. In a harsh remote collaboration setting, data compression techniques such as autoencoders can be used to encode and transmit data as latent variables in a compact form. In addition, to ensure real-time performance even in unstable environments, an anytime estimation approach is desired that can reconstruct the full contents from incomplete information. In this context, we propose a method for imputing latent variables whose elements are partially lost. To achieve the anytime property with only a few dimensions of variables, exploiting category-level prior information is essential. The prior distribution used in variational autoencoders is typically assumed to be an isotropic Gaussian regardless of the label of each training datapoint. This flattened prior makes it difficult to perform imputation from category-level distributions. We overcome this limitation by exploiting a category-specific multi-modal prior distribution in the latent space. The missing elements of the partially transferred data can be sampled by finding the specific mode that matches the remaining elements. Since the method is designed to use partial elements for anytime estimation, it can also be applied to data over-compression. In experiments on the ModelNet and Pascal3D datasets, the proposed approach consistently outperforms autoencoder and variational autoencoder baselines at up to 70% data loss. The software is open source and is available from our repository.
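The imputation step the abstract describes — picking the mixture mode (the "specific modal") that best explains the latent elements that survived transmission, then filling in the lost elements from that mode — can be sketched as a conditional Gaussian computation. This is a hypothetical illustration, not the authors' implementation: the function name, the Gaussian-mixture prior with full covariances, and the use of the conditional mean (rather than sampling) are all assumptions.

```python
import numpy as np

def impute_latent(z_obs, obs_idx, means, covs, dim):
    """Impute missing latent dims from a multi-modal (mixture-of-Gaussians) prior.

    Hypothetical sketch: score each mode by the log-density of the observed
    elements under its marginal, then fill the missing elements with the
    conditional Gaussian mean of the best-scoring mode.
    """
    obs_idx = np.asarray(obs_idx)
    mis_idx = np.setdiff1d(np.arange(dim), obs_idx)

    # Select the mode whose observed-dimension marginal best explains z_obs.
    best_mode, best_ll = None, -np.inf
    for mu, cov in zip(means, covs):
        cov_oo = cov[np.ix_(obs_idx, obs_idx)]
        diff = z_obs - mu[obs_idx]
        # Log-density up to a constant: -1/2 (d^T C^-1 d + log|C|)
        ll = -0.5 * (diff @ np.linalg.solve(cov_oo, diff)
                     + np.log(np.linalg.det(cov_oo)))
        if ll > best_ll:
            best_ll, best_mode = ll, (mu, cov)

    mu, cov = best_mode
    cov_oo = cov[np.ix_(obs_idx, obs_idx)]
    cov_mo = cov[np.ix_(mis_idx, obs_idx)]

    # Conditional mean: mu_m + C_mo C_oo^{-1} (z_obs - mu_o)
    z = np.empty(dim)
    z[obs_idx] = z_obs
    z[mis_idx] = mu[mis_idx] + cov_mo @ np.linalg.solve(cov_oo, z_obs - mu[obs_idx])
    return z
```

With isotropic per-mode covariances the conditional mean reduces to the selected mode's mean on the missing dimensions, which matches the intuition of "snapping" the partial code onto a category-specific mode before decoding.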
Author Yu, Hyeonwoo
Oh, Jean
Author_xml – sequence: 1
  givenname: Hyeonwoo
  orcidid: 0000-0002-9505-7581
  surname: Yu
  fullname: Yu, Hyeonwoo
  email: hwyu2019@gmail.com
  organization: School of Electrical and Computer Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
– sequence: 2
  givenname: Jean
  orcidid: 0000-0001-9709-2658
  surname: Oh
  fullname: Oh, Jean
  email: jeanoh@cmu.edu
  organization: Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
CODEN IRALC6
CitedBy_id crossref_primary_10_1016_j_patcog_2023_109674
crossref_primary_10_1109_TMM_2023_3312944
crossref_primary_10_12677_MOS_2023_124375
crossref_primary_10_1109_ACCESS_2025_3562671
crossref_primary_10_3390_s24072314
crossref_primary_10_1007_s10462_023_10687_x
Cites_doi 10.1109/TII.2019.2951622
10.1109/CVPR.2018.00904
10.1016/j.ifacol.2018.09.406
10.1007/978-3-030-50423-6_17
10.1109/3DV50981.2020.00022
10.1016/j.conengprac.2019.104198
10.1007/978-3-030-58536-5_22
10.1109/TRO.2019.2909168
10.1109/ICRA.2019.8794244
10.1007/s10618-020-00706-8
10.1109/ICRA.2018.8460816
10.1109/CVPR.2013.178
10.1093/gigascience/giaa082
10.1109/ICRA.2017.7989203
10.1007/978-3-030-01252-6_4
10.1109/ICRA.2019.8794111
10.1109/ICCV.2017.155
10.1109/WACV.2014.6836101
10.1109/ICCV.2015.314
10.1007/s11042-020-09722-8
10.1109/IROS.2018.8593831
10.1109/CVPR.2015.7298800
10.1109/ICCV.2015.308
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DBID 97E
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
DOI 10.1109/LRA.2022.3142439
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList
Technology Research Database
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library (IEL) (UW System Shared)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2377-3766
EndPage 2169
ExternalDocumentID 10_1109_LRA_2022_3142439
9681277
Genre orig-research
GrantInformation_xml – fundername: Air Force Office of Scientific Research
  grantid: FA2386-17-1-4660
  funderid: 10.13039/100000181
– fundername: US ARMY ACC-APG-RTP
  grantid: W911NF1820218
– fundername: AI-Assisted Detection and Threat Recognition Program
GroupedDBID 0R~
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACGFS
AGQYO
AGSQL
AHBIQ
AKJIK
AKQYR
ALMA_UNASSIGNED_HOLDINGS
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
EJD
IFIPE
IPLJI
JAVBF
KQ8
M43
M~E
O9-
OCL
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
IEDL.DBID RIE
ISICitedReferencesCount 5
ISSN 2377-3766
IngestDate Sun Nov 30 04:07:00 EST 2025
Tue Nov 18 19:37:57 EST 2025
Sat Nov 29 06:03:14 EST 2025
Wed Aug 27 03:00:23 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0002-9505-7581
0000-0001-9709-2658
PQID 2622830715
PQPubID 4437225
PageCount 8
ParticipantIDs crossref_primary_10_1109_LRA_2022_3142439
ieee_primary_9681277
crossref_citationtrail_10_1109_LRA_2022_3142439
proquest_journals_2622830715
PublicationCentury 2000
PublicationDate 2022-04-01
PublicationDateYYYYMMDD 2022-04-01
PublicationDate_xml – month: 04
  year: 2022
  text: 2022-04-01
  day: 01
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE robotics and automation letters
PublicationTitleAbbrev LRA
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References wu (ref35) 0
ref34
ref37
ref15
ref36
ref14
ref30
larsson (ref17) 0
ref32
ref10
ref2
ref1
yu (ref21) 0
kingma (ref26) 0
zilberstein (ref16) 1996; 17
wu (ref11) 0
ref24
ran (ref20) 0
ref25
ref22
wu (ref13) 0
hinton (ref33) 2015
ref28
zhang (ref19) 0
ref27
ref29
ref8
camino (ref23) 0
ref7
ref9
ref4
ref3
ref6
ref5
han (ref31) 0
pontes (ref12) 2017
brock (ref18) 0
References_xml – ident: ref29
  doi: 10.1109/TII.2019.2951622
– ident: ref34
  doi: 10.1109/CVPR.2018.00904
– start-page: 46
  year: 0
  ident: ref21
  article-title: Zero-shot learning via simultaneous generating and learning
  publication-title: Proc Adv Neural Inf Process Syst
– year: 0
  ident: ref18
  article-title: Generative and discriminative voxel modeling with convolutional neural networks
  publication-title: Proc Neural Information Process Conf 3D Deep Learn
– ident: ref27
  doi: 10.1016/j.ifacol.2018.09.406
– ident: ref24
  doi: 10.1007/978-3-030-50423-6_17
– ident: ref30
  doi: 10.1109/3DV50981.2020.00022
– year: 0
  ident: ref17
  article-title: Fractalnet: Ultra-deep neural networks without residuals
  publication-title: Proc Int Conf Learn Representations
– start-page: 82
  year: 0
  ident: ref13
  article-title: Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling
  publication-title: Proc Adv Neural Inf Process Syst
– start-page: 1912
  year: 0
  ident: ref35
  article-title: 3D shapenets: A deep representation for volumetric shapes
  publication-title: Proc IEEE Conf Comput Vis Pattern Recognit
– ident: ref28
  doi: 10.1016/j.conengprac.2019.104198
– volume: 17
  start-page: 73
  year: 1996
  ident: ref16
  article-title: Using anytime algorithms in intelligent systems
  publication-title: AI Mag
– start-page: 540
  year: 0
  ident: ref11
  article-title: MarrNet: 3D shape reconstruction via 2.5D sketches
  publication-title: Proc Adv Neural Inf Process Syst
– year: 0
  ident: ref26
  article-title: Auto-encoding variational bayes
– ident: ref15
  doi: 10.1007/978-3-030-58536-5_22
– ident: ref3
  doi: 10.1109/TRO.2019.2909168
– ident: ref5
  doi: 10.1109/ICRA.2019.8794244
– year: 0
  ident: ref19
  article-title: PVT: Point-voxel transformer for 3D deep learning
– ident: ref25
  doi: 10.1007/s10618-020-00706-8
– ident: ref2
  doi: 10.1109/ICRA.2018.8460816
– ident: ref1
  doi: 10.1109/CVPR.2013.178
– ident: ref22
  doi: 10.1093/gigascience/giaa082
– ident: ref4
  doi: 10.1109/ICRA.2017.7989203
– ident: ref14
  doi: 10.1007/978-3-030-01252-6_4
– ident: ref10
  doi: 10.1109/ICRA.2019.8794111
– ident: ref37
  doi: 10.1109/ICCV.2017.155
– ident: ref36
  doi: 10.1109/WACV.2014.6836101
– ident: ref32
  doi: 10.1109/ICCV.2015.314
– year: 0
  ident: ref31
  article-title: Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding
– ident: ref8
  doi: 10.1007/s11042-020-09722-8
– ident: ref9
  doi: 10.1109/IROS.2018.8593831
– start-page: 15477
  year: 0
  ident: ref20
  article-title: Learning inner-group relations on point clouds
  publication-title: Proc IEEE/CVF Int Conf Comput Vis
– year: 0
  ident: ref23
  article-title: Improving missing data imputation with deep generative models
– year: 2017
  ident: ref12
  article-title: Image2Mesh: A learning framework for single image 3D reconstruction
– ident: ref7
  doi: 10.1109/CVPR.2015.7298800
– ident: ref6
  doi: 10.1109/ICCV.2015.308
– year: 2015
  ident: ref33
  article-title: Distilling the knowledge in a neural network
SSID ssj0001527395
Score 2.238921
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 2162
SubjectTerms 3D object reconstruction
anytime algorithm
Data compression
data imputation
Data loss
Decoding
Estimation
multi-modal variational autoencoder
Real-time systems
Robots
Shape
Three-dimensional displays
Training
Visual perception
Visualization
Title Anytime 3D Object Reconstruction Using Multi-Modal Variational Autoencoder
URI https://ieeexplore.ieee.org/document/9681277
https://www.proquest.com/docview/2622830715
Volume 7
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE/IET Electronic Library (IEL) (UW System Shared)
  customDbUrl:
  eissn: 2377-3766
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0001527395
  issn: 2377-3766
  databaseCode: RIE
  dateStart: 20160101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE
– providerCode: PRVHPJ
  databaseName: ROAD: Directory of Open Access Scholarly Resources
  customDbUrl:
  eissn: 2377-3766
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0001527395
  issn: 2377-3766
  databaseCode: M~E
  dateStart: 20160101
  isFulltext: true
  titleUrlDefault: https://road.issn.org
  providerName: ISSN International Centre