Private-Shared Disentangled Multimodal VAE for Learning of Latent Representations
| Published in: | IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops, pp. 1692-1700 |
|---|---|
| Main Authors: | Lee, Mihee; Pavlovic, Vladimir |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.06.2021 |
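| Subjects: | Computational modeling; Computer vision; Conferences; Data models; Internet; Pattern recognition; Task analysis |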
| ISSN: | 2160-7516 |
| Abstract | Multi-modal generative models represent an important family of deep models, whose goal is to facilitate representation learning on data with multiple views or modalities. However, current deep multi-modal models focus on the inference of shared representations, while neglecting the important private aspects of data within individual modalities. In this paper, we introduce a disentangled multi-modal variational autoencoder (DMVAE) that uses a disentangled VAE strategy to separate the private and shared latent spaces of multiple modalities. We demonstrate the utility of DMVAE on two image modalities, from the MNIST and Google Street View House Numbers (SVHN) datasets, as well as on image and text modalities from the Oxford-102 Flowers dataset. Our experiments indicate that retaining the private representation, together with the private-shared disentanglement, is essential for effectively directing information across multiple analysis-synthesis conduits. |
|---|---|
| Author | Pavlovic, Vladimir; Lee, Mihee |
| Author_xml | – sequence: 1 givenname: Mihee surname: Lee fullname: Lee, Mihee email: ml1323@rutgers.edu organization: Rutgers University,Piscataway,NJ,USA – sequence: 2 givenname: Vladimir surname: Pavlovic fullname: Pavlovic, Vladimir email: vladimir@cs.rutgers.edu organization: Rutgers University,Piscataway,NJ,USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CVPRW53098.2021.00185 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
| Discipline | Applied Sciences |
| EISBN | 1665448997; 9781665448994 |
| EISSN | 2160-7516 |
| EndPage | 1700 |
| ExternalDocumentID | 9523016 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Science Foundation funderid: 10.13039/100000001 |
| ISICitedReferencesCount | 28 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| PageCount | 9 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-June |
| PublicationDateYYYYMMDD | 2021-06-01 |
| PublicationDate_xml | – month: 06 year: 2021 text: 2021-June |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops |
| PublicationTitleAbbrev | CVPRW |
| PublicationYear | 2021 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 1692 |
| SubjectTerms | Computational modeling; Computer vision; Conferences; Data models; Internet; Pattern recognition; Task analysis |
| Title | Private-Shared Disentangled Multimodal VAE for Learning of Latent Representations |
| URI | https://ieeexplore.ieee.org/document/9523016 |