ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders

Bibliographic Details
Published in: Proceedings / IEEE Workshop on Applications of Computer Vision, pp. 889-898
Main Authors: Saha, Surojit; Joshi, Sarang; Whitaker, Ross
Format: Conference Proceeding
Language: English
Published: IEEE, 26.02.2025
ISSN: 2642-9381
Abstract: The variational autoencoder (VAE) [19], [41] is a popular deep latent-variable model (DLVM) because of its simple yet effective formulation for modeling the data distribution. Moreover, optimizing the VAE objective function is more manageable than optimizing those of other DLVMs. The bottleneck dimension of the VAE is a crucial design choice with strong ramifications for the model's performance, such as its ability to find the hidden explanatory factors of a dataset from the learned representations. However, the size of the latent dimension is often treated as a hyperparameter estimated empirically through trial and error. To this end, we propose a statistical formulation to discover the relevant latent factors required for modeling a dataset. We use a hierarchical prior in the latent space that estimates the variance of the latent axes from the encoded data, which identifies the relevant latent dimensions. To do so, we replace the fixed prior in the VAE objective function with a hierarchical prior, keeping the remainder of the formulation unchanged. We call the proposed method automatic relevancy detection in the variational autoencoder (ARD-VAE; https://github.com/Surojit-Utah/ARD-VAE). We demonstrate the efficacy of the ARD-VAE on multiple benchmark datasets in finding the relevant latent dimensions and show their effect on different evaluation metrics, such as FID score and disentanglement analysis.
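Illustrative sketch (not taken from the paper or its repository): the abstract describes replacing the fixed prior with one whose per-axis variance is estimated from encoded data. The NumPy snippet below shows the general idea under simplifying assumptions: a KL term between a diagonal Gaussian posterior and a zero-mean Gaussian prior with a separate variance per latent axis, plus a simple stand-in relevance score (the variance of the encoded means along each axis). The function names, the threshold, and the relevance heuristic are hypothetical and are not the authors' estimator.

```python
import numpy as np

def kl_to_axis_prior(mu, log_var, prior_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, diag(prior_var)) ), summed over axes.

    mu, log_var: arrays of shape (batch, latent_dim) from a hypothetical encoder.
    prior_var:   array of shape (latent_dim,), one prior variance per latent axis.
    """
    var = np.exp(log_var)
    per_axis = 0.5 * (np.log(prior_var) - log_var + (var + mu ** 2) / prior_var - 1.0)
    return per_axis.sum(axis=-1)

def axis_relevance(mu):
    """Stand-in relevance score: variance of the encoded means along each axis."""
    return mu.var(axis=0)

def select_relevant_axes(mu, threshold=0.01):
    """Indices of latent axes whose score exceeds an (assumed) threshold."""
    return np.flatnonzero(axis_relevance(mu) > threshold)

# Toy usage: two informative axes and two nearly constant ones.
rng = np.random.default_rng(0)
toy_mu = rng.normal(size=(500, 4)) * np.array([1.0, 1.0, 1e-3, 1e-3])
print(select_relevant_axes(toy_mu))
```

With a fixed standard-normal prior, `prior_var` would be all ones; letting it vary per axis is what allows some axes to be down-weighted. How ARD-VAE actually defines and estimates the hierarchical prior is specified in the full text, not in this record.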
Author affiliations:
  Saha, Surojit (surojit.saha@utah.edu), The University of Utah, USA
  Joshi, Sarang (sarang.joshi@utah.edu), The University of Utah, USA
  Whitaker, Ross (whitaker@cs.utah.edu), The University of Utah, USA
DOI 10.1109/WACV61041.2025.00096
EISBN 9798331510831
EISSN 2642-9381
Subjects: Autoencoders; automatic relevancy detection; Benchmark testing; Computer architecture; Computer vision; Data models; hierarchical prior; Linear programming; Measurement; Neural networks; Robustness; Stability analysis; variational autoencoders
URI https://ieeexplore.ieee.org/document/10943504