Deterministic Autoencoder using Wasserstein loss for tabular data generation
Saved in:
| Published in: | Neural networks, Vol. 185, Article 107208 |
|---|---|
| Main authors: | Wang, Alex X.; Nguyen, Binh P. |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: Elsevier Ltd, 01.05.2025 |
| Subjects: | Deep neural networks; Generative AI; Tabular data synthesis; Latent space interpolation; Wasserstein Autoencoder |
| ISSN: | 0893-6080, 1879-2782 |
| Online access: | Full text |
| Abstract | Tabular data generation is a complex task due to its distinctive characteristics and inherent complexities. While Variational Autoencoders have been adapted from the computer vision domain for tabular data synthesis, their reliance on non-deterministic latent space regularization introduces limitations. The stochastic nature of Variational Autoencoders can contribute to collapsed posteriors, yielding suboptimal outcomes and limiting control over the latent space. This characteristic also constrains the exploration of latent space interpolation. To address these challenges, we present the Tabular Wasserstein Autoencoder (TWAE), leveraging the deterministic encoding mechanism of Wasserstein Autoencoders. This characteristic facilitates a deterministic mapping of inputs to latent codes, enhancing the stability and expressiveness of our model’s latent space. This, in turn, enables seamless integration with shallow interpolation mechanisms like the synthetic minority over-sampling technique (SMOTE) within the data generation process via deep learning. Specifically, TWAE is trained once to establish a low-dimensional representation of real data, and various latent interpolation methods efficiently generate synthetic latent points, achieving a balance between accuracy and efficiency. Extensive experiments consistently demonstrate TWAE’s superiority, showcasing its versatility across diverse feature types and dataset sizes. This innovative approach, combining WAE principles with shallow interpolation, effectively leverages SMOTE’s advantages, establishing TWAE as a robust solution for complex tabular data synthesis. |
|---|---|
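The abstract's key departure from VAEs is that the WAE regularizer is computed on samples: encoded latent codes are pushed toward a prior with a sample-based divergence such as kernel maximum mean discrepancy, so the encoder itself stays deterministic. As a minimal illustration (our own numpy sketch, not the authors' TWAE implementation; the function name and `gamma` value are assumptions), the RBF-kernel MMD between a batch of latents and prior draws looks like this:

```python
# Hypothetical sketch: RBF-kernel MMD^2, the kind of sample-based divergence
# a WAE uses to match the encoded latent distribution to a prior.
import numpy as np

def rbf_mmd(x, y, gamma=1.0):
    """Biased MMD^2 estimate with kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    def kernel(a, b):
        # Pairwise squared Euclidean distances via broadcasting.
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

rng = np.random.default_rng(0)
prior = rng.standard_normal((200, 2))           # target prior N(0, I)
matched = rng.standard_normal((200, 2))         # latents already matching the prior
shifted = rng.standard_normal((200, 2)) + 3.0   # latents far from the prior

# The penalty is near zero for well-matched latents and grows with mismatch.
assert rbf_mmd(matched, prior) < rbf_mmd(shifted, prior)
```

In a full WAE this scalar would be added, with a trade-off weight, to the reconstruction loss and minimized by gradient descent over the encoder and decoder parameters.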
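The second ingredient the abstract names is shallow interpolation in the trained latent space: once the encoder is fixed, SMOTE-style oversampling generates new latent points on segments between a real latent code and one of its nearest neighbours, and the decoder maps them back to synthetic rows. A minimal sketch of that interpolation step (function name and parameters are ours, not the paper's API):

```python
# Hypothetical sketch: SMOTE-style interpolation applied to latent codes.
# Each synthetic point lies on the segment between a real latent code and
# one of its k nearest neighbours; decoding such points yields synthetic rows.
import numpy as np

def smote_latent(z, n_new, k=5, seed=0):
    rng = np.random.default_rng(seed)
    # Pairwise distances between latent codes; exclude self-matches.
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    # Indices of the k nearest neighbours of each point.
    nn = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(z), size=n_new)       # random anchor points
    nbr = nn[base, rng.integers(0, k, size=n_new)]   # one random neighbour each
    lam = rng.random((n_new, 1))                     # interpolation weight in [0, 1)
    return z[base] + lam * (z[nbr] - z[base])

z = np.random.default_rng(1).standard_normal((50, 4))  # stand-in latent codes
z_new = smote_latent(z, n_new=10)
assert z_new.shape == (10, 4)
```

Because the autoencoder is trained once and the interpolation is a cheap nearest-neighbour operation, this division of labour is what gives the claimed balance between accuracy and efficiency.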
| Author | Wang, Alex X.; Nguyen, Binh P. |
| Author details | Wang, Alex X. (ORCID: 0000-0002-3691-8652; alex.wang@vuw.ac.nz) and Nguyen, Binh P. (ORCID: 0000-0001-6203-6664; binh.p.nguyen@vuw.ac.nz), School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6012, New Zealand |
| Copyright | 2025 The Authors. Published by Elsevier Ltd. All rights reserved. |
| DOI | 10.1016/j.neunet.2025.107208 |
| Discipline | Computer Science |
| EISSN | 1879-2782 |
| ISICitedReferencesCount | 7 |
| ISSN | 0893-6080 1879-2782 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Deep neural networks; Generative AI; Tabular data synthesis; Latent space interpolation; Wasserstein Autoencoder |
| Language | English |
| License | This is an open access article under the CC BY license. Copyright © 2025 The Authors. Published by Elsevier Ltd. All rights reserved. |
| ORCID | 0000-0002-3691-8652 0000-0001-6203-6664 |
| OpenAccessLink | https://dx.doi.org/10.1016/j.neunet.2025.107208 |
| PMID | 39893805 |
| PublicationDate | May 2025 |
| PublicationPlace | United States |
| PublicationTitle | Neural networks |
| PublicationTitleAlternate | Neural Netw |
| PublicationYear | 2025 |
| Publisher | Elsevier Ltd |
| StartPage | 107208 |
| SubjectTerms | Algorithms; Autoencoder; Deep Learning; Deep neural networks; Generative AI; Humans; Latent space interpolation; Neural Networks, Computer; Tabular data synthesis; Wasserstein Autoencoder |
| Title | Deterministic Autoencoder using Wasserstein loss for tabular data generation |
| URI | https://dx.doi.org/10.1016/j.neunet.2025.107208 https://www.ncbi.nlm.nih.gov/pubmed/39893805 https://www.proquest.com/docview/3162846144 |
| Volume | 185 |