MVAESynth: a unified framework for multimodal data generation, modality restoration, and controlled generation

Bibliographic Details
Published in: Procedia Computer Science, Vol. 193, pp. 422-431
Main Authors: Lysenko, Anton; Deeva, Irina; Shikov, Egor
Format: Journal Article
Language: English
Published: Elsevier B.V., 2021
ISSN: 1877-0509
Description
Summary: Synthetic data generation is now used in a number of privacy-sensitive applications, such as training and testing systems that analyze the behavior of social network users or bank customers. Personal data is often complex and describes different aspects of a person, some of which may be missing for some records, which makes such data hard to work with. In this paper, we present MVAESynth, a novel framework for the data-driven generation of multimodal synthetic data. It contains our implementation of a multimodal variational autoencoder (MVAE), which can generate multimodal personal user profiles (for example, social media profile data and transactional data) and can be trained even when some modalities are missing. Extensive experimental studies of MVAESynth's performance demonstrate its effectiveness compared with available solutions on the following tasks: 1) training on data with missing modalities; 2) generating realistic social network profiles; 3) restoring missing profile modalities; 4) generating profiles with specified characteristics.
DOI: 10.1016/j.procs.2021.10.044