A feature mapping technique for complex data object generation with likelihood and deep generative approaches

Bibliographic details
Published in:	IEEE Access, Volume 11; p. 1
Main authors:	Muramudalige, Shashika R., Jayasumana, Anura P., Wang, Haonan
Format:	Journal Article
Language:	English
Publication details:	Piscataway IEEE 01.01.2023 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Adversarial autoencoder; Behavioral sciences; Copulas; Data generation; Data models; Datasets; Domains; Generative adversarial networks; Graphical representations; Machine learning; Mapping; Object generation; Object oriented modeling; Problem solving; Social networking (online); Sociology; Synthesis; Synthetic data; Synthetic data generation; Trees (mathematics)
ISSN:	2169-3536

Description
Summary:	When a sufficient amount of training data is available, Machine Learning (ML) models show great promise for solving problems involving complex and dynamic patterns. Social and behavioral domains are rich with such challenging problems, with complex object data extracted from documents, surveys, etc., and represented in forms such as graphs and trees. However, many social and behavioral data sets are inherently sparse and incomplete. The same data field may be unavailable in different records of a data set due to different causes, e.g., because it was not measured, not known, or simply not applicable to that particular record. Furthermore, collection challenges, cost, lack of participation, small affected populations, etc., result in very small sets of data. Resulting unconventional datasets cannot be directly used with potent approaches such as machine learning. A technique to model and synthesize large sets of such complex data objects while maintaining the same statistical and topological characteristics of original data helps overcome these challenges. To the best of our knowledge, our work is the first attempt to synthesize unconventional datasets. We propose a novel feature mapping technique to eliminate data inconsistencies and model data objects from unconventional datasets. The feature-mapped data objects are used to synthesize data using two likelihood approaches, i.e., multi-variate Gaussian and regular vine copulas, and one generative adversarial approach using an adversarial autoencoder (AAE). We demonstrate the robustness of the proposed technique with three real-world datasets representing disparate domains and validate the performance of likelihood and deep-generative approaches with these object synthesis strategies.
DOI:	10.1109/ACCESS.2023.3335375