Information Theoretic Learning-Enhanced Dual-Generative Adversarial Networks With Causal Representation for Robust OOD Generalization

Recently, machine/deep learning techniques are achieving remarkable success in a variety of intelligent control and management systems, promising to change the future of artificial intelligence (AI) scenarios. However, they still suffer from some intractable difficulty or limitations for model train...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transaction on neural networks and learning systems Jg. 36; H. 2; S. 2066 - 2079
Hauptverfasser: Zhou, Xiaokang, Zheng, Xuzhe, Shu, Tian, Liang, Wei, Wang, Kevin I-Kai, Qi, Lianyong, Shimizu, Shohei, Jin, Qun
Format: Journal Article
Sprache:Englisch
Veröffentlicht: United States IEEE 01.02.2025
Schlagworte:
ISSN:2162-237X, 2162-2388, 2162-2388
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Recently, machine/deep learning techniques are achieving remarkable success in a variety of intelligent control and management systems, promising to change the future of artificial intelligence (AI) scenarios. However, they still suffer from some intractable difficulty or limitations for model training, such as the out-of-distribution (OOD) issue, in modern smart manufacturing or intelligent transportation systems (ITSs). In this study, we newly design and introduce a deep generative model framework, which seamlessly incorporates the information theoretic learning (ITL) and causal representation learning (CRL) in a dual-generative adversarial network (Dual-GAN) architecture, aiming to enhance the robust OOD generalization in modern machine learning (ML) paradigms. In particular, an ITL- and CRL-enhanced Dual-GAN (ITCRL-DGAN) model is presented, which includes an autoencoder with CRL (AE-CRL) structure to aid the dual-adversarial training with causality-inspired feature representations and a Dual-GAN structure to improve the data augmentation in both feature and data levels. Following a newly designed feature separation strategy, a causal graph is built and improved based on the information theory, which can enhance the causally related factors among the separated core features and further enrich the feature representation with the counterfactual features via interventions based on the refined causal relationships. The ITL is incorporated to improve the extraction of low-dimensional feature representations and learn the optimized causal representations based on the idea of "information flow." A dual-adversarial training mechanism is then developed, which not only enables the generator to expand the boundary of feature distribution in accordance with the optimized feature representation from AE-CRL, but also allows the discriminator to further verify and improve the quality of the augmented data for OOD generalization. Experiment and evaluation results based on an open-source dataset demonstrate the outstanding learning efficiency and classification performance of our proposed model for robust OOD generalization in modern smart applications compared with three baseline methods.
Bibliographie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:2162-237X
2162-2388
2162-2388
DOI:10.1109/TNNLS.2023.3330864