Disentangled Representation and Contrastive Learning with Adaptive Affinity Squeeze-Excitation for Multimodal Emotion Recognition.

Saved in:
Bibliographic Details
Title: Disentangled Representation and Contrastive Learning with Adaptive Affinity Squeeze-Excitation for Multimodal Emotion Recognition.
Authors: Wang, Jhing-Fa¹,² (AUTHOR); Tsai, Hsin-Chun³ (AUTHOR), tsaihcmail@gmail.com; Liu, Jie-Ming¹ (AUTHOR)
Source: International Journal of Pattern Recognition & Artificial Intelligence. Nov 2025, p. 1. 20p. 2 Illustrations.
Subject terms: EMOTION recognition, FEATURE extraction
Abstract: Multimodal Emotion Recognition (MER) aims to leverage information from multiple modalities to enhance the accuracy of emotion classification. However, existing methods struggle with properly disentangling shared and modality-specific features while maintaining discriminative power across modalities. To address this, we propose a novel contrastive learning-based MER framework that enforces inter-modal alignment while preserving intra-modal distinctiveness. Our method employs disentangled representation learning to extract both shared and private representations, ensuring that complementary features are effectively utilized. Additionally, we introduce an Adaptive Affinity Squeeze-Excitation (AASE) mechanism to dynamically refine multimodal representations by capturing inter-channel relationships, further enhancing the model’s ability to identify crucial emotional cues. Experimental results demonstrate that our method improves emotion classification accuracy to 84.6% on benchmark datasets. Accordingly, the proposed MER framework provides a robust and highly adaptive solution for multimodal emotion analysis, effectively integrating modality-specific features while enhancing discriminability across modalities. [ABSTRACT FROM AUTHOR]
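The record contains only the abstract, but the mechanisms it names, squeeze-excitation-style channel refinement and contrastive inter-modal alignment, are standard building blocks. A minimal PyTorch-style sketch of those generic blocks follows; the class and function names, the two-modality setup, and all hyperparameters are assumptions for illustration, not the authors' AASE module or training objective.

```python
# Illustrative sketch only: generic squeeze-excitation channel gating and a
# symmetric InfoNCE-style contrastive alignment loss. This is NOT the paper's
# AASE mechanism (the record gives only the abstract); all names and
# hyperparameters below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SqueezeExcitationGate(nn.Module):
    """Channel reweighting in the spirit of squeeze-and-excitation networks."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels) fused multimodal feature vector.
        # The gate models inter-channel dependencies and rescales each channel.
        gate = self.fc(x)
        return x * gate


def symmetric_contrastive_loss(z_a: torch.Tensor, z_t: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE-style loss: pulls paired embeddings from two modalities together
    and pushes apart embeddings belonging to different samples."""
    z_a = F.normalize(z_a, dim=-1)
    z_t = F.normalize(z_t, dim=-1)
    logits = z_a @ z_t.t() / temperature              # (batch, batch) similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    # Symmetric: align modality A to B and B to A.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    batch, dim = 8, 128
    audio_shared = torch.randn(batch, dim)            # hypothetical shared features
    text_shared = torch.randn(batch, dim)
    fused = torch.cat([audio_shared, text_shared], dim=-1)

    gate = SqueezeExcitationGate(channels=2 * dim)
    refined = gate(fused)                             # channel-wise refined fusion
    loss = symmetric_contrastive_loss(audio_shared, text_shared)
    print(refined.shape, loss.item())
```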
Database: Business Source Index
ISSN: 0218-0014
DOI: 10.1142/s0218001425510231