Modality-uncertainty-aware knowledge distillation framework for multimodal sentiment analysis
| Published in: | Complex & Intelligent Systems, Volume 12, Issue 1, pp. 14-22 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | Cham: Springer International Publishing, 01.01.2026 |
| Topics: | |
| ISSN: | 2199-4536, 2198-6053 |
| Summary: | Multimodal sentiment analysis (MSA) has become increasingly important for understanding human emotions, with applications in areas such as human-computer interaction, social media analysis, and emotion recognition. MSA leverages multimodal data, including text, audio, and visual inputs, to achieve better performance in emotion recognition. However, existing methods face challenges, particularly when dealing with missing modalities. While some approaches attempt to handle modality dropout, they often fail to effectively recover missing information or account for the complex interactions between different modalities. Moreover, many models treat modalities equally, not fully utilizing the unique strengths of each modality. To address these limitations, we propose the Modality-Uncertainty-aware Knowledge Distillation Framework (MUKDF). Specifically, we introduce a modality random missing strategy that enhances the model's adaptability to uncertain modality scenarios. To further improve performance, we incorporate a Dual-Branch Modality Knowledge Extractor (DMKE) that balances feature contributions across modalities and a multimodal masked transformer (MMT) designed to capture nuanced interactions between modalities. Additionally, we present a contrastive feature-level and align-based representation distillation mechanism (CFD&ARD), which strengthens the alignment between teacher and student representations, ensuring effective knowledge transfer and improved robustness in learning. Comprehensive experiments conducted on two benchmark datasets demonstrate that MUKDF outperforms several baseline models, achieving superior performance not only under complete modality conditions but also in the more challenging scenario with incomplete modalities. This highlights the effectiveness of our framework in handling the uncertainty and complexities inherent in multimodal sentiment analysis. |
|---|---|
| DOI: | 10.1007/s40747-025-02135-w |
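
The abstract names two mechanisms concrete enough to sketch: training under randomly missing modalities and contrastive feature-level distillation between teacher and student representations. The PyTorch sketch below is a minimal illustration of those two ideas, not the paper's implementation; the function names, the `p_missing` rate, and the InfoNCE-style formulation of the contrastive term are all assumptions made here for illustration.

```python
# Minimal sketch (assumptions, not the paper's code): random modality
# dropout plus a contrastive teacher-student distillation loss.
import torch
import torch.nn.functional as F

def random_modality_missing(feats: dict, p_missing: float = 0.3) -> dict:
    """Zero out whole modalities at random during training.

    feats maps a modality name ("text", "audio", "visual") to a tensor
    of shape (batch, seq_len, dim). At least one modality is always
    kept so every sample remains informative.
    """
    drop = [m for m in feats if torch.rand(1).item() < p_missing]
    if len(drop) == len(feats):                      # never drop everything
        drop.pop(torch.randint(len(drop), (1,)).item())
    return {m: torch.zeros_like(x) if m in drop else x
            for m, x in feats.items()}

def contrastive_distillation_loss(student: torch.Tensor,
                                  teacher: torch.Tensor,
                                  temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE-style alignment: each student vector (batch, dim) should
    match its own teacher vector against the other samples in the batch.
    The teacher is detached so gradients flow only into the student.
    """
    s = F.normalize(student, dim=-1)
    t = F.normalize(teacher.detach(), dim=-1)
    logits = s @ t.T / temperature                   # (batch, batch) similarities
    targets = torch.arange(s.size(0), device=s.device)
    return F.cross_entropy(logits, targets)

# Illustrative use: the teacher encodes complete inputs, the student
# encodes the randomly degraded ones, and the loss aligns the two.
if __name__ == "__main__":
    feats = {m: torch.randn(8, 20, 64) for m in ("text", "audio", "visual")}
    degraded = random_modality_missing(feats)
    student_repr = torch.randn(8, 128)               # stand-in for student encoder output
    teacher_repr = torch.randn(8, 128)               # stand-in for teacher encoder output
    print(contrastive_distillation_loss(student_repr, teacher_repr).item())
```

In a MUKDF-style setup, the teacher would see complete modalities while the student receives the output of `random_modality_missing`, with the contrastive term pulling each degraded student representation toward its own teacher representation.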