Content-aware sentiment understanding: cross-modal analysis with encoder-decoder architectures

Detailed bibliography
Published in: Journal of computational social science, Volume 8, Issue 2
Main authors: Pakdaman, Zahra; Koochari, Abbas; Sharifi, Arash
Format: Journal Article
Language: English
Published: Singapore: Springer Nature Singapore, 01.05.2025
ISSN: 2432-2717, 2432-2725
Description
Summary: The analysis of sentiment from social media data has attracted significant attention due to the proliferation of user-generated opinions and comments on these platforms. Social media content is often multi-modal, frequently combining images and text within single posts. To effectively estimate user sentiment across multiple content types, this study proposes a multimodal content-aware approach. It distinguishes among text-dominant images, memes, and regular images, and extracts any embedded text from memes or text-dominant images. A Swin Transformer-GPT-2 (encoder-decoder) architecture then generates captions for image analysis. The user's sentiment is estimated by analyzing the embedded text, generated captions, and user-provided captions through a BiLSTM-LSTM (encoder-decoder) architecture and fully connected layers. The proposed method demonstrates superior performance, achieving 93% accuracy on the MVSA-Single dataset, 79% accuracy on the MVSA-Multiple dataset, and 90% accuracy on the TWITTER (Large) dataset, surpassing current state-of-the-art methods.
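
The caption-generation stage pairs a Swin Transformer image encoder with a GPT-2 text decoder. Below is a minimal sketch of such a pairing using Hugging Face's VisionEncoderDecoderModel; the checkpoint names, input file, and generation settings are illustrative assumptions, not the authors' released model, and the combined network would still need fine-tuning on caption data before its output is meaningful.

```python
# A minimal sketch, assuming the Hugging Face `transformers` library and
# publicly available Swin/GPT-2 checkpoints (the paper's own weights are
# not referenced here). The cross-attention layers of the combined model
# are randomly initialized, so it must be fine-tuned before use.
from PIL import Image
from transformers import AutoImageProcessor, AutoTokenizer, VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/swin-base-patch4-window7-224",  # vision encoder
    "gpt2",                                    # autoregressive text decoder
)
processor = AutoImageProcessor.from_pretrained("microsoft/swin-base-patch4-window7-224")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# GPT-2 defines no PAD token; reuse EOS so batching and generation work.
tokenizer.pad_token = tokenizer.eos_token
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.pad_token_id

image = Image.open("post_image.jpg").convert("RGB")  # hypothetical input file
pixel_values = processor(images=image, return_tensors="pt").pixel_values
caption_ids = model.generate(pixel_values, max_new_tokens=32)
caption = tokenizer.decode(caption_ids[0], skip_special_tokens=True)
```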
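
The sentiment-estimation stage feeds the embedded text, the generated caption, and the user-provided caption through a BiLSTM-LSTM encoder-decoder followed by fully connected layers. The PyTorch sketch below shows one plausible shape for that head; the vocabulary size, hidden widths, three-class output, and the simple concatenation of the three text streams are assumptions for illustration, not the paper's exact configuration.

```python
# A hedged sketch of a BiLSTM encoder -> LSTM decoder -> fully connected
# sentiment head. All hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class BiLSTMSentiment(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden=256, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Bidirectional encoder reads the token sequence in both directions.
        self.encoder = nn.LSTM(embed_dim, hidden, batch_first=True,
                               bidirectional=True)
        # Unidirectional decoder consumes the encoder's 2*hidden features.
        self.decoder = nn.LSTM(2 * hidden, hidden, batch_first=True)
        self.fc = nn.Sequential(nn.Linear(hidden, 64), nn.ReLU(),
                                nn.Linear(64, num_classes))

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -- embedded text, generated caption,
        # and user caption tokens concatenated into one sequence.
        x = self.embed(token_ids)
        enc_out, _ = self.encoder(x)        # (batch, seq_len, 2*hidden)
        dec_out, _ = self.decoder(enc_out)  # (batch, seq_len, hidden)
        return self.fc(dec_out[:, -1])      # last step -> sentiment logits

model = BiLSTMSentiment(vocab_size=20_000)
logits = model(torch.randint(1, 20_000, (4, 60)))  # 4 posts, 60 tokens each
```

Taking the last decoder step as the sequence summary is one common choice; pooling over all steps or encoding the three text streams separately before fusion would be equally consistent with the abstract's description.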
DOI: 10.1007/s42001-025-00374-y