Content-aware sentiment understanding: cross-modal analysis with encoder-decoder architectures
| Published in: | Journal of Computational Social Science, Volume 8, Issue 2 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Singapore: Springer Nature Singapore, 01.05.2025 |
| ISSN: | 2432-2717, 2432-2725 |
| Online Access: | Get full text |
| Summary: | The analysis of sentiment from social media data has attracted significant attention due to the proliferation of user-generated opinions and comments on these platforms. Social media content is often multimodal, frequently combining images and text within a single post. To effectively estimate user sentiment across multiple content types, this study proposes a multimodal content-aware approach. It distinguishes among text-dominant images, memes, and regular images, extracting any embedded text from memes and text-dominant images. Captions for image analysis are generated with a Swin Transformer-GPT-2 (encoder-decoder) architecture. The user's sentiment is then estimated by analyzing the embedded text, generated captions, and user-provided captions through a BiLSTM-LSTM (encoder-decoder) architecture with fully connected layers. The proposed method demonstrates superior performance, achieving 93% accuracy on the MVSA-Single dataset, 79% on the MVSA-Multiple dataset, and 90% on the TWITTER (Large) dataset, surpassing current state-of-the-art methods. |
| DOI: | 10.1007/s42001-025-00374-y |
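The summary describes two encoder-decoder stages. As a hedged illustration of the first (the captioning stage), the sketch below composes a Swin Transformer encoder with a GPT-2 decoder using Hugging Face's `VisionEncoderDecoderModel`. The checkpoint names, generation settings, and helper function are assumptions, not the paper's setup; the record gives no training details.

```python
# Minimal sketch of a Swin Transformer-GPT-2 captioning stage, assuming
# Hugging Face checkpoints. The cross-attention weights that bridge the two
# models are randomly initialized by from_encoder_decoder_pretrained, so the
# composed model must be fine-tuned on an image-caption corpus before its
# captions are meaningful.
from PIL import Image
from transformers import (
    AutoImageProcessor,
    AutoTokenizer,
    VisionEncoderDecoderModel,
)

ENCODER = "microsoft/swin-base-patch4-window7-224"  # assumed checkpoint
DECODER = "gpt2"                                    # assumed checkpoint

processor = AutoImageProcessor.from_pretrained(ENCODER)
tokenizer = AutoTokenizer.from_pretrained(DECODER)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token

# Compose the encoder-decoder: Swin encodes image patches, GPT-2 decodes a
# caption while cross-attending to the visual features.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(ENCODER, DECODER)
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.pad_token_id

def generate_caption(image_path: str) -> str:
    """Generate a caption for one image (hypothetical helper; run after fine-tuning)."""
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    output_ids = model.generate(pixel_values, max_new_tokens=32, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```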
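For the second stage, here is a hypothetical PyTorch sketch of a BiLSTM-LSTM encoder-decoder with a fully connected head over the fused text sources (embedded text, generated caption, user caption). All dimensions, the fusion-by-concatenation choice, and the three sentiment classes are assumptions; the summary does not specify them.

```python
# Hypothetical BiLSTM-LSTM sentiment stage: a bidirectional LSTM encodes the
# concatenated token sequence, a unidirectional LSTM decodes over the encoder
# states, and fully connected layers map the final decoder state to sentiment
# classes. Every hyperparameter below is an assumption.
import torch
import torch.nn as nn

class BiLSTMLSTMSentiment(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 128,
                 hidden_dim: int = 256, num_classes: int = 3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Encoder: bidirectional LSTM over the token sequence.
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Decoder: unidirectional LSTM over the encoder outputs.
        self.decoder = nn.LSTM(2 * hidden_dim, hidden_dim, batch_first=True)
        # Fully connected head to sentiment classes (e.g. neg/neu/pos).
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, num_classes),
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -- the three text sources concatenated.
        embedded = self.embedding(token_ids)
        encoded, _ = self.encoder(embedded)
        _, (h_n, _) = self.decoder(encoded)
        return self.classifier(h_n[-1])  # last decoder hidden state

# Example: a batch of 4 sequences of 64 token ids over a 20k-word vocabulary.
model = BiLSTMLSTMSentiment(vocab_size=20_000)
logits = model(torch.randint(1, 20_000, (4, 64)))
print(logits.shape)  # torch.Size([4, 3])
```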