Unbinding tensor product representations for image captioning with semantic alignment and complementation

Image captioning, which describes an image with natural language, is an important but challenging multi-modal task. Many state-of-the-art methods generally adopt the encoder–decoder framework to implement information conversion from image modality to text modality. However, most methods are limited...

Bibliographic Details
Published in:	Multimedia Systems, vol. 30, no. 3, p. 117
Main authors:	Wu, Bicheng; Wo, Yan
Format:	Journal Article
Language:	English
Published:	Berlin/Heidelberg: Springer Berlin Heidelberg, 1 June 2024; Springer Nature B.V.
Subjects:	Alignment; Coding; Cognition; Computer Communication Networks; Computer Graphics; Computer Science; Cryptology; Data Storage Representation; Decoding; Image acquisition; Multimedia Information Systems; Natural language; Natural language processing; Operating Systems; Optimization; Regular Paper; Representations; Semantics; Tensors; Words (language); Image captioning; Tensor product representations; Semantic content; Intermediate representations
ISSN:	0942-4962, 1432-1882