Unbinding tensor product representations for image captioning with semantic alignment and complementation

Image captioning, which describes an image in natural language, is an important but challenging multi-modal task. Many state-of-the-art methods adopt the encoder–decoder framework to convert information from the image modality to the text modality. However, most methods are limited...

Bibliographic Details
Published in:	Multimedia systems Vol. 30; no. 3; p. 117
Main Authors:	Wu, Bicheng; Wo, Yan
Format:	Journal Article
Language:	English
Published:	Berlin/Heidelberg: Springer Berlin Heidelberg, 01.06.2024; Springer Nature B.V.
Subjects:	Alignment; Coding; Cognition; Computer Communication Networks; Computer Graphics; Computer Science; Cryptology; Data Storage Representation; Decoding; Image acquisition; Multimedia Information Systems; Natural language; Natural language processing; Operating Systems; Optimization; Regular Paper; Representations; Semantics; Tensors; Words (language); Image captioning; Tensor product representations; Semantic content; Intermediate representations
ISSN:	0942-4962, 1432-1882