Unbinding tensor product representations for image captioning with semantic alignment and complementation

Image captioning, which describes an image in natural language, is an important but challenging multi-modal task. Many state-of-the-art methods adopt the encoder–decoder framework to convert information from the image modality to the text modality. However, most methods are limited...

Bibliographic Details
Published in: Multimedia Systems, Vol. 30, no. 3, p. 117
Main Authors: Wu, Bicheng; Wo, Yan
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 01.06.2024
Springer Nature B.V
ISSN: 0942-4962, 1432-1882