An encoder-decoder based framework for Hindi image caption generation

Bibliographic details
Published in: Multimedia Tools and Applications, Volume 80, Issue 28-29, pp. 35721-35740
Main authors: Singh, Alok; Singh, Thoudam Doren; Bandyopadhyay, Sivaji
Format: Journal Article
Language: English
Published: New York: Springer US, 1 November 2021
Springer Nature B.V
ISSN:1380-7501, 1573-7721
Description
Summary: In recent times, image caption generation has attracted the attention of many researchers. The present work attempts to address the problem of Hindi image caption generation using the Hindi Visual Genome dataset. Hindi is the official and most widely spoken language in India. In a linguistically diverse country like India, it is essential to provide a means of helping people understand visual entities in their native languages. In this paper, an encoder-decoder based architecture is proposed in which a Convolutional Neural Network (CNN) encodes the visual features of an image and a stacked Long Short-Term Memory (sLSTM), in combination with both uni-directional and bi-directional LSTM, generates the captions in Hindi. A pre-trained VGG19 model is used to encode the visual feature representation of an image, and the sLSTM architecture is employed for caption generation on the decoder side. The model is tested on the Hindi Visual Genome dataset to validate the performance of the proposed approach, and cross-verification is carried out for English captions with the Flickr dataset. The experimental results show that the model is qualitatively and quantitatively better than state-of-the-art approaches for Hindi caption generation.
DOI:10.1007/s11042-021-11106-5