Character-level Arabic text generation from sign language video using encoder–decoder model
Saved in:
| Published in: | Displays, Volume 76, p. 102340 |
|---|---|
| Main authors: | Boukdir, Abdelbasset; Benaddy, Mohamed; Meslouhi, Othmane El; Kardouchi, Mustapha; Akhloufi, Moulay |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., January 2023 |
| Subjects: | Arabic text; Deep learning; Gated Recurrent Unit; Pose estimation; Video caption |
| ISSN: | 0141-9382, 1872-7387 |
| DOI: | 10.1016/j.displa.2022.102340 |
| Online access: | Get full text |
Abstract

Video-to-text conversion is a vital task in computer vision. In recent years, deep learning algorithms have dominated automatic text generation in English, but few research works are available for other languages. In this paper, we propose a novel encoder–decoder system that generates character-level Arabic sentences from isolated RGB videos of Moroccan sign language. The video sequence is encoded by spatiotemporal feature extraction using pose estimation models, while the label text of the video is mapped to a sequence of representative vectors. The features and the label vector are joined and processed by a decoder layer to derive the final prediction. We trained the proposed system on the isolated Moroccan Sign Language dataset (MoSLD), composed of RGB videos of 125 MoSL signs. The experimental results reveal that the proposed model attains the best performance under several evaluation metrics.
Highlights

• A method for generating character-level Arabic text from a Moroccan sign language dataset is proposed.
• To the best of our knowledge, the proposed model is the first neural encoder–decoder model for Arabic video captioning.
• Different landmark estimation schemes are used at the Arabic character level to improve the accuracy and interpretability of the results.
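The abstract sketches a three-part pipeline: a pose-estimation-based encoder for the video frames, a character-level representation of the Arabic label text, and a decoder built on a Gated Recurrent Unit (per the record's keywords) that joins the two. The PyTorch sketch below illustrates that data flow under stated assumptions only; the class and constant names (`PoseCaptioner`, `ARABIC_CHARS`), the layer sizes, the keypoint count, and the special-token layout are illustrative, not the authors' implementation.

```python
# Minimal sketch of the encoder–decoder idea described in the abstract,
# NOT the authors' implementation: one GRU encodes per-frame pose keypoints,
# a second GRU decodes an Arabic sentence one character at a time.
import torch
import torch.nn as nn

# Assumed 28-letter Arabic alphabet plus space; the paper's exact character
# inventory is not given in the record.
ARABIC_CHARS = list("ابتثجحخدذرزسشصضطظعغفقكلمنهوي ")
PAD, SOS, EOS = 0, 1, 2          # special token ids (assumed layout)
VOCAB = len(ARABIC_CHARS) + 3

class PoseCaptioner(nn.Module):
    """Hypothetical encoder–decoder: pose keypoints in, Arabic characters out."""
    def __init__(self, n_keypoints=75, hidden=256):
        super().__init__()
        # Encoder GRU summarizes per-frame (x, y) landmarks into a context vector.
        self.encoder = nn.GRU(n_keypoints * 2, hidden, batch_first=True)
        # Decoder GRU predicts the next character from the previous one.
        self.embed = nn.Embedding(VOCAB, hidden, padding_idx=PAD)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, VOCAB)

    def forward(self, poses, prev_chars):
        # poses: (batch, frames, n_keypoints * 2)
        # prev_chars: (batch, seq) character ids, teacher-forced at train time
        _, context = self.encoder(poses)         # (1, batch, hidden) video summary
        emb = self.embed(prev_chars)             # character-level label vectors
        dec_out, _ = self.decoder(emb, context)  # decoding conditioned on the video
        return self.out(dec_out)                 # (batch, seq, VOCAB) logits

# One teacher-forced training step on dummy data:
model = PoseCaptioner()
poses = torch.randn(4, 60, 150)                 # 4 clips of 60 frames each
chars = torch.randint(3, VOCAB, (4, 20))        # 4 target sentences, 20 chars each
logits = model(poses, chars)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB), chars.reshape(-1), ignore_index=PAD)
loss.backward()
```

A real system would replace the random tensors with landmark sequences produced by a pose estimator and, at inference time, decode character by character starting from the SOS token; the sketch only fixes the encoder–decoder wiring described in the abstract.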
Author details:
– Boukdir, Abdelbasset (ORCID: 0000-0002-5032-7445; abdelbasset.boukdir@edu.uiz.ac.ma), LabSI Laboratory, FSA/PFO, Ibn Zohr University, Ouarzazate, Morocco
– Benaddy, Mohamed, LabSI Laboratory, FSA/PFO, Ibn Zohr University, Ouarzazate, Morocco
– Meslouhi, Othmane El, SARS Group, National School of Applied Sciences - Safi, Cadi Ayyad University, Morocco
– Kardouchi, Mustapha, PRIME Group, Department of Computer Sciences, Université de Moncton, Moncton, Canada
– Akhloufi, Moulay, PRIME Group, Department of Computer Sciences, Université de Moncton, Moncton, Canada
Cited by (Crossref DOIs): 10.3389/frai.2025.1630743; 10.1016/j.displa.2023.102489; 10.1109/ACCESS.2024.3485131; 10.1007/s11227-024-05898-0
Copyright: 2022 Elsevier B.V.