STAT: Spatial-Temporal Attention Mechanism for Video Captioning
Saved in:
| Published in: | IEEE Transactions on Multimedia, Volume 22, Issue 1, pp. 229-241 |
|---|---|
| Main authors: | Yan, Chenggang; Tu, Yunbin; Wang, Xingzheng; Zhang, Yongbing; Hao, Xinhong; Zhang, Yongdong; Dai, Qionghai |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.01.2020 |
| Subjects: | Cider; Coders; Convolutional neural networks; Decoding; Encoder-decoder neural networks; Encoders-Decoders; Feature extraction; Fuses; Neural networks; Performance evaluation; Semantics; Sentences; Spatial-temporal attention mechanism; Video captioning; Video data; Visualization |
| ISSN: | 1520-9210 (print); 1941-0077 (electronic) |
| DOI: | 10.1109/TMM.2019.2924576 |
| Online access: | Get full text: https://ieeexplore.ieee.org/document/8744407 (IEEE Xplore); https://www.proquest.com/docview/2333726408 (ProQuest) |
| Abstract | Video captioning refers to automatically generating natural language sentences that summarize the content of a video. Inspired by the visual attention mechanism of human beings, temporal attention has been widely used in video description to selectively focus on important frames. However, most existing methods based on temporal attention suffer from recognition errors and missing details, because temporal attention alone cannot further capture significant regions within frames. To address these problems, we propose a novel spatial-temporal attention mechanism (STAT) within an encoder-decoder neural network for video captioning. The proposed STAT takes into account both the spatial and temporal structure of a video, enabling the decoder to automatically select significant regions in the most relevant temporal segments for word prediction. We evaluate STAT on two well-known benchmarks: MSVD and MSR-VTT-10K. Experimental results show that the proposed STAT achieves state-of-the-art performance on several popular evaluation metrics: BLEU-4, METEOR, and CIDEr. |
|---|---|
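The abstract describes a two-stage attention design: spatial attention first weights regions within each frame, and temporal attention then weights frames, both conditioned on the decoder state. Below is a minimal PyTorch sketch of that general idea; all names and tensor shapes are hypothetical illustrations, not the authors' actual STAT implementation.

```python
# Minimal sketch of spatial-temporal attention for caption decoding.
# Hypothetical shapes and names; illustrates the idea from the abstract,
# not the paper's exact STAT architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTemporalAttention(nn.Module):
    def __init__(self, feat_dim: int, hidden_dim: int, attn_dim: int):
        super().__init__()
        # Spatial attention: score each region within a frame.
        self.spatial = nn.Linear(feat_dim + hidden_dim, attn_dim)
        self.spatial_v = nn.Linear(attn_dim, 1, bias=False)
        # Temporal attention: score each (spatially pooled) frame.
        self.temporal = nn.Linear(feat_dim + hidden_dim, attn_dim)
        self.temporal_v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, feats: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # feats: (T, R, D) region features for T frames with R regions each
        # h:     (H,)     current decoder hidden state
        T, R, _ = feats.shape
        h_sp = h.expand(T, R, -1)  # broadcast decoder state to every region
        s_scores = self.spatial_v(torch.tanh(
            self.spatial(torch.cat([feats, h_sp], dim=-1)))).squeeze(-1)   # (T, R)
        s_weights = F.softmax(s_scores, dim=-1)  # attend over regions per frame
        frame_feats = (s_weights.unsqueeze(-1) * feats).sum(dim=1)         # (T, D)

        h_tp = h.expand(T, -1)     # broadcast decoder state to every frame
        t_scores = self.temporal_v(torch.tanh(
            self.temporal(torch.cat([frame_feats, h_tp], dim=-1)))).squeeze(-1)  # (T,)
        t_weights = F.softmax(t_scores, dim=-1)  # attend over frames
        context = (t_weights.unsqueeze(-1) * frame_feats).sum(dim=0)       # (D,)
        return context  # fed to the decoder to predict the next word
```

Because the decoder hidden state re-scores both regions and frames at every word step, the context vector can emphasize different spatial details as different words are generated, which is what a purely temporal attention cannot do.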
| Author details: |
– Yan, Chenggang (ORCID 0000-0003-1204-0512; cgyan@hdu.edu.cn): School of Information Science and Technology, University of Science and Technology of China, Hefei, China
– Tu, Yunbin (ORCID 0000-0002-9525-9060; tuyunbin1995@foxmail.com): Institute of Information and Control, Hangzhou Dianzi University, Hangzhou, China
– Wang, Xingzheng (ORCID 0000-0003-4080-6888; xingzheng.wang@szu.edu.cn): College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen, China
– Zhang, Yongbing (ORCID 0000-0003-3320-2904; zhang.yongbing@sz.tsinghua.edu.cn): Graduate School at Shenzhen, Tsinghua University, Shenzhen, China
– Hao, Xinhong (ORCID 0000-0002-6448-4839; haoxinhong@bit.edu.cn): Science and Technology on Mechatronic Dynamic Control Laboratory, Beijing Institute of Technology, Beijing, China
– Zhang, Yongdong (ORCID 0000-0002-1151-1792; zhyd@ict.ac.cn): School of Information Science and Technology, University of Science and Technology of China, Hefei, China
– Dai, Qionghai (daiqionghai@tsinghua.edu.cn): Department of Automation, Tsinghua University, Beijing, China |
| CODEN: | ITMUF8 |
| Copyright: | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
| Discipline: | Engineering; Computer Science |
| Funding: |
– National Basic Research Program of China (973 Program); National Key Research and Development Program of China (grants 2017YFC0820600, 2017YFC0820605, 2017YFC0820604)
– Shenzhen Fundamental Research fund (grants JCYJ20180306174120445, JCYJ20160331185006518)
– National Natural Science Foundation of China (grants 61671196, 61525206, 61701149)
– Zhejiang Province Nature Science Foundation of China (grant LR17F030006)
– Shenzhen University (grant 2019041) |
| Cited by (Web of Science): | 326 |
| Peer reviewed: | Yes |
| License: | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html |