STAT: Spatial-Temporal Attention Mechanism for Video Captioning

Published in: IEEE Transactions on Multimedia, Volume 22, Issue 1, pp. 229-241
Main authors: Yan, Chenggang; Tu, Yunbin; Wang, Xingzheng; Zhang, Yongbing; Hao, Xinhong; Zhang, Yongdong; Dai, Qionghai
Medium: Journal Article
Language: English
Publication details: Piscataway: IEEE, 01.01.2020
Publisher: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1520-9210 (print); EISSN: 1941-0077
Abstract Video captioning refers to automatically generating natural language sentences that summarize the content of a video. Inspired by the visual attention mechanism of human beings, temporal attention has been widely used in video description to selectively focus on important frames. However, most existing methods based on temporal attention suffer from recognition errors and missing details, because temporal attention alone cannot capture the significant regions within frames. To address these problems, we propose a novel spatial-temporal attention mechanism (STAT) within an encoder-decoder neural network for video captioning. The proposed STAT takes into account both the spatial and temporal structure of a video, so it enables the decoder to automatically select the significant regions in the most relevant temporal segments for word prediction. We evaluate STAT on two well-known benchmarks: MSVD and MSR-VTT-10K. Experimental results show that the proposed STAT achieves state-of-the-art performance on several popular evaluation metrics: BLEU-4, METEOR, and CIDEr.
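
The abstract describes the architecture only at a high level: spatial attention first selects salient regions within each frame, and temporal attention then weights the resulting frame representations before the decoder predicts the next word. As a rough illustration of that spatial-then-temporal attention idea, here is a minimal PyTorch-style sketch; all tensor shapes, layer sizes, and names (SpatialTemporalAttention, feat_dim, attn_dim, and so on) are assumptions made for illustration and do not reproduce the authors' implementation.

# Illustrative sketch only: spatial-then-temporal soft attention conditioned on the
# decoder state. Shapes and dimensions are assumed, not taken from the paper.
import torch
import torch.nn as nn

class SpatialTemporalAttention(nn.Module):
    """Attend over regions within each frame, then over frames, given the decoder state."""

    def __init__(self, feat_dim: int, hidden_dim: int, attn_dim: int = 256):
        super().__init__()
        # Spatial attention: scores each region of each frame against the decoder state.
        self.spatial_feat = nn.Linear(feat_dim, attn_dim)
        self.spatial_hid = nn.Linear(hidden_dim, attn_dim)
        self.spatial_score = nn.Linear(attn_dim, 1)
        # Temporal attention: scores each spatially attended frame vector.
        self.temporal_feat = nn.Linear(feat_dim, attn_dim)
        self.temporal_hid = nn.Linear(hidden_dim, attn_dim)
        self.temporal_score = nn.Linear(attn_dim, 1)

    def forward(self, regions: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # regions: (B, T, R, D) region features; h: (B, H) decoder hidden state.
        # Spatial attention within each frame.
        e_s = self.spatial_score(torch.tanh(
            self.spatial_feat(regions) + self.spatial_hid(h)[:, None, None, :]
        )).squeeze(-1)                                       # (B, T, R)
        alpha_s = torch.softmax(e_s, dim=-1)                 # region weights per frame
        frame_vecs = (alpha_s.unsqueeze(-1) * regions).sum(dim=2)   # (B, T, D)
        # Temporal attention over the attended frame vectors.
        e_t = self.temporal_score(torch.tanh(
            self.temporal_feat(frame_vecs) + self.temporal_hid(h)[:, None, :]
        )).squeeze(-1)                                       # (B, T)
        alpha_t = torch.softmax(e_t, dim=-1)                 # frame weights
        context = (alpha_t.unsqueeze(-1) * frame_vecs).sum(dim=1)   # (B, D)
        return context                                       # used for word prediction

if __name__ == "__main__":
    attn = SpatialTemporalAttention(feat_dim=2048, hidden_dim=512)
    regions = torch.randn(2, 28, 36, 2048)   # 2 videos, 28 frames, 36 regions each (assumed)
    h = torch.randn(2, 512)
    print(attn(regions, h).shape)            # torch.Size([2, 2048])

In an encoder-decoder captioner of this kind, the returned context vector would typically be combined with the previous word embedding as input to the decoder RNN at each time step.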
Author details:
1. Yan, Chenggang; ORCID 0000-0003-1204-0512; cgyan@hdu.edu.cn; School of Information Science and Technology, University of Science and Technology of China, Hefei, China
2. Tu, Yunbin; ORCID 0000-0002-9525-9060; tuyunbin1995@foxmail.com; Institute of Information and Control, Hangzhou Dianzi University, Hangzhou, China
3. Wang, Xingzheng; ORCID 0000-0003-4080-6888; xingzheng.wang@szu.edu.cn; College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen, China
4. Zhang, Yongbing; ORCID 0000-0003-3320-2904; zhang.yongbing@sz.tsinghua.edu.cn; Graduate School at Shenzhen, Tsinghua University, Shenzhen, China
5. Hao, Xinhong; ORCID 0000-0002-6448-4839; haoxinhong@bit.edu.cn; Science and Technology on Mechatronic Dynamic Control Laboratory, Beijing Institute of Technology, Beijing, China
6. Zhang, Yongdong; ORCID 0000-0002-1151-1792; zhyd@ict.ac.cn; School of Information Science and Technology, University of Science and Technology of China, Hefei, China
7. Dai, Qionghai; daiqionghai@tsinghua.edu.cn; Department of Automation, Tsinghua University, Beijing, China
CODEN ITMUF8
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DOI 10.1109/TMM.2019.2924576
Discipline Engineering
Computer Science
Genre orig-research
GrantInformation National Basic Research Program of China (973 Program) and National Key Research and Development Program of China: grants 2017YFC0820600, 2017YFC0820605, 2017YFC0820604
Shenzhen Fundamental Research Fund: grants JCYJ20180306174120445, JCYJ20160331185006518
National Natural Science Foundation of China: grants 61671196, 61525206, 61701149
Zhejiang Province Nature Science Foundation of China: grant LR17F030006
Shenzhen University: grant 2019041
ISICitedReferencesCount 326
IsPeerReviewed true
IsScholarly true
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
PageCount 13
SubjectTerms CIDEr
Coders
Convolutional neural networks
Decoding
encoder-decoder neural networks
Encoders-Decoders
Feature extraction
Fuses
Neural networks
Performance evaluation
Semantics
Sentences
spatial-temporal attention mechanism
Video captioning
Video data
Visualization
URI https://ieeexplore.ieee.org/document/8744407
https://www.proquest.com/docview/2333726408