EAPT: Efficient Attention Pyramid Transformer for Image Processing

Detailed bibliography
Published in: IEEE Transactions on Multimedia, Vol. 25, pp. 50-61
Main authors: Lin, Xiao; Sun, Shuzhou; Huang, Wei; Sheng, Bin; Li, Ping; Feng, David Dagan
Format: Journal Article
Language: English
Published: Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2023
ISSN: 1520-9210, 1941-0077
Online access: Get full text
Abstract Recent transformer-based models, especially patch-based methods, have shown great potential in vision tasks. However, splitting the input features into fixed-size patches forces every patch to the same size, ignoring the fact that visual elements vary in scale and thus risking the loss of semantic information. Moreover, the vanilla patch-based transformer cannot guarantee information exchange between patches, which prevents the extraction of attention information with a global view. To circumvent these problems, we propose an Efficient Attention Pyramid Transformer (EAPT). Specifically, we first propose Deformable Attention, which learns an offset for each position within a patch, so that even with fixed-size patches our method can still obtain non-fixed attention that covers diverse visual elements. Then, we design the Encode-Decode Communication module (En-DeC module), which exchanges information among all patches to obtain more complete global attention. Finally, we propose a position encoding designed specifically for vision transformers, which can be applied to patches of any dimension and any length. Extensive experiments on image classification, object detection, and semantic segmentation demonstrate the effectiveness of the proposed model, and rigorous ablation studies evaluate its key components.
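The record describes the mechanisms only at a high level. As an illustration of the offset-learning idea behind Deformable Attention, the following minimal PyTorch sketch predicts a 2-D offset for every spatial position and resamples the feature map at the shifted locations; the module name, the 1x1-conv offset head, and the grid_sample-based resampling are assumptions made for illustration, not the paper's actual design.

```python
# Hypothetical sketch (not the authors' released code): one way to realize the
# "learn an offset for each position in a patch" idea described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableSampling(nn.Module):
    """Predict a 2-D offset per spatial position and resample the feature map
    there, so attention computed inside fixed-size patches can still gather
    content from outside the rigid patch boundary."""
    def __init__(self, dim: int):
        super().__init__()
        self.offset = nn.Conv2d(dim, 2, kernel_size=1)   # (dx, dy) per position
        nn.init.zeros_(self.offset.weight)               # start as identity sampling
        nn.init.zeros_(self.offset.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        B, _, H, W = x.shape
        offsets = self.offset(x)                          # (B, 2, H, W), in pixels
        # Base grid of normalized [-1, 1] coordinates expected by grid_sample.
        ys, xs = torch.meshgrid(
            torch.linspace(-1.0, 1.0, H, device=x.device),
            torch.linspace(-1.0, 1.0, W, device=x.device),
            indexing="ij",
        )
        base = torch.stack((xs, ys), dim=-1).expand(B, H, W, 2)
        # Convert pixel offsets to normalized units and shift the grid.
        scale = torch.tensor([max(W - 1, 1) / 2.0, max(H - 1, 1) / 2.0],
                             device=x.device)
        grid = base + offsets.permute(0, 2, 3, 1) / scale
        # Bilinear resampling at the learned (deformed) locations; a standard
        # patch-wise attention would then run on this resampled feature map.
        return F.grid_sample(x, grid, align_corners=True)

if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)
    out = DeformableSampling(64)(feats)
    print(out.shape)                                      # torch.Size([2, 64, 32, 32])
```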
Author Sun, Shuzhou
Feng, David Dagan
Sheng, Bin
Li, Ping
Huang, Wei
Lin, Xiao
Author_xml – sequence: 1
  givenname: Xiao
  orcidid: 0000-0002-8805-7129
  surname: Lin
  fullname: Lin, Xiao
  email: lin6008@shnu.edu.cn
  organization: Department of Computer Science, Shanghai Normal University, Shanghai, China
– sequence: 2
  givenname: Shuzhou
  surname: Sun
  fullname: Sun, Shuzhou
  email: 1000479143@smail.shnu.edu.cn
  organization: Department of Computer Science, Shanghai Normal University, Shanghai, China
– sequence: 3
  givenname: Wei
  surname: Huang
  fullname: Huang, Wei
  email: 191380039@usst.edu.cn
  organization: Department of Computer Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
– sequence: 4
  givenname: Bin
  orcidid: 0000-0001-8510-2556
  surname: Sheng
  fullname: Sheng, Bin
  email: shengbin@cs.sjtu.edu.cn
  organization: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
– sequence: 5
  givenname: Ping
  orcidid: 0000-0002-1503-0240
  surname: Li
  fullname: Li, Ping
  email: p.li@polyu.edu.hk
  organization: Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong
– sequence: 6
  givenname: David Dagan
  orcidid: 0000-0002-3381-214X
  surname: Feng
  fullname: Feng, David Dagan
  email: dagan.feng@sydney.edu.au
  organization: Biomedical and Multimedia Information Technology Research Group, School of Information Technologies, The University of Sydney, Sydney, NSW, Australia
CODEN ITMUF8
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
DOI 10.1109/TMM.2021.3120873
Discipline Engineering
Computer Science
EISSN 1941-0077
EndPage 61
ExternalDocumentID 10_1109_TMM_2021_3120873
9580642
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61872241; 62077037
  funderid: 10.13039/501100001809
– fundername: Science and Technology Commission of Shanghai Municipality
  grantid: 18410750700; 17411952600
  funderid: 10.13039/501100003399
– fundername: Shanghai Municipal Science and Technology Major Project
  grantid: 2021SHZDZX0102
– fundername: Hong Kong Polytechnic University
  grantid: P0030419; P0030929; P0035358
  funderid: 10.13039/501100004377
– fundername: Shanghai Lin-Gang Area Smart Manufacturing Special Project
  grantid: ZN2018020202-3
– fundername: Project of Shanghai Municipal Health Commission
  grantid: 2018ZHYL0230
ISICitedReferencesCount 382
ISSN 1520-9210
IsPeerReviewed true
IsScholarly true
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0001-8510-2556
0000-0002-1503-0240
0000-0002-3381-214X
0000-0002-8805-7129
PageCount 12
PublicationDate 2023
PublicationPlace Piscataway
PublicationTitle IEEE transactions on multimedia
PublicationTitleAbbrev TMM
PublicationYear 2023
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 50
SubjectTerms Ablation
attention mechanism
classification
Communication
Convolutional neural networks
Costs
Encoding
Feature extraction
Formability
Image classification
Image processing
Image segmentation
Modules
object detection
Object recognition
Patches (structures)
pyramid
Semantic segmentation
Semantics
Task analysis
Transformer
Transformers
Title EAPT: Efficient Attention Pyramid Transformer for Image Processing
URI https://ieeexplore.ieee.org/document/9580642
https://www.proquest.com/docview/2765181115
Volume 25