EAPT: Efficient Attention Pyramid Transformer for Image Processing
| Published in: | IEEE Transactions on Multimedia, vol. 25, pp. 50-61 |
|---|---|
| Main authors: | Lin, Xiao; Sun, Shuzhou; Huang, Wei; Sheng, Bin; Li, Ping; Feng, David Dagan |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: IEEE, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2023 |
| ISSN: | 1520-9210, 1941-0077 |
| Online access: | Full text |
| Abstract | Recent transformer-based models, especially patch-based methods, have shown great potential in vision tasks. However, splitting the input features into fixed-size patches ignores the fact that visual elements are often varied, and may therefore destroy semantic information. Moreover, the vanilla patch-based transformer cannot guarantee information exchange between patches, which prevents the extraction of attention information with a global view. To circumvent these problems, we propose an Efficient Attention Pyramid Transformer (EAPT). Specifically, we first propose Deformable Attention, which learns an offset for each position in a patch. Thus, even with fixed-size patches, our method can still obtain non-fixed attention information that covers various visual elements. Then, we design the Encode-Decode Communication module (En-DeC module), which gathers communication information among all patches to obtain more complete global attention information. Finally, we propose a position encoding designed specifically for vision transformers, which can be used for patches of any dimension and any length. Extensive experiments on image classification, object detection, and semantic segmentation demonstrate the effectiveness of the proposed model. Furthermore, we conduct rigorous ablation studies to evaluate the key components of the proposed structure. |
|---|---|
| Author | Sun, Shuzhou; Feng, David Dagan; Sheng, Bin; Li, Ping; Huang, Wei; Lin, Xiao |
| Author_xml | 1. Lin, Xiao (ORCID 0000-0002-8805-7129; lin6008@shnu.edu.cn), Department of Computer Science, Shanghai Normal University, Shanghai, China; 2. Sun, Shuzhou (1000479143@smail.shnu.edu.cn), Department of Computer Science, Shanghai Normal University, Shanghai, China; 3. Huang, Wei (191380039@usst.edu.cn), Department of Computer Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China; 4. Sheng, Bin (ORCID 0000-0001-8510-2556; shengbin@cs.sjtu.edu.cn), Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China; 5. Li, Ping (ORCID 0000-0002-1503-0240; p.li@polyu.edu.hk), Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong; 6. Feng, David Dagan (ORCID 0000-0002-3381-214X; dagan.feng@sydney.edu.au), Biomedical and Multimedia Information Technology Research Group, School of Information Technologies, The University of Sydney, Sydney, NSW, Australia |
| CODEN | ITMUF8 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 |
| DOI | 10.1109/TMM.2021.3120873 |
| Discipline | Engineering; Computer Science |
| EISSN | 1941-0077 |
| EndPage | 61 |
| ExternalDocumentID | 10_1109_TMM_2021_3120873 9580642 |
| Genre | orig-research |
| GrantInformation_xml | National Natural Science Foundation of China (grants 61872241, 62077037; funder ID 10.13039/501100001809); Science and Technology Commission of Shanghai Municipality (grants 18410750700, 17411952600; funder ID 10.13039/501100003399); Shanghai Municipal Science and Technology Major Project (grant 2021SHZDZX0102); Hong Kong Polytechnic University (grants P0030419, P0030929, P0035358; funder ID 10.13039/501100004377); Shanghai Lin-Gang Area Smart Manufacturing Special Project (grant ZN2018020202-3); Project of Shanghai Municipal Health Commission (grant 2018ZHYL0230) |
| ISICitedReferencesCount | 382 |
| ISSN | 1520-9210 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0001-8510-2556 0000-0002-1503-0240 0000-0002-3381-214X 0000-0002-8805-7129 |
| PQID | 2765181115 |
| PQPubID | 75737 |
| PageCount | 12 |
| PublicationDate | 2023 |
| PublicationPlace | Piscataway |
| PublicationTitle | IEEE Transactions on Multimedia |
| PublicationTitleAbbrev | TMM |
| PublicationYear | 2023 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 50 |
| SubjectTerms | Ablation; attention mechanism; classification; Communication; Convolutional neural networks; Costs; Encoding; Feature extraction; Formability; Image classification; Image processing; Image segmentation; Modules; object detection; Object recognition; Patches (structures); pyramid; Semantic segmentation; Semantics; Task analysis; Transformer; Transformers |
| Title | EAPT: Efficient Attention Pyramid Transformer for Image Processing |
| URI | https://ieeexplore.ieee.org/document/9580642 https://www.proquest.com/docview/2765181115 |
| Volume | 25 |
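
The abstract describes Deformable Attention as learning an offset for each position in a patch, so that fixed-size patches can still cover non-fixed regions. The following is only a minimal illustrative sketch of that sampling idea, not the authors' implementation: the function names `bilinear_sample` and `deform_patch` are hypothetical, the offsets are passed in by hand rather than learned, and bilinear interpolation with border clamping is an assumption that this record does not confirm.

```python
from typing import List, Tuple

Grid = List[List[float]]
Offsets = List[List[Tuple[float, float]]]


def bilinear_sample(feat: Grid, y: float, x: float) -> float:
    """Read feat at a fractional (y, x), clamping coordinates to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def deform_patch(feat: Grid, top: int, left: int, size: int,
                 offsets: Offsets) -> Grid:
    """Resample one size x size patch whose corner is (top, left).

    Position (i, j) of the patch reads from (top + i + dy, left + j + dx),
    where (dy, dx) = offsets[i][j]. In a trained model the offsets would be
    predicted by the network; here they are plain inputs (an assumption).
    """
    return [
        [
            bilinear_sample(feat,
                            top + i + offsets[i][j][0],
                            left + j + offsets[i][j][1])
            for j in range(size)
        ]
        for i in range(size)
    ]


# A 4x4 feature map whose value at (row, col) is 4 * row + col.
FEAT: Grid = [[float(4 * r + c) for c in range(4)] for r in range(4)]

# Zero offsets reproduce the ordinary fixed patch.
ZERO: Offsets = [[(0.0, 0.0)] * 2 for _ in range(2)]
# Half-pixel offsets shift every sampling position down and to the right.
HALF: Offsets = [[(0.5, 0.5)] * 2 for _ in range(2)]

if __name__ == "__main__":
    print(deform_patch(FEAT, 0, 0, 2, ZERO))
    print(deform_patch(FEAT, 0, 0, 2, HALF))
```

With zero offsets the sketch reduces to ordinary fixed-patch extraction, which is why offsets can be seen as a relaxation of the fixed grid; attention would then be computed over the resampled values.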