PVT v2: Improved baselines with Pyramid Vision Transformer

Transformers have recently led to encouraging progress in computer vision. In this work, we present new baselines by improving the original Pyramid Vision Transformer (PVT v1) by adding three designs: (i) a linear complexity attention layer, (ii) an overlapping patch embedding, and (iii) a convolut...

Bibliographic Details
Published in: Computational visual media (Beijing) Vol. 8; no. 3; pp. 415-424
Main Authors: Wang, Wenhai, Xie, Enze, Li, Xiang, Fan, Deng-Ping, Song, Kaitao, Liang, Ding, Lu, Tong, Luo, Ping, Shao, Ling
Format: Journal Article
Language: English
Published: Beijing Tsinghua University Press 01.09.2022
Springer Nature B.V
Subjects:
ISSN: 2096-0433, 2096-0662