Search Results - Deep learning architectures and techniques; Recognition: detection

  1.

    Equalized Focal Loss for Dense Long-Tailed Object Detection by Li, Bo, Yao, Yongqiang, Tan, Jingru, Zhang, Gang, Yu, Fengwei, Lu, Jianwei, Luo, Ye

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…Despite the recent success of long-tailed object detection, almost all long-tailed object detectors are developed based on the two-stage paradigm…”
    Conference Proceeding
  2.

    Sylph: A Hypernetwork Framework for Incremental Few-shot Object Detection by Yin, Li, Perez-Rua, Juan M, Liang, Kevin J

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…We study the challenging incremental few-shot object detection (iFSD) setting. Recently, hypernetwork-based approaches have been studied in the context of…”
    Conference Proceeding
  3.

    A ConvNet for the 2020s by Liu, Zhuang, Mao, Hanzi, Wu, Chao-Yuan, Feichtenhofer, Christoph, Darrell, Trevor, Xie, Saining

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs…”
    Conference Proceeding
  4.

    Grounded Language-Image Pre-training by Li, Liunian Harold, Zhang, Pengchuan, Zhang, Haotian, Yang, Jianwei, Li, Chunyuan, Zhong, Yiwu, Wang, Lijuan, Yuan, Lu, Zhang, Lei, Hwang, Jenq-Neng, Chang, Kai-Wei, Gao, Jianfeng

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…This paper presents a grounded language-image pretraining (GLIP) model for learning object-level, language-aware, and semantic-rich visual representations…”
    Conference Proceeding
  5.

    CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows by Dong, Xiaoyi, Bao, Jianmin, Chen, Dongdong, Zhang, Weiming, Yu, Nenghai, Yuan, Lu, Chen, Dong, Guo, Baining

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…We present CSWin Transformer, an efficient and effective Transformer-based backbone for general-purpose vision tasks. A challenging issue in Transformer design…”
    Conference Proceeding
  6.

    MetaFormer is Actually What You Need for Vision by Yu, Weihao, Luo, Mi, Zhou, Pan, Si, Chenyang, Zhou, Yichen, Wang, Xinchao, Feng, Jiashi, Yan, Shuicheng

    ISSN: 1063-6919
    Published: IEEE 01.01.2022
    “… Based on this observation, we hypothesize that the general architecture of the transformers, instead of the specific token mixer module, is more essential to the model's performance…”
    Conference Proceeding
  7.

    beta-DARTS: Beta-Decay Regularization for Differentiable Architecture Search by Ye, Peng, Li, Baopu, Li, Yikang, Chen, Tao, Fan, Jiayuan, Ouyang, Wanli

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…Neural Architecture Search (NAS) has attracted increasingly more attention in recent years because of its capability to design deep neural network automatically…”
    Conference Proceeding
  8.

    SLIC: Self-Supervised Learning with Iterative Clustering for Human Action Videos by Khorasgani, Salar Hosseini, Chen, Yuxuan, Shkurti, Florian

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…Self-supervised methods have significantly closed the gap with end-to-end supervised learning for image classification [13], [24…”
    Conference Proceeding
  9.

    Multimodal Token Fusion for Vision Transformers by Wang, Yikai, Chen, Xinghao, Cao, Lele, Huang, Wenbing, Sun, Fuchun, Wang, Yunhe

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…Many adaptations of transformers have emerged to address the single-modal vision tasks, where self-attention modules are stacked to handle input sources like…”
    Conference Proceeding
  10.

    Knowledge Distillation via the Target-aware Transformer by Lin, Sihao, Xie, Hongwei, Wang, Bing, Yu, Kaicheng, Chang, Xiaojun, Liang, Xiaodan, Wang, Gang

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “… However, people tend to overlook the fact that, due to the architecture differences, the semantic information on the same spatial location usually vary…”
    Conference Proceeding
  11.

    Single-Domain Generalized Object Detection in Urban Scene via Cyclic-Disentangled Self-Distillation by Wu, Aming, Deng, Cheng

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “… And we consider a realistic yet challenging scenario, namely Single-Domain Generalized Object Detection (Single-DGOD…”
    Conference Proceeding
  12.

    TransMix: Attend to Mix for Vision Transformers by Chen, Jie-Neng, Sun, Shuyang, He, Ju, Torr, Philip, Yuille, Alan, Bai, Song

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…Mixup-based augmentation has been found to be effective for generalizing models during training, especially for Vision Transformers (ViTs) since they can…”
    Conference Proceeding
  13.

    Unbiased Teacher v2: Semi-supervised Object Detection for Anchor-free and Anchor-based Detectors by Liu, Yen-Cheng, Ma, Chih-Yao, Kira, Zsolt

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…With the recent development of Semi-Supervised Object Detection (SS-OD) techniques, object detectors can be improved by using a limited amount of labeled data and abundant unlabeled data…”
    Conference Proceeding
  14.

    MiniViT: Compressing Vision Transformers with Weight Multiplexing by Zhang, Jinnian, Peng, Houwen, Wu, Kan, Liu, Mengchen, Xiao, Bin, Fu, Jianlong, Yuan, Lu

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…Vision Transformer (ViT) models have recently drawn much attention in computer vision due to their high model capability. However, ViT models suffer from huge…”
    Conference Proceeding
  15.

    TableFormer: Table Structure Understanding with Transformers by Nassar, Ahmed, Livathinos, Nikolaos, Lysak, Maksym, Staar, Peter

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “… In this paper, we present a new table-structure identification model. The latter improves the latest end-to-end deep learning model (i.e…”
    Conference Proceeding
  16.

    VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention by Deng, Shengheng, Liang, Zhihao, Sun, Lin, Jia, Kui

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…Detecting objects from LiDAR point clouds is of tremendous significance in autonomous driving…”
    Conference Proceeding
  17.

    Human-Object Interaction Detection via Disentangled Transformer by Zhou, Desen, Liu, Zhichao, Wang, Jian, Wang, Leshan, Hu, Tao, Ding, Errui, Wang, Jingdong

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…Human-Object Interaction Detection tackles the problem of joint localization and classification of human object interactions…”
    Conference Proceeding
  18.

    Progressive End-to-End Object Detection in Crowded Scenes by Zheng, Anlin, Zhang, Yuang, Zhang, Xiangyu, Qi, Xiaojuan, Sun, Jian

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…In this paper, we propose a new query-based detection framework for crowd detection…”
    Conference Proceeding
  19.

    DTA: Physical Camouflage Attacks using Differentiable Transformation Network by Suryanto, Naufal, Kim, Yongsu, Kang, Hyoeun, Larasati, Harashta Tatimma, Yun, Youngyeo, Le, Thi-Thu-Huong, Yang, Hunmin, Oh, Se-Yoon, Kim, Howon

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “… In this paper, we propose the Differentiable Transformation Attack (DTA), a framework for generating a robust physical adversarial pattern on a target object to camouflage it against object detection models with a wide range of transformations…”
    Conference Proceeding
  20.

    MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens by Fang, Jiemin, Xie, Lingxi, Wang, Xinggang, Zhang, Xiaopeng, Liu, Wenyu, Tian, Qi

    ISSN: 1063-6919
    Published: IEEE 01.06.2022
    “…Transformers have offered a new methodology of designing neural networks for visual recognition…”
    Conference Proceeding