Deformable Siamese Attention Networks for Visual Object Tracking

Bibliographic Details
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 6727-6736
Main Authors: Yu, Yuechen, Xiong, Yilei, Huang, Weilin, Scott, Matthew R.
Format: Conference Proceeding
Language: English
Published: IEEE 01.06.2020
ISSN: 1063-6919
Abstract Siamese-based trackers have achieved excellent performance on visual object tracking. However, the target template is not updated online, and the features of the target template and the search image are computed independently in a Siamese architecture. In this paper, we propose Deformable Siamese Attention Networks, referred to as SiamAttn, by introducing a new Siamese attention mechanism that computes deformable self-attention and cross-attention. The self-attention learns strong context information via spatial attention and selectively emphasizes interdependent channel-wise features with channel attention. The cross-attention aggregates rich contextual interdependencies between the target template and the search image, providing an implicit way to adaptively update the target template. In addition, we design a region refinement module that computes depth-wise cross-correlations between the attentional features for more accurate tracking. We conduct experiments on six benchmarks, where our method achieves new state-of-the-art results, outperforming the recent strong baseline SiamRPN++ by raising EAO from 0.464 to 0.537 on VOT 2016 and from 0.415 to 0.470 on VOT 2018.
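Note: the depth-wise cross-correlation mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not the authors' implementation: the function name `depthwise_xcorr`, the nested-list feature layout, and the toy inputs are assumptions made for illustration only. Each channel of the search features is correlated with the matching channel of the template features, giving one response map per channel.

```python
def depthwise_xcorr(search, template):
    """Correlate each search channel with the matching template channel.

    search:   [C][Hx][Wx] nested lists (hypothetical feature layout)
    template: [C][Hz][Wz] nested lists, with Hz <= Hx and Wz <= Wx
    returns:  [C][Hx-Hz+1][Wx-Wz+1] response maps, one per channel
    """
    if len(search) != len(template):
        raise ValueError("search and template must have the same channel count")
    responses = []
    for x_c, z_c in zip(search, template):
        hz, wz = len(z_c), len(z_c[0])
        h_out = len(x_c) - hz + 1
        w_out = len(x_c[0]) - wz + 1
        # Slide the template channel over the search channel (no padding,
        # stride 1) and sum the element-wise products at each offset.
        resp = [
            [
                sum(x_c[i + u][j + v] * z_c[u][v]
                    for u in range(hz) for v in range(wz))
                for j in range(w_out)
            ]
            for i in range(h_out)
        ]
        responses.append(resp)
    return responses


# Toy example: 2 channels, 3x3 search features, 2x2 template features.
search = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
]
template = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
]
print(depthwise_xcorr(search, template))
```

Because channels are never mixed, the output keeps the channel count of its inputs. A real tracker would run this on learned feature tensors with a deep-learning framework rather than nested lists.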
Author Scott, Matthew R.
Yu, Yuechen
Huang, Weilin
Xiong, Yilei
Author_xml – sequence: 1
  givenname: Yuechen
  surname: Yu
  fullname: Yu, Yuechen
  organization: Malong Technologies
– sequence: 2
  givenname: Yilei
  surname: Xiong
  fullname: Xiong, Yilei
  organization: Malong Technologies
– sequence: 3
  givenname: Weilin
  surname: Huang
  fullname: Huang, Weilin
  organization: Malong Technologies
– sequence: 4
  givenname: Matthew R.
  surname: Scott
  fullname: Scott, Matthew R.
  organization: Malong Technologies
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/CVPR42600.2020.00676
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Xplore
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Xplore
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
EISBN 9781728171685
1728171687
EISSN 1063-6919
EndPage 6736
ExternalDocumentID 9156275
Genre orig-research
ISICitedReferencesCount 404
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000620679507001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:30:35 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 10
ParticipantIDs ieee_primary_9156275
PublicationCentury 2000
PublicationDate 2020-Jun
PublicationDateYYYYMMDD 2020-06-01
PublicationDate_xml – month: 06
  year: 2020
  text: 2020-Jun
PublicationDecade 2020
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationYear 2020
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 6727
SubjectTerms Convolution
Correlation
Object tracking
Proposals
Target tracking
Task analysis
Visualization
Title Deformable Siamese Attention Networks for Visual Object Tracking
URI https://ieeexplore.ieee.org/document/9156275