Deformable Siamese Attention Networks for Visual Object Tracking
| Published in: | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) pp. 6727 - 6736 |
|---|---|
| Main Authors: | Yuechen Yu, Yilei Xiong, Weilin Huang, Matthew R. Scott |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.06.2020 |
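| Subjects: | Convolution, Correlation, Object tracking, Proposals, Target tracking, Task analysis, Visualization |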
| ISSN: | 1063-6919 |
| Abstract | Siamese-based trackers have achieved excellent performance on visual object tracking. However, the target template is not updated online, and the features of the target template and the search image are computed independently in a Siamese architecture. In this paper, we propose Deformable Siamese Attention Networks, referred to as SiamAttn, by introducing a new Siamese attention mechanism that computes deformable self-attention and cross-attention. The self-attention learns strong context information via spatial attention, and selectively emphasizes interdependent channel-wise features with channel attention. The cross-attention aggregates rich contextual interdependencies between the target template and the search image, providing an implicit way to adaptively update the target template. In addition, we design a region refinement module that computes depth-wise cross correlations between the attentional features for more accurate tracking. We conduct experiments on six benchmarks, where our method achieves new state-of-the-art results, outperforming a recent strong baseline, SiamRPN++, by raising EAO from 0.464 to 0.537 on VOT 2016 and from 0.415 to 0.470 on VOT 2018. |
|---|---|
| Author | Scott, Matthew R.; Yu, Yuechen; Huang, Weilin; Xiong, Yilei |
| Author_xml | – sequence: 1 givenname: Yuechen surname: Yu fullname: Yu, Yuechen organization: Malong Technologies – sequence: 2 givenname: Yilei surname: Xiong fullname: Xiong, Yilei organization: Malong Technologies – sequence: 3 givenname: Weilin surname: Huang fullname: Huang, Weilin organization: Malong Technologies – sequence: 4 givenname: Matthew R. surname: Scott fullname: Scott, Matthew R. organization: Malong Technologies |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IH CBEJK RIE RIO |
| DOI | 10.1109/CVPR42600.2020.00676 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Xplore IEEE Proceedings Order Plans (POP) 1998-present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Applied Sciences |
| EISBN | 9781728171685 1728171687 |
| EISSN | 1063-6919 |
| EndPage | 6736 |
| ExternalDocumentID | 9156275 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IH 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IJVOP OCL RIE RIL RIO |
| ID | FETCH-LOGICAL-i203t-681b699a956103b23393d7f56ba02f26cbb0591eaca2e6b90c3126e9a55f2df53 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 404 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000620679507001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Wed Aug 27 02:30:35 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i203t-681b699a956103b23393d7f56ba02f26cbb0591eaca2e6b90c3126e9a55f2df53 |
| PageCount | 10 |
| ParticipantIDs | ieee_primary_9156275 |
| PublicationCentury | 2000 |
| PublicationDate | 2020-Jun |
| PublicationDateYYYYMMDD | 2020-06-01 |
| PublicationDate_xml | – month: 06 year: 2020 text: 2020-Jun |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2020 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0003211698 |
| Score | 2.6375852 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 6727 |
| SubjectTerms | Convolution; Correlation; Object tracking; Proposals; Target tracking; Task analysis; Visualization |
| Title | Deformable Siamese Attention Networks for Visual Object Tracking |
| URI | https://ieeexplore.ieee.org/document/9156275 |
| WOSCitedRecordID | wos000620679507001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=Proceedings+%28IEEE+Computer+Society+Conference+on+Computer+Vision+and+Pattern+Recognition.+Online%29&rft.atitle=Deformable+Siamese+Attention+Networks+for+Visual+Object+Tracking&rft.au=Yu%2C+Yuechen&rft.au=Xiong%2C+Yilei&rft.au=Huang%2C+Weilin&rft.au=Scott%2C+Matthew+R.&rft.date=2020-06-01&rft.pub=IEEE&rft.eissn=1063-6919&rft.spage=6727&rft.epage=6736&rft_id=info:doi/10.1109%2FCVPR42600.2020.00676&rft.externalDocID=9156275 |
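
The abstract mentions a region refinement module built on depth-wise cross correlations between template and search features. As a rough illustration only, the operation can be sketched in plain Python as below. This is not the authors' implementation: the function name `depthwise_xcorr`, the nested-list feature layout, and the toy values are all assumptions made for this example, and a real tracker would run the same computation on tensors in a deep learning framework.

```python
def depthwise_xcorr(search, template):
    """Correlate each channel of `search` with the same channel of `template`.

    Hypothetical helper for illustration (not from the paper's code).
    search:   list of C channels, each an H x W list of lists
    template: list of C channels, each an h x w list of lists (h <= H, w <= W)
    returns:  list of C response maps, each (H - h + 1) x (W - w + 1)
    """
    if len(search) != len(template):
        raise ValueError("search and template must have the same number of channels")
    responses = []
    for s_ch, t_ch in zip(search, template):
        H, W = len(s_ch), len(s_ch[0])
        h, w = len(t_ch), len(t_ch[0])
        out = []
        # Slide the template channel over the search channel (no padding, stride 1).
        for i in range(H - h + 1):
            row = []
            for j in range(W - w + 1):
                row.append(sum(
                    s_ch[i + di][j + dj] * t_ch[di][dj]
                    for di in range(h) for dj in range(w)
                ))
            out.append(row)
        responses.append(out)
    return responses


# Toy example (assumed values): 2 channels, 3x3 search features, 2x2 template features.
search = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
]
template = [
    [[1, 0], [0, 1]],
    [[0, 0], [0, 1]],
]
response = depthwise_xcorr(search, template)
```

Each channel produces its own response map, so the output keeps the channel dimension instead of summing across channels as an ordinary convolution would. That per-channel independence is what distinguishes depth-wise correlation from a standard cross correlation.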