Multidirection and Multiscale Pyramid in Transformer for Video-Based Pedestrian Retrieval

In video surveillance, pedestrian retrieval (also called person reidentification) is a critical task. This task aims to retrieve the pedestrian of interest from nonoverlapping cameras. Recently, transformer-based models have achieved significant progress for this task. However, these models still su...

Bibliographic details
Published in: IEEE Transactions on Industrial Informatics, Vol. 18, No. 12, pp. 8776-8785
Main authors: Zang, Xianghao; Li, Ge; Gao, Wei
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 1 December 2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN: 1551-3203, 1941-0050
Online access: Full text
Abstract: In video surveillance, pedestrian retrieval (also called person reidentification) is a critical task. This task aims to retrieve the pedestrian of interest from nonoverlapping cameras. Recently, transformer-based models have achieved significant progress for this task. However, these models still suffer from ignoring fine-grained, part-informed information. This article proposes a multidirection and multiscale Pyramid in Transformer (PiT) to solve this problem. In transformer-based architecture, each pedestrian image is split into many patches. Then, these patches are fed to transformer layers to obtain the feature representation of this image. To explore the fine-grained information, this article proposes to apply vertical division and horizontal division on these patches to generate different-direction human parts. These parts provide more fine-grained information. To fuse multiscale feature representation, this article presents a pyramid structure containing global-level information and many pieces of local-level information from different scales. The feature pyramids of all the pedestrian images from the same video are fused to form the final multidirection and multiscale feature representation. Experimental results on two challenging video-based benchmarks, MARS and iLIDS-VID, show the proposed PiT achieves state-of-the-art performance. Extensive ablation studies demonstrate the superiority of the proposed pyramid structure. Data is available online at https://git.openi.org.cn/zangxh/PiT.git .
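The division and pyramid steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how parts are pooled or how frames are fused, so the mean pooling, the frame averaging, the `scales=(2, 3)` default, the mapping of "horizontal" and "vertical" to grid axes, and all function names here are assumptions made for illustration only.

```python
import numpy as np


def part_features(tokens, num_parts, axis):
    """Mean-pool a patch grid into `num_parts` bands along `axis`.

    tokens: array of shape (rows, cols, dim), one embedding per image patch.
    axis=0 cuts the grid into bands stacked top to bottom;
    axis=1 cuts it into bands placed side by side.
    Mean pooling is an assumption, not taken from the article.
    """
    bands = np.array_split(tokens, num_parts, axis=axis)
    return [band.reshape(-1, tokens.shape[-1]).mean(axis=0) for band in bands]


def frame_pyramid(tokens, scales=(2, 3)):
    """Concatenate one global vector with part vectors from both directions."""
    dim = tokens.shape[-1]
    levels = [tokens.reshape(-1, dim).mean(axis=0)]  # global level
    for num_parts in scales:  # each scale = a number of parts (hypothetical values)
        levels += part_features(tokens, num_parts, axis=0)
        levels += part_features(tokens, num_parts, axis=1)
    return np.concatenate(levels)


def video_pyramid(frames, scales=(2, 3)):
    """Fuse per-frame pyramids by averaging over the frames of one video.

    Averaging is a stand-in for whatever fusion the article actually uses.
    """
    return np.mean([frame_pyramid(f, scales) for f in frames], axis=0)
```

For a video of 5 frames, each with a 6 x 4 patch grid of 8-dimensional embeddings, `video_pyramid` returns one vector holding 1 global part plus 2 x (2 + 3) local parts, each of dimension 8.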
Author Li, Ge
Gao, Wei
Zang, Xianghao
Author_xml – sequence: 1
  givenname: Xianghao
  orcidid: 0000-0001-8421-7167
  surname: Zang
  fullname: Zang, Xianghao
  email: zangxh@pku.edu.cn
  organization: School of Electronic and Computer Engineering, Peking University, Shenzhen, China
– sequence: 2
  givenname: Ge
  orcidid: 0000-0003-0140-0949
  surname: Li
  fullname: Li, Ge
  email: geli@ece.pku.edu.cn
  organization: School of Electronic and Computer Engineering, Peking University, Shenzhen, China
– sequence: 3
  givenname: Wei
  orcidid: 0000-0001-7429-5495
  surname: Gao
  fullname: Gao, Wei
  email: gaowei262@pku.edu.cn
  organization: School of Electronic and Computer Engineering, Peking University, Shenzhen, China
CODEN ITIICH
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DOI 10.1109/TII.2022.3151766
Discipline Engineering
EISSN 1941-0050
EndPage 8785
ExternalDocumentID 10_1109_TII_2022_3151766
9714137
Genre orig-research
GrantInformation_xml – fundername: National Key R&D Program of China
  grantid: 2020AAA0103501
ISICitedReferencesCount 43
ISSN 1551-3203
IsPeerReviewed false
IsScholarly true
Issue 12
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0001-8421-7167
0000-0001-7429-5495
0000-0003-0140-0949
PageCount 10
PublicationCentury 2000
PublicationDate 2022-12-01
PublicationDateYYYYMMDD 2022-12-01
PublicationDate_xml – month: 12
  year: 2022
  text: 2022-12-01
  day: 01
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE transactions on industrial informatics
PublicationTitleAbbrev TII
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 8776
SubjectTerms Ablation
Cameras
Convolution
Feature extraction
Kernel
Multidirection and multiscale pyramid
Natural language processing
Patches (structures)
Pyramids
Representations
Retrieval
Task analysis
Transformers
video-based pedestrian retrieval
vision transformer
Title Multidirection and Multiscale Pyramid in Transformer for Video-Based Pedestrian Retrieval
URI https://ieeexplore.ieee.org/document/9714137
https://www.proquest.com/docview/2719555928
Volume 18