Pedestrian Re-Recognition Based on Spatiotemporal Transformer Skeleton Contrastive Learning and Feature Optimization
| Published in: | Journal of advanced computational intelligence and intelligent informatics, Vol. 29, No. 6, pp. 1249-1261 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Tokyo: Fuji Technology Press Co. Ltd, 20 November 2025 |
| Subjects: | |
| ISSN: | 1343-0130, 1883-8014 |
| Summary: | Person re-identification is an important task in computer vision that aims to confirm identity across cameras by identifying and matching the same pedestrian in different camera views. When traditional image-based methods are affected by factors such as lighting changes, occlusion, and changes in viewing angle, the advantages of skeleton data become increasingly apparent. Existing methods typically hand-craft skeleton descriptors from primitive body joints or learn skeleton sequence representations, but they often cannot simultaneously model the relationships between different body components, and they rarely model skeleton information in both the temporal and spatial dimensions. Therefore, in this paper, we propose a universal skeleton contrastive learning method based on the spatiotemporal Transformer (Space-time Transformer, StFormer). First, the method adopts the Space-time Attention (S-T Attention) mechanism and models the relationships among spatiotemporal features by stacking multiple S-T Attention blocks. Second, to improve the model's extraction of important cues from data features, a Feature Refinement Box (FR Box) is proposed. Finally, we propose a unique prompt learning mechanism (P-Study), which uses the spatiotemporal context of graph nodes to prompt skeleton graph reconstruction and help capture more valuable patterns and graph semantics. |
|---|---|
| DOI: | 10.20965/jaciii.2025.p1249 |