Pose Recognition with Cascade Transformers

| Published in: | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 1944-1953 |
|---|---|
| Main authors: | Li, Ke; Wang, Shijie; Zhang, Xiang; Xu, Yifan; Xu, Weijian; Tu, Zhuowen |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 1 June 2021 |
| ISSN: | 1063-6919 |

| Abstract | In this paper, we present a regression-based pose recognition method using cascade Transformers. One way to categorize the existing approaches in this domain is to separate them into (1) heatmap-based and (2) regression-based. In general, heatmap-based methods achieve higher accuracy but are subject to various heuristic designs (mostly not end-to-end), whereas regression-based approaches attain relatively lower accuracy but have fewer intermediate non-differentiable steps. Here we utilize the encoder-decoder structure in Transformers to perform regression-based person and keypoint detection that is general-purpose and requires less heuristic design than the existing approaches. We demonstrate the keypoint hypothesis (query) refinement process across different self-attention layers to reveal the recursive self-attention mechanism in Transformers. In the experiments, we report competitive results for pose recognition when compared with the competing regression-based methods. |
|---|---|
| Author | Xu, Weijian; Tu, Zhuowen; Xu, Yifan; Zhang, Xiang; Li, Ke; Wang, Shijie |
| Author_xml | 1. Li, Ke (keliictcas@gmail.com), University of Chinese Academy of Sciences, Beijing, China; 2. Wang, Shijie (wang98thu@gmail.com), Tsinghua University, Beijing, China; 3. Zhang, Xiang (zx1239856@gmail.com), Tsinghua University, Beijing, China; 4. Xu, Yifan (yix081@ucsd.edu), University of California San Diego, San Diego, USA; 5. Xu, Weijian (wex041@ucsd.edu), University of California San Diego, San Diego, USA; 6. Tu, Zhuowen (ztu@ucsd.edu), University of California San Diego, San Diego, USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IH CBEJK RIE RIO |
| DOI | 10.1109/CVPR46437.2021.00198 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Applied Sciences |
| EISBN | 1665445092 9781665445092 |
| EISSN | 1063-6919 |
| EndPage | 1953 |
| ExternalDocumentID | 9578344 |
| Genre | orig-research |
| ISICitedReferencesCount | 215 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-June |
| PublicationDateYYYYMMDD | 2021-06-01 |
| PublicationDate_xml | – month: 06 year: 2021 text: 2021-June |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2021 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 1944 |
| SubjectTerms | Computer vision; Decoding; Heating systems; Pattern recognition; Task analysis; Transformers; Visualization |
| Title | Pose Recognition with Cascade Transformers |
| URI | https://ieeexplore.ieee.org/document/9578344 |