PoTion: Pose MoTion Representation for Action Recognition
| Published in: | 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7024-7033 |
|---|---|
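| Main authors: | Choutas, Vasileios; Weinzaepfel, Philippe; Revaud, Jerome; Schmid, Cordelia |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 1 June 2018 |
| Subject: | Computer architecture; Heating systems; Image color analysis; Joints; Optical imaging; Streaming media |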
| ISSN: | 1063-6919 |
| Online access: | Get full text |
| Abstract | Most state-of-the-art methods for action recognition rely on a two-stream architecture that processes appearance and motion independently. In this paper, we claim that considering them jointly offers rich information for action recognition. We introduce a novel representation that gracefully encodes the movement of some semantic keypoints. We use the human joints as these keypoints and term our Pose moTion representation PoTion. Specifically, we first run a state-of-the-art human pose estimator [4] and extract heatmaps for the human joints in each frame. We obtain our PoTion representation by temporally aggregating these probability maps. This is achieved by 'colorizing' each of them depending on the relative time of the frames in the video clip and summing them. This fixed-size representation for an entire video clip is suitable to classify actions using a shallow convolutional neural network. Our experimental evaluation shows that PoTion outperforms other state-of-the-art pose representations [6, 48]. Furthermore, it is complementary to standard appearance and motion streams. When combining PoTion with the recent two-stream I3D approach [5], we obtain state-of-the-art performance on the JHMDB, HMDB and UCF101 datasets. |
|---|---|
| Author | Choutas, Vasileios; Weinzaepfel, Philippe; Revaud, Jerome; Schmid, Cordelia |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CVPR.2018.00734 |
| Discipline | Applied Sciences |
| EISBN | 9781538664209 1538664208 |
| EISSN | 1063-6919 |
| EndPage | 7033 |
| ExternalDocumentID | 8578832 |
| Genre | orig-research |
| ISICitedReferencesCount | 232 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| OpenAccessLink | https://inria.hal.science/hal-01764222 |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2018-Jun |
| PublicationDateYYYYMMDD | 2018-06-01 |
| PublicationDecade | 2010 |
| PublicationTitle | 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2018 |
| Publisher | IEEE |
| StartPage | 7024 |
| SubjectTerms | Computer architecture; Heating systems; Image color analysis; Joints; Optical imaging; Streaming media |
| Title | PoTion: Pose MoTion Representation for Action Recognition |
| URI | https://ieeexplore.ieee.org/document/8578832 |
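
The abstract describes temporal aggregation as 'colorizing' each joint heatmap according to the relative time of its frame and summing the results. The following is a minimal sketch of that idea for a single joint, not the authors' implementation. The two-channel linear color code in `time_color`, the function names, and the toy data are all assumptions made for illustration; the record itself does not specify the number of channels or any normalization step.

```python
from typing import List, Tuple

# One joint's probability map for one frame, indexed [y][x].
Heatmap = List[List[float]]


def time_color(t: float) -> Tuple[float, float]:
    """Hypothetical 2-channel color code for relative time t in [0, 1].

    Channel 0 fades out and channel 1 fades in as the clip progresses.
    This linear scheme is an assumption, not taken from the record.
    """
    return (1.0 - t, t)


def aggregate_heatmaps(frames: List[Heatmap]) -> List[List[List[float]]]:
    """Colorize each frame's heatmap by relative time and sum over the clip.

    Returns an image indexed [channel][y][x]; its size depends only on the
    heatmap resolution, not on the number of frames.
    """
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    n_frames = len(frames)
    out = [[[0.0] * width for _ in range(height)] for _ in range(2)]
    for i, heatmap in enumerate(frames):
        # Relative time in [0, 1]; a single-frame clip is placed at t = 0.
        t = i / (n_frames - 1) if n_frames > 1 else 0.0
        for c, weight in enumerate(time_color(t)):
            for y in range(height):
                for x in range(width):
                    out[c][y][x] += weight * heatmap[y][x]
    return out


# Toy clip: a joint moving left to right across a 1x3 heatmap in 3 frames.
clip = [
    [[1.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0]],
]
potion = aggregate_heatmaps(clip)
```

In this toy example the early position of the joint ends up mostly in channel 0 and the late position mostly in channel 1, so the direction of motion is readable from a single fixed-size image. A full pipeline would repeat this for every joint and feed the stacked result to a classifier.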