Jointly Learning Visual Poses and Pose Lexicon for Semantic Action Recognition
A novel method for semantic action recognition through learning a pose lexicon is presented in this paper. A pose lexicon comprises a set of semantic poses, a set of visual poses, and a probabilistic mapping between the visual and semantic poses. This paper assumes that both the visual poses and map...
| Published in: | IEEE Transactions on Circuits and Systems for Video Technology, Vol. 30, No. 2, pp. 457-467 |
|---|---|
| Main authors: | Lijuan Zhou, Wanqing Li, Philip Ogunbona, Zhengyou Zhang |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: IEEE, 1 February 2020. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN: | 1051-8215, 1558-2205 |
| Abstract | A novel method for semantic action recognition through learning a pose lexicon is presented in this paper. A pose lexicon comprises a set of semantic poses, a set of visual poses, and a probabilistic mapping between the visual and semantic poses. This paper assumes that both the visual poses and the mapping are hidden and proposes a method to simultaneously learn a visual pose model that estimates the likelihood of an observed video frame being generated from hidden visual poses, and a pose lexicon model that establishes the probabilistic mapping between the hidden visual poses and the semantic poses parsed from textual instructions. Specifically, the proposed method consists of two-level hidden Markov models. One level represents the alignment between the visual poses and semantic poses. The other level represents a visual pose sequence, and each visual pose is modeled as a Gaussian mixture. An expectation-maximization algorithm is developed to train a pose lexicon. With the learned lexicon, action classification is formulated as a problem of finding the maximum posterior probability of a given sequence of video frames that follows a given sequence of semantic poses, constrained by the most likely visual pose and the alignment sequences. The proposed method was evaluated on the MSRC-12, WorkoutSU-10, WorkoutUOW-18, Combined-15, Combined-17, and Combined-50 action datasets using cross-subject, cross-dataset, zero-shot, and seen/unseen protocols. |
|---|---|
| Author | Li, Wanqing; Ogunbona, Philip; Zhang, Zhengyou; Zhou, Lijuan |
| Author_xml | 1) Zhou, Lijuan (ORCID: 0000-0002-6418-6284; email: lz683@uowmail.edu.au; organization: AMRL, University of Wollongong, Wollongong, NSW, Australia); 2) Li, Wanqing (ORCID: 0000-0002-4427-2687; email: wanqing@uow.edu.au; organization: AMRL, University of Wollongong, Wollongong, NSW, Australia); 3) Ogunbona, Philip (ORCID: 0000-0003-4119-2873; email: philipo@uow.edu.au; organization: AMRL, University of Wollongong, Wollongong, NSW, Australia); 4) Zhang, Zhengyou (organization: Microsoft Research, Redmond, Shenzhen, WA, China) |
| CODEN | ITCTEM |
| CitedBy_id | crossref_primary_10_1016_j_eswa_2025_126646 crossref_primary_10_1016_j_imavis_2024_104985 crossref_primary_10_1109_TASE_2025_3553495 crossref_primary_10_3390_s20113305 crossref_primary_10_3390_electronics12204328 crossref_primary_10_1109_TCSVT_2024_3434563 crossref_primary_10_1109_TCSVT_2021_3050807 crossref_primary_10_1016_j_eswa_2025_127420 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
| DOI | 10.1109/TCSVT.2019.2890829 |
| Discipline | Engineering |
| EISSN | 1558-2205 |
| EndPage | 467 |
| Genre | orig-research |
| ISICitedReferencesCount | 16 |
| ISSN | 1051-8215 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 2 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0002-6418-6284 0000-0002-4427-2687 0000-0003-4119-2873 |
| PageCount | 11 |
| PublicationDate | 2020-02-01 |
| PublicationPlace | New York |
| PublicationTitle | IEEE Transactions on Circuits and Systems for Video Technology |
| PublicationTitleAbbrev | TCSVT |
| PublicationYear | 2020 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 457 |
| SubjectTerms | Algorithms; Alignment; Computational modeling; Conditional probability; Datasets; Expectation-maximization algorithms; Hidden Markov models; Integrated circuit modeling; Learning; Mapping; Markov chains; pose lexicon; Probabilistic logic; Recognition; Semantic action recognition; Semantics; Statistical analysis; Visual observation; visual pose; Visualization |
| Title | Jointly Learning Visual Poses and Pose Lexicon for Semantic Action Recognition |
| URI | https://ieeexplore.ieee.org/document/8600338 https://www.proquest.com/docview/2352191309 |
| Volume | 30 |
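
The abstract formulates classification as scoring a frame sequence against each action's sequence of semantic poses, using the most likely visual poses and alignment. The following is a minimal Python sketch of that idea under strong simplifying assumptions, and it is not the authors' implementation: it collapses the two-level model into a single monotonic alignment chain, ignores transitions between visual poses, takes the lexicon and Gaussian mixture parameters as already learned (the EM training step is omitted), and uses one-dimensional toy features. All names (`viterbi_score`, `classify`, `p_stay`, the pose labels, and the toy values) are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_gmm(x, components):
    """Log-likelihood of x under a mixture of (weight, mean, var) triples."""
    terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in components]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def viterbi_score(frames, semantic_seq, lexicon, visual_models, p_stay=0.5):
    """Best log score of frames aligned left-to-right to semantic_seq.

    lexicon[s][k] is an assumed P(visual pose k | semantic pose s).
    Each frame either stays on the current semantic pose or advances by one.
    """
    def emit(x, s):
        # Pick the most likely visual pose for this frame given semantic pose s.
        return max(math.log(p) + log_gmm(x, visual_models[k])
                   for k, p in lexicon[s].items() if p > 0)

    n = len(semantic_seq)
    neg_inf = float("-inf")
    delta = [neg_inf] * n
    delta[0] = emit(frames[0], semantic_seq[0])  # must start on the first pose
    for x in frames[1:]:
        new = [neg_inf] * n
        for j in range(n):
            stay = delta[j] + math.log(p_stay)
            move = delta[j - 1] + math.log(1 - p_stay) if j > 0 else neg_inf
            best = max(stay, move)
            if best > neg_inf:
                new[j] = best + emit(x, semantic_seq[j])
        delta = new
    return delta[-1]  # the alignment must end on the last semantic pose

def classify(frames, actions, lexicon, visual_models):
    """Return the action whose semantic pose sequence scores highest."""
    return max(actions, key=lambda a: viterbi_score(
        frames, actions[a], lexicon, visual_models))

# Toy, hand-set parameters (illustrative only, not learned from data).
visual_models = {
    "v_low": [(1.0, [0.0], [1.0])],   # one-component "mixture" centred at 0
    "v_high": [(1.0, [5.0], [1.0])],  # one-component "mixture" centred at 5
}
lexicon = {
    "arms down": {"v_low": 0.9, "v_high": 0.1},
    "arms up": {"v_low": 0.1, "v_high": 0.9},
}
actions = {
    "raise arms": ["arms down", "arms up"],
    "lower arms": ["arms up", "arms down"],
}
frames = [[0.1], [0.2], [4.9], [5.1]]
predicted = classify(frames, actions, lexicon, visual_models)
print(predicted)
```

With these toy values the frames move from low to high features, so the sketch selects "raise arms". Working in log space and restricting the alignment to stay-or-advance moves keeps the dynamic program linear in the number of frames times the number of semantic poses; a fuller treatment would add visual pose transition probabilities and learn every parameter with EM, as the abstract describes.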