BoMW: Bag of Manifold Words for One-Shot Learning Gesture Recognition From Kinect

Bibliographic details
Published in:	IEEE Transactions on Circuits and Systems for Video Technology, Vol. 28, No. 10, pp. 2562-2573
Main authors:	Zhang, Lei; Zhang, Shengping; Jiang, Feng; Qi, Yuankai; Zhang, Jun; Guo, Yuliang; Zhou, Huiyu
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 1 October 2018; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Coding; covariance descriptor; Covariance matrix; Encoding; Euclidean geometry; Euclidean space; Feature extraction; Gesture recognition; Hidden Markov models; kernel sparse coding; Learning; Manifolds; Mathematical analysis; Matrix methods; Representations; reproducing kernel Hilbert space; Riemannian manifold; Shape; State of the art; Teaching methods; Three-dimensional displays; Video data
ISSN:	1051-8215, 1558-2205

Description
Summary:	In this paper, we study one-shot learning gesture recognition on RGB-D data recorded from Microsoft's Kinect. To this end, we propose a novel bag of manifold words (BoMW)-based feature representation on symmetric positive definite (SPD) manifolds. In particular, we use covariance matrices to extract local features from RGB-D data because of their compact representation ability and the convenience of fusing RGB and depth information. Since covariance matrices are SPD matrices and the space they form is the SPD manifold, traditional learning methods designed for Euclidean space, such as sparse coding, cannot be directly applied to them. To overcome this problem, we propose a unified framework that transfers sparse coding on SPD manifolds to sparse coding in Euclidean space, which enables any existing learning method to be used. After building the BoMW representation on a video from each gesture class, a nearest neighbor classifier is adopted to perform the one-shot learning gesture recognition. Experimental results on the ChaLearn gesture data set demonstrate the outstanding performance of the proposed one-shot learning gesture recognition method compared with state-of-the-art methods. The effectiveness of the proposed feature extraction method is also validated on a new RGB-D action recognition data set.
DOI:	10.1109/TCSVT.2017.2721108
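
Illustrative sketch:	The summary mentions three ingredients: covariance descriptors, a mapping from the SPD manifold to a Euclidean space, and nearest neighbor classification with one example per class. The Python sketch below illustrates those ideas under strong simplifying assumptions and is not the paper's method. It swaps in a log-Euclidean mapping (the matrix logarithm) for the kernel sparse coding framework the record describes, and it uses a single descriptor per sample rather than a bag of words built from many local descriptors. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def covariance_descriptor(features, eps=1e-6):
    """Covariance of per-pixel feature vectors, shape (n_samples, d) -> (d, d).

    A small ridge (eps, an assumed value) keeps the result strictly
    positive definite even when the sample covariance is rank deficient.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(features.shape[0] - 1, 1)
    return cov + eps * np.eye(features.shape[1])


def log_euclidean_vector(spd):
    """Map an SPD matrix to a Euclidean vector via the matrix logarithm.

    This is the log-Euclidean embedding, used here as a simple stand-in
    for the kernel-based mapping named in the summary.
    """
    eigvals, eigvecs = np.linalg.eigh(spd)
    log_spd = (eigvecs * np.log(eigvals)) @ eigvecs.T
    rows, cols = np.triu_indices(spd.shape[0])
    # Off-diagonal entries are scaled by sqrt(2) so the vector's Euclidean
    # norm equals the Frobenius norm of the log matrix.
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * log_spd[rows, cols]


def nearest_neighbor_label(query, templates):
    """Return the label of the closest template (one template per class)."""
    return min(templates, key=lambda label: np.linalg.norm(query - templates[label]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for per-pixel RGB-D features of two gesture classes.
    wave = rng.normal(size=(500, 5))
    push = 3.0 * rng.normal(size=(500, 5))
    templates = {
        "wave": log_euclidean_vector(covariance_descriptor(wave)),
        "push": log_euclidean_vector(covariance_descriptor(push)),
    }
    query = log_euclidean_vector(covariance_descriptor(rng.normal(size=(500, 5))))
    print(nearest_neighbor_label(query, templates))
```

Once descriptors are mapped to vectors this way, ordinary Euclidean tools (distances, clustering, sparse coding) can be applied to them, which is the general motivation the summary gives for moving off the manifold.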