Enhancing Data-Driven Algorithms for Human Pose Estimation and Action Recognition Through Simulation
Recognizing human actions, reliably inferring their meaning and being able to potentially exchange mutual social information are core challenges for autonomous systems when they directly share the same space with humans. Intelligent transport systems in particular face this challenge, as interaction...
Saved in:
| Published in: | IEEE transactions on intelligent transportation systems Vol. 21; no. 9; pp. 3990 - 3999 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: |
New York
IEEE
01.09.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 1524-9050, 1558-0016 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Recognizing human actions, reliably inferring their meaning and being able to potentially exchange mutual social information are core challenges for autonomous systems when they directly share the same space with humans. Intelligent transport systems in particular face this challenge, as interactions with people are often required. The development and testing of technical perception solutions is done mostly on standard vision benchmark datasets for which manual labelling of sensory ground truth has been a tedious but necessary task. Furthermore, rarely occurring human activities are underrepresented in these datasets, leading to algorithms not recognizing such activities. For this purpose, we introduce a modular simulation framework, which offers to train and validate algorithms on various human-centred scenarios. We describe the usage of simulation data to train a state-of-the-art human pose estimation algorithm to recognize unusual human activities in urban areas. Since the recognition of human actions can be an important component of intelligent transport systems, we investigated how simulations can be applied for his purpose. Laboratory experiments show that we can train a recurrent neural network with only simulated data based on motion capture data and 3D avatars, which achieves an almost perfect performance in the classification of those human actions on real data. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 1524-9050 1558-0016 |
| DOI: | 10.1109/TITS.2020.2988504 |