Modeling and reinforcement learning-based locomotion control for a humanoid robot with kinematic loop closures

Bibliographic Details
Published in: Multibody System Dynamics, Vol. 65, No. 2, pp. 239-265
Main Authors: Tang, Lingling; Liang, Dingkun; Gao, Guang; Wang, Xin; Xie, Anhuan
Format: Journal Article
Language: English
Published: Dordrecht: Springer Netherlands (Springer Nature B.V.), 01.10.2025
ISSN: 1384-5640, 1573-272X
Description
Summary: Humanoid robots are complex multibody systems, and their modeling and locomotion control are challenging tasks. In this paper, a rigid multibody model is first built for a home-made humanoid robot with kinematic loop closures. Inverse kinematics solutions based on geometric relationships are then presented for the parallel mechanisms of the knee and ankle joints, and contact detection procedures are simplified for foot–ground interactions on flat terrain and for collisions between the legs. Based on this modeling work, a deep reinforcement learning (RL)-based strategy is presented for locomotion control. The reward function in the RL environment is carefully designed: the foot periodic cycle penalty is implemented based on the complementarity conditions of foot velocity and foot–ground interaction force. A new method is proposed to encourage a symmetric gait by penalizing the differences in the mean values and standard deviations between left and right joint angles, and whole-body coordination is realized by tracking a pair of reference trajectories for the shoulder pitch degrees of freedom (DoFs). Finally, to verify the effectiveness of the proposed RL-based locomotion control strategy, we present several training cases, each with a separate RL agent; the goals of a foot periodic cycle with a frequency of 2 Hz, gait symmetry, forward speed up to 10 km/h, a whole-body coordinated gait, and time-varying velocity command tracking are all successfully achieved.
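The gait-symmetry idea described in the abstract (penalizing differences in the mean values and standard deviations between left and right joint angles) can be illustrated with a minimal sketch. The paper's exact formula and weights are not given here, so the function name `symmetry_penalty`, the per-joint absolute-difference form, and the weights `w_mean`/`w_std` are assumptions for illustration only:

```python
import numpy as np

def symmetry_penalty(left_angles, right_angles, w_mean=1.0, w_std=1.0):
    """Penalize asymmetry between left- and right-leg joint-angle
    trajectories collected over a rollout window.

    left_angles, right_angles: arrays of shape (T, n_joints), the
    mirrored joint angles sampled over T time steps.
    Returns a non-negative penalty to subtract from the RL reward.
    """
    left = np.asarray(left_angles, dtype=float)
    right = np.asarray(right_angles, dtype=float)
    # Difference in per-joint means over the window
    mean_diff = np.abs(left.mean(axis=0) - right.mean(axis=0))
    # Difference in per-joint standard deviations over the window
    std_diff = np.abs(left.std(axis=0) - right.std(axis=0))
    return w_mean * mean_diff.sum() + w_std * std_diff.sum()
```

A perfectly mirrored gait yields a zero penalty, while a constant offset or amplitude mismatch between the legs produces a positive term, which is the behavior the reward shaping is meant to discourage.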
DOI: 10.1007/s11044-024-10035-z