Modeling and reinforcement learning-based locomotion control for a humanoid robot with kinematic loop closures

Bibliographic Details
Published in: Multibody System Dynamics, Vol. 65, No. 2, pp. 239-265
Main Authors: Tang, Lingling; Liang, Dingkun; Gao, Guang; Wang, Xin; Xie, Anhuan
Format: Journal Article
Language: English
Published: Dordrecht: Springer Netherlands (Springer Nature B.V.), 01.10.2025
ISSN: 1384-5640, 1573-272X
Description
Summary: Humanoid robots are complex multibody systems, and their modeling and locomotion control are challenging tasks. In this paper, a rigid multibody model is first built for a home-made humanoid robot with kinematic loop closures. Inverse kinematics solutions based on geometric relationships are then presented for the parallel mechanisms of the knee and ankle joints, and contact detection procedures are simplified for foot–ground interactions on flat terrain and for collisions between the legs. Based on this modeling work, a deep reinforcement learning (RL)-based strategy is presented for locomotion control. The reward function in the RL environment is carefully designed: the foot periodic cycle penalty is implemented based on the complementarity conditions of foot velocity and foot–ground interaction force. A new method is proposed to encourage a symmetric gait by penalizing the differences in the mean values and standard deviations between left and right joint angles, and whole-body coordination is realized by tracking a pair of reference trajectories for the shoulder pitch degrees of freedom (DoFs). Finally, to verify the effectiveness of the proposed RL-based locomotion control strategy, we present several training cases, each with a separate RL agent; the goals of a foot periodic cycle with a frequency of 2 Hz, gait symmetry, forward speed up to 10 km/h, a whole-body coordinated gait, and time-varying velocity command tracking are all successfully achieved.
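The gait-symmetry idea described in the abstract (penalizing differences in the mean values and standard deviations between left and right joint angles) can be illustrated with a minimal sketch. The paper's exact formula and weights are not given here, so the function name `symmetry_penalty`, the per-joint absolute-difference form, and the weights `w_mean`/`w_std` are assumptions for illustration only:

```python
import numpy as np

def symmetry_penalty(left_angles, right_angles, w_mean=1.0, w_std=1.0):
    """Penalize asymmetry between left- and right-leg joint-angle
    trajectories collected over a rollout window.

    left_angles, right_angles: arrays of shape (T, n_joints), the
    mirrored joint angles sampled over T time steps.
    Returns a non-negative penalty to subtract from the RL reward.
    """
    left = np.asarray(left_angles, dtype=float)
    right = np.asarray(right_angles, dtype=float)
    # Difference in per-joint means over the window
    mean_diff = np.abs(left.mean(axis=0) - right.mean(axis=0))
    # Difference in per-joint standard deviations over the window
    std_diff = np.abs(left.std(axis=0) - right.std(axis=0))
    return w_mean * mean_diff.sum() + w_std * std_diff.sum()
```

A perfectly mirrored gait yields a zero penalty, while a constant offset or amplitude mismatch between the legs produces a positive term, which is the behavior the reward shaping is meant to discourage.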
DOI: 10.1007/s11044-024-10035-z