Modeling and reinforcement learning-based locomotion control for a humanoid robot with kinematic loop closures

Bibliographic Details
Published in: Multibody System Dynamics, Vol. 65, No. 2, pp. 239-265
Authors: Tang, Lingling; Liang, Dingkun; Gao, Guang; Wang, Xin; Xie, Anhuan
Format: Journal Article
Language: English
Published: Dordrecht: Springer Netherlands (Springer Nature B.V.), 01.10.2025
ISSN: 1384-5640, 1573-272X
Abstract: Humanoid robots are complex multibody systems, and modeling and locomotion control for them are challenging tasks. In this paper, a rigid multibody model is first built for a home-made humanoid robot with kinematic loop closures. Inverse kinematics solutions based on geometric relationships are then presented for the parallel mechanisms of the knee and ankle joints, and contact detection procedures for foot–ground interactions on flat terrain and for collisions between legs are simplified. Based on this modeling work, a deep reinforcement learning (RL)-based strategy is presented for locomotion control. The reward function in the RL environment is carefully designed: the foot periodic cycle penalty is implemented based on the complementarity conditions of foot velocity and foot–ground interaction force. A new method is proposed to encourage a symmetric gait by penalizing the differences in the mean values and standard deviations between left and right joint angles, and whole-body coordination is realized by tracking a pair of reference trajectories for the shoulder pitch degrees of freedom (DoFs). Finally, to verify the effectiveness of the proposed RL-based locomotion control strategy, several training cases are presented, each with a separate RL agent; the goals of a foot periodic cycle at 2 Hz, gait symmetry, forward speed up to 10 km/h, a whole-body coordinated gait, and time-varying velocity command tracking are successfully achieved.
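
The abstract describes the symmetry and foot periodic cycle reward terms only at a high level. The following is a minimal NumPy sketch of how such terms could be written, assuming per-window joint-angle statistics for the symmetry term and a per-foot 2 Hz phase clock for the periodic-cycle term; all function names, weights, and signatures are illustrative and are not taken from the paper.

import numpy as np

# Illustrative weights; the paper's actual coefficients are not given in the abstract.
W_MEAN, W_STD, W_CYCLE = 1.0, 1.0, 1.0e-3

def symmetry_penalty(q_left, q_right):
    """Gait-symmetry term (sketch): penalize the differences in the mean
    values and the standard deviations of corresponding left/right joint
    angles collected over a gait window.
    q_left, q_right: arrays of shape (T, n_joints)."""
    mean_diff = np.abs(q_left.mean(axis=0) - q_right.mean(axis=0)).sum()
    std_diff = np.abs(q_left.std(axis=0) - q_right.std(axis=0)).sum()
    return -(W_MEAN * mean_diff + W_STD * std_diff)

def foot_cycle_penalty(phase, foot_speed, contact_force, duty=0.5):
    """Periodic foot-cycle term (sketch): with a 2 Hz phase clock, each foot
    is commanded to swing for the first `duty` fraction of its cycle and to
    stand for the rest. Complementarity of foot velocity and foot-ground
    force is encouraged by penalizing contact force during commanded swing
    and foot speed during commanded stance.
    phase: (n_feet,) per-foot phase in [0, 1), left/right offset by 0.5.
    foot_speed, contact_force: (n_feet,) magnitudes."""
    swing = phase < duty                                   # commanded swing windows
    pen = np.where(swing, np.abs(contact_force), np.abs(foot_speed))
    return -W_CYCLE * pen.sum()

# Example usage with random rollout data (purely illustrative):
rng = np.random.default_rng(0)
q_l, q_r = rng.normal(size=(50, 6)), rng.normal(size=(50, 6))
print(symmetry_penalty(q_l, q_r))
print(foot_cycle_penalty(np.array([0.2, 0.7]),            # left foot in commanded swing
                         np.array([0.3, 0.4]),            # foot speeds
                         np.array([300.0, 400.0])))       # vertical contact forces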
DOI: 10.1007/s11044-024-10035-z