Application of an Off‐Policy Reinforcement Learning Algorithm for H∞ Control Design of Nonlinear Structural Systems With Completely Unknown Dynamics

Bibliographic Details
Published in:	Earthquake engineering & structural dynamics Vol. 54; No. 4; pp. 1210 - 1228
Main authors:	Amirmojahedi, M., Mojoodi, A., Shojaee, Saeed, Hamzehei‐Javaran, Saleh
Format:	Journal Article
Language:	English
Published:	Bognor Regis, Wiley Subscription Services, Inc, 1 April 2025
Subjects:	Active damping; Algorithms; Civil engineering; Control systems design; Controllers; Earthquakes; Game theory; H-infinity control; H∞ control; Machine learning; Neural networks; nonlinear building; Nonlinear control; Nonlinear systems; online reinforcement learning; Policies; State feedback; Strategy; System dynamics; two‐player zero‐sum game; Vibration; Zero sum games
ISSN:	0098-8847, 1096-9845

Description
Abstract:	This paper proposes a model‐free, online off‐policy algorithm based on reinforcement learning (RL) for vibration attenuation of earthquake‐excited structures through the design of an optimal H∞ controller. The design relies on solving a two‐player zero‐sum game governed by a Hamilton–Jacobi–Isaacs (HJI) equation, which is extremely difficult, and often impossible, to solve for the value function and the associated optimal controller. The proposed strategy uses an actor‐critic‐disturbance structure to learn the solution of the HJI equation online and forward in time, without requiring any knowledge of the system dynamics. The control policy, the disturbance policy, and the value function are approximated by the actor, disturbance, and critic neural networks (NNs), respectively. Using the policy iteration technique, the NN weights are computed by the least‐squares (LS) method in each iteration. The convergence of the proposed algorithm is investigated through two distinct examples. Furthermore, the performance of this off‐policy RL strategy in reducing the response of a seismically excited nonlinear structure with an active mass damper (AMD) is studied for two cases of state feedback. The simulation results demonstrate the effectiveness of the proposed algorithm when applied to civil engineering structures.
DOI:	10.1002/eqe.4299
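
For orientation, the HJI equation named in the abstract commonly takes the following form in the nonlinear H∞ control literature. This is only a sketch under assumptions not stated in this record: input‐affine dynamics and a quadratic control penalty are assumed, and the symbols f, g, k, Q, R, and γ are illustrative and may differ from the paper's own notation.

```latex
% Assumed (hypothetical) setup: state x, control u, disturbance d,
% attenuation level \gamma > 0, weighting Q(x) \ge 0, R \succ 0.
\begin{align}
\dot{x} &= f(x) + g(x)\,u + k(x)\,d, \\
V(x_0) &= \int_0^\infty \left( Q(x) + u^\top R\,u - \gamma^2 \lVert d \rVert^2 \right) dt, \\
% Stationarity of the Hamiltonian in u and d gives the saddle-point policies:
u^* &= -\tfrac{1}{2}\, R^{-1} g(x)^\top \nabla V^*(x), \qquad
d^* = \tfrac{1}{2\gamma^2}\, k(x)^\top \nabla V^*(x), \\
% Substituting u^* and d^* back yields the HJI equation, with V^*(0) = 0:
0 &= Q(x) + \nabla V^{*\top} f(x)
   - \tfrac{1}{4}\, \nabla V^{*\top} g(x) R^{-1} g(x)^\top \nabla V^*
   + \tfrac{1}{4\gamma^2}\, \nabla V^{*\top} k(x) k(x)^\top \nabla V^*.
\end{align}
```

Under this formulation, the actor, disturbance, and critic networks described in the abstract would approximate u*, d*, and V*, respectively.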