Modified Triplet-Average Deep Deterministic Policy Gradient for interpretable neuro-fuzzy deep reinforcement learning

Bibliographic Details
Published in:	Journal of the Franklin Institute, Volume 362, Issue 7
Main Authors:	Nguyen, Tuan-Linh; Thin, Nguyen Van; Lee, Sangmoon
Format:	Journal Article
Language:	English
Published:	Elsevier Inc, 1 May 2025
Subjects:	Interpretable neuro-fuzzy controller; Inverted pendulum; Reinforcement learning; Twin-delay; Two-phase training
ISSN:	0016-0032

Description
Summary:	To find the control rules of a nonlinear system from learned data, the policy learned in Deep Reinforcement Learning (DRL) must be interpreted. This paper presents a novel interpretable Neuro-Fuzzy (NF) inference system based on the Modified Triplet-Average Deep Deterministic Policy Gradient (MTADD) reinforcement learning algorithm with a two-phase training method. The first phase explores and initializes the rules and premise parameters of the T-S fuzzy system. The second phase is the deep reinforcement learning of the NF policy network using the MTADD algorithm. The experimental results demonstrate that the proposed approach decreases the training time, enhances the control performance, and increases the interpretability of NF DRL.
• A novel Neuro-Fuzzy (NF) inference system is presented with a modified deep RL algorithm.
• The new two-phase method for the neuro-fuzzy RL model reduces the training time.
• Experimental results verify the effectiveness of the proposed approach.
DOI:	10.1016/j.jfranklin.2025.107653
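
The summary refers to a T-S (Takagi-Sugeno) fuzzy system as the interpretable policy, but this record does not give its structure. As a rough illustration only, the sketch below shows generic first-order T-S inference with Gaussian membership functions: each rule has premise parameters (centers and widths) and a linear consequent, and the action is the firing-strength-weighted average of the rule outputs. The function names, the two-rule table, and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def gaussian(x, center, sigma):
    """Membership degree of x in a Gaussian fuzzy set."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def ts_fuzzy_action(state, rules):
    """Generic first-order Takagi-Sugeno inference (illustrative only).

    Each rule is (centers, sigmas, coeffs, bias): premise parameters per
    state dimension, and a linear consequent u_i = coeffs . state + bias.
    """
    weighted_sum = 0.0
    total_strength = 0.0
    for centers, sigmas, coeffs, bias in rules:
        # Firing strength: product of membership degrees over all dimensions.
        strength = 1.0
        for x, c, s in zip(state, centers, sigmas):
            strength *= gaussian(x, c, s)
        consequent = sum(k * x for k, x in zip(coeffs, state)) + bias
        weighted_sum += strength * consequent
        total_strength += strength
    # Normalized weighted average of the rule outputs.
    return weighted_sum / max(total_strength, 1e-12)


# Hypothetical rule table for a state (angle, angular velocity).
RULES = [
    ((-0.5, 0.0), (0.5, 1.0), (0.0, 0.0), -1.0),  # IF angle is negative THEN u = -1
    ((0.5, 0.0), (0.5, 1.0), (0.0, 0.0), 1.0),    # IF angle is positive THEN u = +1
]

if __name__ == "__main__":
    print(ts_fuzzy_action((0.0, 0.0), RULES))
    print(ts_fuzzy_action((0.5, 0.0), RULES))
```

Because every rule reads as an IF-THEN statement over named fuzzy sets, a table like `RULES` can be inspected directly, which is the sense of interpretability the summary appeals to. In a neuro-fuzzy setting the centers, widths, and consequent coefficients would be trainable parameters; how the paper initializes and trains them is not described in this record.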