Bibliographic Details
| Title: | Compensating Environmental Disturbances in Maritime Path Following Using Deep Reinforcement Learning |
| Authors: | Krautwig, Björn; Wans, Dominik; Temmen, Till; Brinkmann, Tobias; Lee, Sung-Yong; Kim, Daehyuk; Andert, Jakob |
| Source: | Journal of Marine Science & Engineering; Feb 2026, Vol. 14, Issue 4, p. 327, 22 pp. |
| Keywords: | REINFORCEMENT learning; AUTONOMOUS vehicles; NONLINEAR control theory; ECOLOGICAL disturbances; FEEDBACK control systems |
| Abstract: | One of the major challenges in autonomous path following for unmanned surface vehicles (USVs) is the impact of stochastic environmental forces—primarily wind, waves, and currents—which introduce nonlinearities that affect control models. Conventional strategies often rely on minimizing cross-track error, resulting in a reactive system that corrects heading only after a disturbance has displaced the vessel, potentially leading to oscillatory behavior and reduced precision. Deep Reinforcement Learning (DRL) has been applied successfully to a wide range of nonlinear control tasks, and it has been shown that robust solutions can be obtained that handle disturbances such as sensor noise or changes in system dynamics. This study investigates whether an agent, provided it can explicitly observe disturbances, can go beyond simply correcting deviations and autonomously learn the correlation between environmental conditions and the necessary counter-forces. We show that integrating the wind vector directly into the agent's observation space allows a Proximal Policy Optimization (PPO) policy to decouple the environmental cause from the kinematic effect, facilitating drift compensation before significant errors accumulate. By systematically comparing agents trained with randomized wind scenarios, we found that agents that can observe the wind achieve goal-reaching rates of up to 99.0% and reduce the spread of path deviation and velocity in our tested scenarios. Furthermore, our results quantify a distinct Pareto frontier between navigational velocity and tracking precision, demonstrating that explicit disturbance perception improves consistency, although robust implicit training already provides substantial resilience. These findings indicate that augmenting state observations with environmental data enhances the stability of learning-based controllers. [ABSTRACT FROM AUTHOR] |
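The abstract's central idea—augmenting the agent's observation space with the wind vector so the policy can compensate for drift before it accumulates—can be sketched in a few lines. This is an illustrative sketch only: the feature names, dimensions, and body-frame wind encoding below are assumptions, not details taken from the paper.

```python
import numpy as np

def build_observation(cross_track_error, heading_error, surge_speed,
                      wind_speed, wind_direction_rel):
    """Assemble an observation vector for a PPO path-following agent.

    A wind-blind baseline would see only the kinematic features; the
    wind-aware variant additionally observes the wind vector, letting
    the policy associate the disturbance (cause) with the drift it
    produces (effect). All features and scalings here are illustrative.
    """
    kinematic = np.array([cross_track_error, heading_error, surge_speed])
    # Encode wind as a 2-D vector in the vessel's body frame, avoiding
    # the wrap-around discontinuity of a raw angle input.
    wind = wind_speed * np.array([np.cos(wind_direction_rel),
                                  np.sin(wind_direction_rel)])
    return np.concatenate([kinematic, wind]).astype(np.float32)

# Example: 1.5 m cross-track error, slight heading error, 2 m/s surge,
# 8 m/s wind at 45 degrees relative bearing.
obs = build_observation(1.5, 0.1, 2.0, wind_speed=8.0,
                        wind_direction_rel=np.pi / 4)
```

In a training setup, the wind-blind and wind-aware agents would differ only in whether the last two components are included, which is what makes the paper's controlled comparison possible.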
Copyright of Journal of Marine Science & Engineering is the property of MDPI and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
| Database: | Complementary Index |