Efficient fuel-optimal multi-impulse orbital transfer via contrastive pre-trained reinforcement learning

Bibliographic details
Published in: Advances in Space Research, Vol. 75, No. 10, pp. 7377-7396
Main authors: Ren, He; Gui, Haichao; Zhong, Rui
Format: Journal Article
Language: English
Published: Elsevier B.V., 15 May 2025
ISSN:0273-1177
Description
Summary: Multi-impulse transfers between noncoplanar orbits are significant for on-orbit service spacecraft. This paper investigates the complex optimization problem of multi-impulse orbital transfer involving a chaser and a target. The chaser is subject to constraints on impulse magnitude and time, while the target may experience uncertain disturbances that cause it to deviate from its nominal orbit. The complexity of this problem imposes a significant computational burden on numerical methods, making it challenging for spacecraft to plan transfer trajectories autonomously in real time. To mitigate this burden, we propose a robust, fast, and autonomous algorithm that can rapidly plan transfer trajectories. Even if the terminal conditions change suddenly, the algorithm can quickly adjust the trajectory based on observed states without completely re-planning. The algorithm comprises an intelligent trajectory generator and a Lambert transfer algorithm. The intelligent generator is based on a reinforcement learning (RL) method called contrastive pre-trained reinforcement learning (CPRL), which emulates human learning habits to avoid the temporal credit assignment problem that arises with long time horizons and sparse rewards during the training phase. When the chaser reaches an admissible range, determined by the impulse constraints and the geometric relations of the conic curve, the algorithm adopts the Lambert transfer to complete the mission. Compared to traditional genetic and particle swarm algorithms, our method achieves a significant improvement in computational speed. Even with deviations, the average mission success rate remains at 96.8%. Numerical simulations confirm that our algorithm processes data quickly, can be deployed online, and is capable of handling various tasks in real time.
DOI:10.1016/j.asr.2025.02.049