Neural combinatorial optimization for multi-rendezvous mission design
| Published in: | Advances in space research Vol. 75; no. 10; pp. 7306 - 7326 |
|---|---|
| Main Authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 15.05.2025 |
| Subjects: | |
| ISSN: | 0273-1177 |
| Summary: | • Applied Neural Combinatorial Optimization to plan low-thrust multi-target rendezvous sequences for Active Debris Removal missions. • Created a statistical model of the Iridium 33 debris cloud to generate realistic scenarios for training and evaluating NCO policies. • Demonstrated effectiveness for missions with 10 targets, achieving a 32% optimality gap, but identified scalability challenges with larger target numbers. Optimal solutions to spacecraft routing problems are essential for space logistics activities such as Active Debris Removal (ADR), which addresses the growing threat of space debris. This research investigates the effectiveness of Neural Combinatorial Optimization (NCO) methods for the autonomous planning of low-thrust, multi-target ADR missions, an instance of the Space Traveling Salesman Problem (STSP). An autoregressive, attention-based routing policy was trained to solve 10-transfer ADR routing problems using REINFORCE, Advantage Actor-Critic, and Proximal Policy Optimization. A hyperparameter sensitivity analysis identified embedding dimension and the number of encoder layers as the critical factors influencing model performance, while an ablation study found the attention-based encoder to be the most critical architectural component of the policy. The trained policy was evaluated on 10-, 30-, and 50-transfer scenarios based on the Iridium 33 debris cloud, comparing its performance to a baseline provided by a novel ADR STSP routing heuristic (Dynamic RAAN Walk, DRW) and near-optimal benchmarks obtained via Heuristic Combinatorial Optimization (HCO). In missions with 10 transfers, the NCO policy achieved a mean optimality gap of 32%, outperforming DRW. However, performance degraded significantly in scenarios with 30 and 50 transfers, suggesting limited generalization to larger problems. A hyperparameter search further revealed that the performance of the NCO model considered in this work improves asymptotically with its size. Exposure to greater numbers of training scenarios did not yield significant performance gains. This work demonstrates that NCO methods can be effective for the autonomous planning of ADR missions with a limited number of targets, but face scalability and generalization challenges in more complex scenarios. |
| DOI: | 10.1016/j.asr.2025.03.050 |
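
The summary reports a mean optimality gap but this record does not define it. A minimal sketch follows, assuming the common definition: the relative excess cost of the policy's solution over a near-optimal benchmark solution, averaged over scenarios. The function names and the cost values are hypothetical and purely illustrative; they are not taken from the article, and the cost measure itself (for example, total mission cost per scenario) is also an assumption.

```python
from statistics import mean
from typing import Sequence


def optimality_gap(policy_cost: float, benchmark_cost: float) -> float:
    """Relative excess cost of a policy solution over a benchmark solution.

    Assumed definition: (policy - benchmark) / benchmark.
    """
    if benchmark_cost <= 0:
        raise ValueError("benchmark_cost must be positive")
    return (policy_cost - benchmark_cost) / benchmark_cost


def mean_optimality_gap(
    policy_costs: Sequence[float], benchmark_costs: Sequence[float]
) -> float:
    """Average the per-scenario optimality gap over paired scenarios."""
    if len(policy_costs) != len(benchmark_costs):
        raise ValueError("cost lists must have the same length")
    if not policy_costs:
        raise ValueError("at least one scenario is required")
    return mean(
        optimality_gap(p, b) for p, b in zip(policy_costs, benchmark_costs)
    )


if __name__ == "__main__":
    # Illustrative costs for two hypothetical scenarios (not data from the article).
    policy_costs = [12.0, 15.0]
    benchmark_costs = [10.0, 10.0]
    gap = mean_optimality_gap(policy_costs, benchmark_costs)
    print(f"mean optimality gap: {gap:.0%}")
```

Under this assumed definition, a gap of 0 means the policy matches the benchmark, and a mean gap of 32% would mean the policy's solutions cost on average about a third more than the benchmark solutions.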