REMU: Memory-aware Radiation Emulation via Dual Addressing for In-orbit Deep Learning System

Bibliographic details
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7
Main authors: Xu, Longnv; Wang, Meiqi; Qiu, Han; Liu, Jun; Li, Yuanjie; Li, Hewu
Format: Conference proceeding
Language: English
Published: IEEE, 22 June 2025
Online access: Full text
Description
Summary: The deployment of commercial-off-the-shelf (COTS) GPUs in space has emerged as a promising approach for supporting in-orbit deep neural network (DNN) inference. However, unlike in terrestrial environments, understanding the impact of space radiation on COTS GPU-enabled DNNs is critical. This is challenging because existing methods, such as real-world radiation testing and software emulation, fail to link radiation-induced memory errors to runtime DNN behaviors. In this paper, we propose REMU, a memory-aware Radiation EMUlator to fill this gap. REMU introduces a dual addressing mechanism across virtual, physical, and DRAM memory spaces, enabling precise mapping and efficient injection of radiation-induced errors from DRAM into runtime DNN inference. Extensive evaluations across 10 well-known DNN models and 2 typical in-orbit computing tasks demonstrate the effectiveness of REMU, providing valuable insights into the resilience of runtime DNN inference to space radiation.
DOI: 10.1109/DAC63849.2025.11132935