REMU: Memory-aware Radiation Emulation via Dual Addressing for In-orbit Deep Learning System

Bibliographic Details
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7
Main Authors: Xu, Longnv; Wang, Meiqi; Qiu, Han; Liu, Jun; Li, Yuanjie; Li, Hewu
Format: Conference paper
Language: English
Published: IEEE, 22 June 2025
Description
Summary: The deployment of commercial-off-the-shelf (COTS) GPUs in space has emerged as a promising approach for supporting in-orbit deep neural network (DNN) inference. However, unlike in terrestrial environments, understanding the impact of space radiation on COTS GPU-enabled DNNs is critical. This is challenging because existing methods, such as real-world radiation testing and software emulation, fail to link radiation-induced memory errors to runtime DNN behaviors. In this paper, we propose REMU, a memory-aware Radiation EMUlator to fill this gap. REMU introduces a dual addressing mechanism across virtual, physical, and DRAM memory spaces, enabling precise mapping and efficient injection of radiation-induced errors from DRAM to runtime DNN inference. Extensive evaluations across 10 well-known DNN models and 2 typical in-orbit computing tasks demonstrate the effectiveness of REMU, providing valuable insights for understanding the resilience of runtime DNN inference to space radiation.
DOI: 10.1109/DAC63849.2025.11132935