Efficient Buffer Overflow Detection on GPU

Bibliographic details
Published in:	IEEE Transactions on Parallel and Distributed Systems, Vol. 32, No. 5, pp. 1161-1177
Main authors:	Di, Bang; Sun, Jianhua; Chen, Hao; Li, Dong
Format:	Journal Article
Language:	English
Publication details:	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 1 May 2021
Subjects:	Arrays; Buffer overflows; Buffers; CUDA; Data structures; Garbage collection; GPGPU; Graphics processing units; Instruction sets; Kernel; Kernels; Memory management; Monitoring; Overflow; Performance evaluation; Resource management; Run time (computers); Runtime; Unified memory; Workload
ISSN:	1045-9219, 1558-2183

Description
Summary:	The rich thread-level parallelism of GPUs has motivated co-running multiple GPU kernels on a single GPU. However, when GPU kernels co-run, one kernel can exploit a buffer overflow to attack another kernel running on the same GPU. Very little work aims to detect buffer overflow on GPUs, and existing work has either large performance overhead or limited detection capability. In this article, we introduce GMODx, a runtime software system that detects GPU buffer overflow. GMODx performs always-on monitoring of allocated memory using a canary-based design. First, for fine-grained memory management, GMODx introduces a set of byte arrays to store buffer information for overflow detection. Techniques such as lock-free access to the byte arrays, delayed memory free, efficient memory reallocation, and garbage collection for the byte arrays are proposed to achieve high performance. Second, for coarse-grained memory management, GMODx uses unified memory to delegate the always-on monitoring to the CPU. To reduce performance overhead, we propose several techniques, including a customized list data structure and optimizations targeting unified memory. In micro-benchmarks, our experiments show that GMODx detects buffer overflow for fine-grained memory management without performance loss, and that it incurs small runtime overhead (4.2 percent on average and up to 9.7 percent) for coarse-grained memory management. For real workloads, we deploy GMODx on the TensorFlow framework, where it causes only 0.8 percent overhead on average (up to 1.8 percent).
DOI:	10.1109/TPDS.2020.3042965
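
Illustration
The summary describes a canary-based design backed by a table of buffer information. The general idea of a canary can be sketched in a few lines of host-side Python. This is only a toy model under stated assumptions, not GMODx's implementation and not GPU code: the class name GuardedHeap, the fixed 8-byte canary value, and the dictionary used as the metadata table are all hypothetical choices made for this example.

```python
# Assumed guard pattern; a fixed value keeps the example deterministic.
CANARY = bytes.fromhex("deadbeefdeadbeef")


class GuardedHeap:
    """Toy bump allocator that places a canary after every buffer."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.next_free = 0
        # Metadata table: buffer start offset -> buffer length.
        self.buffers = {}

    def alloc(self, length):
        start = self.next_free
        end = start + length
        if end + len(CANARY) > len(self.memory):
            raise MemoryError("toy heap exhausted")
        # Write the guard bytes immediately after the buffer.
        self.memory[end:end + len(CANARY)] = CANARY
        self.buffers[start] = length
        self.next_free = end + len(CANARY)
        return start

    def write(self, start, offset, data):
        # Deliberately unchecked against the buffer length, like a raw
        # pointer store; only the end of the heap itself is enforced.
        pos = start + offset
        if pos + len(data) > len(self.memory):
            raise IndexError("write past end of toy heap")
        self.memory[pos:pos + len(data)] = data

    def check(self):
        """Return the start offsets of buffers whose canary was altered."""
        corrupted = []
        for start, length in self.buffers.items():
            end = start + length
            if bytes(self.memory[end:end + len(CANARY)]) != CANARY:
                corrupted.append(start)
        return corrupted


def demo():
    heap = GuardedHeap(64)
    a = heap.alloc(16)
    b = heap.alloc(16)
    heap.write(a, 0, b"x" * 16)   # stays inside buffer a
    before = heap.check()
    heap.write(a, 0, b"A" * 20)   # runs 4 bytes past the end of a
    after = heap.check()
    return a, b, before, after
```

Calling check() periodically plays the role of the monitoring pass: an overflow is noticed after the fact, when the guard bytes no longer match. A write that jumps over the canary without touching it would go unnoticed in this toy model.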