PREMA: A Predictive Multi-Task Scheduling Algorithm For Preemptible Neural Processing Units

To amortize cost, cloud vendors providing DNN acceleration as a service to end-users employ consolidation and virtualization to share the underlying resources among multiple DNN service requests. This paper makes a case for a "preemptible" neural processing unit (NPU) and a "predictiv...

Bibliographic Details
Published in: Proceedings - International Symposium on High-Performance Computer Architecture, pp. 220-233
Main Authors: Choi, Yujeong; Rhu, Minsoo
Format: Conference Proceeding
Language: English
Published: IEEE 01.02.2020
ISSN: 2378-203X