PREMA: A Predictive Multi-Task Scheduling Algorithm For Preemptible Neural Processing Units
To amortize cost, cloud vendors providing DNN acceleration as a service to end-users employ consolidation and virtualization to share the underlying resources among multiple DNN service requests. This paper makes a case for a "preemptible" neural processing unit (NPU) and a "predictiv...
| Published in: | Proceedings - International Symposium on High-Performance Computer Architecture pp. 220 - 233 |
|---|---|
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.02.2020 |
| ISSN: | 2378-203X |