Architectural Support for Task Dependence Management with Flexible Software Scheduling

Bibliographic details
Published in: Proceedings - International Symposium on High-Performance Computer Architecture, pp. 283-295
Main authors: Castillo, Emilio, Alvarez, Lluc, Moreto, Miquel, Casas, Marc, Vallejo, Enrique, Bosque, Jose Luis, Beivide, Ramon, Valero, Mateo
Format: Conference paper
Language: English
Published: IEEE, 1 February 2018
ISSN:2378-203X
Description
Summary: The growing complexity of multi-core architectures has motivated a wide range of software mechanisms to improve the orchestration of parallel executions. Task parallelism has become a very attractive approach thanks to its programmability, portability and potential for optimizations. However, with the expected increase in core counts, finer-grained tasking will be required to exploit the available parallelism, which will increase the overheads introduced by the runtime system. This work presents Task Dependence Manager (TDM), a hardware/software co-designed mechanism to mitigate runtime system overheads. TDM introduces a hardware unit, denoted Dependence Management Unit (DMU), and minimal ISA extensions that allow the runtime system to offload costly dependence tracking operations to the DMU and to still perform task scheduling in software. With lower hardware cost, TDM outperforms hardware-based solutions and enhances the flexibility, adaptability and composability of the system. Results show that TDM improves performance by 12.3% and reduces EDP by 20.4% on average with respect to a software runtime system. Compared to a runtime system fully implemented in hardware, TDM achieves an average speedup of 4.2% with 7.3x less area requirements and significant EDP reductions. In addition, five different software schedulers are evaluated with TDM, illustrating its flexibility and performance gains.
DOI:10.1109/HPCA.2018.00033