Dataflow Mirroring: Architectural Support for Highly Efficient Fine-Grained Spatial Multitasking on Systolic-Array NPUs

Bibliographic Details
Published in:	2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 247-252
Main Authors:	Lee, Jounghoo; Choi, Jinwoo; Kim, Jaeyeon; Lee, Jinho; Kim, Youngsok
Format:	Conference paper
Language:	English
Published:	IEEE, 5 December 2021
Subjects:	Artificial neural networks; Design automation; Hardware; Multitasking; Prototypes; Resource management; Systolic arrays

Description
Summary:	We present dataflow mirroring, architectural support for low-overhead, fine-grained systolic array allocation that overcomes the limitations of prior coarse-grained spatial-multitasking Neural Processing Unit (NPU) architectures. The key idea of dataflow mirroring is to reverse the dataflows of co-located Neural Networks (NNs) in the horizontal and/or vertical directions, which allows allocation boundaries to be set between any adjacent rows and columns of a systolic array and supports up to four-way spatial multitasking. Our detailed experiments, using MLPerf NNs and an NPU prototype that extends Google's TPU with dataflow mirroring, show that dataflow mirroring can improve multitasking performance by up to 46.4%.
DOI:	10.1109/DAC18074.2021.9586312
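
Illustration:	The allocation scheme described in the summary can be sketched in a few lines of Python. This is only one plausible reading of the abstract, not the authors' implementation: the names `Partition` and `partition_array`, the corner-anchored layout, the direction labels, and the 128 x 128 array size are all assumptions introduced here for illustration. The sketch places one row boundary and one column boundary anywhere in the array and gives each resulting partition a horizontally and/or vertically mirrored dataflow, yielding at most four co-located partitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """One NN's share of the systolic array (hypothetical model)."""
    name: str
    rows: range        # processing-element rows owned by this partition
    cols: range        # processing-element columns owned by this partition
    horizontal: str    # assumed horizontal flow direction
    vertical: str      # assumed vertical flow direction


def partition_array(n_rows, n_cols, row_cut, col_cut):
    """Split an n_rows x n_cols array at arbitrary row/column boundaries.

    Assumption: each partition is anchored at its own corner of the array
    and its dataflows start from the array edges at that corner, so the
    top-left partition keeps the baseline directions and the others are
    mirrored horizontally, vertically, or both.
    """
    if not (0 <= row_cut <= n_rows and 0 <= col_cut <= n_cols):
        raise ValueError("boundaries must lie inside the array")

    candidates = [
        ("top_left", range(0, row_cut), range(0, col_cut),
         "left_to_right", "top_to_bottom"),
        ("top_right", range(0, row_cut), range(col_cut, n_cols),
         "right_to_left", "top_to_bottom"),
        ("bottom_left", range(row_cut, n_rows), range(0, col_cut),
         "left_to_right", "bottom_to_top"),
        ("bottom_right", range(row_cut, n_rows), range(col_cut, n_cols),
         "right_to_left", "bottom_to_top"),
    ]
    # Drop empty partitions so that a boundary on the array edge
    # degenerates to two-way (or single-tenant) allocation.
    return [Partition(*c) for c in candidates
            if len(c[1]) > 0 and len(c[2]) > 0]


if __name__ == "__main__":
    # Hypothetical 128 x 128 array with boundaries after row 48 and column 100.
    for p in partition_array(128, 128, row_cut=48, col_cut=100):
        print(p.name, len(p.rows), "x", len(p.cols), p.horizontal, p.vertical)
```

Because the boundaries are plain row and column indices, any split point is expressible in this model, which mirrors the abstract's claim that boundaries can sit between any adjacent rows and columns. How the actual hardware reverses the flows is described in the paper itself and is not modeled here.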