Efficiently Removing Sparsity for High-Throughput Stream Processing ; The International Conference on Field-Programmable Technology (FPT) 2023
| Title: | Efficiently Removing Sparsity for High-Throughput Stream Processing ; The International Conference on Field-Programmable Technology (FPT) 2023 |
|---|---|
| Authors: | Papaphilippou, Philippos |
| Publication year: | 2023 |
| Collection: | The University of Dublin, Trinity College: TARA (Trinity's Access to Research Archive) |
| Keywords: | Prefix scan, Interconnects, FPGA, Stream compaction, Aggregation, High-throughput computation, Analytics, Computer Architecture, Computer Engineering, Computer Science, Parallel Computer Architecture, Parallel Programming, Parallel Systems |
| Description: | Big data analytics and machine learning are increasingly targeted by FPGAs because of their substantial computing capability and internal parallelism. Different programming models distribute the workload across the FPGA's internals at different granularities. Although memory bandwidth has been increasing steadily, systems-on-chip face challenges in using it. One way system-on-chip architects exploit the increasing memory bandwidth is by widening the datapath, which is reflected at various points in the system, including wider vector instructions. On FPGAs, many analytics accelerators are memory-bound and would benefit from making the most of the available bandwidth. In this paper we present a scalable and highly efficient building block for high-throughput streaming accelerators, which removes sparsity on the fly without backpressure. |
| Publication type: | conference object |
| File description: | application/pdf |
| Language: | English |
| Relation: | Y; http://hdl.handle.net/2262/104146; http://people.tcd.ie/papaphip; 260091 |
| Availability: | http://hdl.handle.net/2262/104146 http://people.tcd.ie/papaphip |
| Rights: | Y ; openAccess |
| Document code: | edsbas.BD543074 |
| Database: | BASE |