SpMMPlu: A Compiler Plug-in with Sparse IR for Efficient Sparse Matrix Multiplication

Sparsity is becoming arguably the most critical dimension to explore for efficiency and scalability as deep learning models grow significantly larger. Particularly, pruning is a common method to reduce redundant computations in attention-based and convolution-based models. The induced sparse matrix...

Bibliographic details
Published in:	2023 60th ACM/IEEE Design Automation Conference (DAC), pp. 1-6
Main authors:	Yang, Tao; Zhou, Yiyuan; Tang, Qidong; Xu, Feng; Ma, Hui; Zhao, Jieru; Jiang, Li
Format:	Conference paper
Language:	English
Published:	IEEE, 9 July 2023
Subjects:	CNN; Computational modeling; Computer architecture; Deep learning; DNN compiler; Graphics processing units; Intermediate representation; Parallel processing; Plug-in; Processor scheduling; Scalability; Sparsity; Transformer