Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication

Multiplication of a sparse matrix with a dense matrix is a building block of an increasing number of applications in many areas such as machine learning and graph algorithms. However, most previous work on parallel matrix multiplication considered only both dense or both sparse matrix operands. This...

Bibliographic Details
Published in: Proceedings - IEEE International Parallel and Distributed Processing Symposium, pp. 842-853
Main Authors: Koanantakool, Penporn; Azad, Ariful; Buluc, Aydin; Morozov, Dmitriy; Oh, Sang-Yun; Oliker, Leonid; Yelick, Katherine
Format: Conference Proceeding
Language: English
Published: IEEE 01.05.2016
ISSN: 1530-2075
Abstract: Multiplication of a sparse matrix with a dense matrix is a building block of an increasing number of applications in many areas such as machine learning and graph algorithms. However, most previous work on parallel matrix multiplication considered only cases where both operands are dense or both are sparse. This paper analyzes the communication lower bounds and compares the communication costs of various classic parallel algorithms in the context of sparse-dense matrix-matrix multiplication. We also present new communication-avoiding algorithms based on a 1D decomposition, called 1.5D, which - while suboptimal in dense-dense and sparse-sparse cases - outperform the 2D and 3D variants both theoretically and in practice for sparse-dense multiplication. Our analysis separates one-time costs from per-iteration costs in an iterative machine learning context. Experiments demonstrate speedups up to 100x over a baseline 3D SUMMA implementation and show parallel scaling over 10 thousand cores.
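Illustrative note: the operation the abstract refers to, a sparse matrix times a dense matrix, can be sketched in a few lines. The Python below is a minimal serial sketch and is not the paper's 1.5D algorithm. It assumes the sparse operand is stored in CSR form and simulates a plain 1D row-block decomposition in which each hypothetical "process" owns a contiguous block of rows. All names (spmm_rows, spmm_1d, n_parts) are made up for this example.

```python
from typing import List, Tuple

# CSR triple (row_ptr, col_idx, values); this layout is an assumption of the sketch.
CSR = Tuple[List[int], List[int], List[float]]


def spmm_rows(a: CSR, b: List[List[float]], row_start: int, row_end: int) -> List[List[float]]:
    """Multiply rows [row_start, row_end) of sparse A (CSR) by dense B."""
    row_ptr, col_idx, values = a
    n_cols = len(b[0])
    out = []
    for i in range(row_start, row_end):
        acc = [0.0] * n_cols
        # Only the stored nonzeros of row i contribute to output row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a_ik = values[k]
            b_row = b[col_idx[k]]
            for j in range(n_cols):
                acc[j] += a_ik * b_row[j]
        out.append(acc)
    return out


def spmm_1d(a: CSR, b: List[List[float]], n_parts: int) -> List[List[float]]:
    """Simulate a 1D row-block decomposition of A over n_parts hypothetical processes."""
    n_rows = len(a[0]) - 1
    block = (n_rows + n_parts - 1) // n_parts  # ceiling division
    result = []
    for p in range(n_parts):
        start = min(p * block, n_rows)
        end = min(start + block, n_rows)
        # In a real distributed run, process p would have to obtain the rows of B
        # selected by the nonzero columns of its block; here B is simply shared.
        result.extend(spmm_rows(a, b, start, end))
    return result


# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]
A = ([0, 2, 3, 5], [0, 2, 2, 0, 1], [1.0, 2.0, 3.0, 4.0, 5.0])
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
C = spmm_1d(A, B, n_parts=2)
print(C)  # [[11.0, 14.0], [15.0, 18.0], [19.0, 28.0]]
```

The result does not depend on n_parts. What changes with the decomposition in an actual parallel setting is how much of each operand has to be moved between processes, which is the cost the paper analyzes.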
Author affiliation (all seven authors): Comput. Res. Div., Lawrence Berkeley Nat. Lab., Berkeley, CA, USA
CODEN: IEEPAD
DOI: 10.1109/IPDPS.2016.117
Discipline: Computer Science
EISBN: 9781509021406; 150902140X
Open Access Link: https://www.osti.gov/biblio/1769300
Subject Terms: Algorithm design and analysis; communication-avoiding algorithms; Covariance matrices; linear algebra; Machine learning algorithms; Parallel algorithms; Partitioning algorithms; Sparse matrices; sparse-dense matrix-matrix multiplication; Three-dimensional displays
URI: https://ieeexplore.ieee.org/document/7516081