Implementing sparse matrix-vector multiplication on throughput-oriented processors

Sparse matrix-vector multiplication (SpMV) is of singular importance in sparse linear algebra. In contrast to the uniform regularity of dense linear algebra, sparse operations encounter a broad spectrum of matrices ranging from the regular to the highly irregular. Harnessing the tremendous potential...

Bibliographic Details
Published in: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pp. 1-11
Main authors: Bell, Nathan; Garland, Michael
Format: Conference paper
Language: English
Published: New York, NY, USA: ACM, 14 November 2009
Series: ACM Conferences
Subjects:
ISBN: 1605587443, 9781605587448
ISSN: 2167-4329
Online access: Get full text
Abstract: Sparse matrix-vector multiplication (SpMV) is of singular importance in sparse linear algebra. In contrast to the uniform regularity of dense linear algebra, sparse operations encounter a broad spectrum of matrices ranging from the regular to the highly irregular. Harnessing the tremendous potential of throughput-oriented processors for sparse operations requires that we expose substantial fine-grained parallelism and impose sufficient regularity on execution paths and memory access patterns. We explore SpMV methods that are well-suited to throughput-oriented architectures like the GPU and which exploit several common sparsity classes. The techniques we propose are efficient, successfully utilizing large percentages of peak bandwidth. Furthermore, they deliver excellent total throughput, averaging 16 GFLOP/s and 10 GFLOP/s in double precision for structured grid and unstructured mesh matrices, respectively, on a GeForce GTX 285. This is roughly 2.8 times the throughput previously achieved on Cell BE and more than 10 times that of a quad-core Intel Clovertown system.
Author affiliations: Bell, Nathan (NVIDIA Research); Garland, Michael (NVIDIA Research)
CODEN: IEEPAD
Content type: Conference proceeding
Copyright: 2009 ACM
DOI: 10.1145/1654059.1654078
Discipline: Computer Science
Genre: Original research
License: Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
Meeting name: SC '09: International Conference for High Performance Computing, Networking, Storage and Analysis
Subject terms:
Bandwidth
Computer systems organization -- Architectures -- Parallel architectures -- Multiple instruction, multiple data
Computer systems organization -- Dependable and fault-tolerant systems and networks
Computing methodologies -- Symbolic and algebraic manipulation -- Symbolic and algebraic algorithms -- Linear algebra algorithms
General and reference -- Cross-computing tools and techniques -- Performance
Graphics processing units
Hardware
Instruction sets
Kernel
Mathematics of computing -- Mathematical analysis -- Numerical analysis -- Computations on matrices
Memory management
Networks -- Network performance evaluation
Optimization
Sparse matrices
Throughput
Vectors
URI: https://ieeexplore.ieee.org/document/6375570