Parallel matrix transpose algorithms on distributed memory concurrent computers

This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. We assume that the matrix is distributed over a P × Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide app...

Detailed bibliography
Published in: Proceedings of the Scalable Parallel Libraries Conference, October 6-8, 1993, Mississippi State, Mississippi, pp. 245-252
Main authors: Choi, Jaeyoung; Dongarra, J.J.; Walker, D.W.
Format: Conference paper
Language: English
Published: IEEE Comput. Soc. Press 1993
ISBN:0818649801, 9780818649806
Abstract This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. We assume that the matrix is distributed over a P × Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide applicability. The algorithms make use of non-blocking, point-to-point communication between processors. The use of nonblocking communication allows a processor to overlap the messages that it sends to different processors, thereby avoiding unnecessary synchronization. Combined with the matrix multiplication routine, C = A·B, the algorithms are used to compute parallel multiplications of transposed matrices, C = A^T·B^T, in the PUMMA package. Details of the parallel implementation of the algorithms are given, and results are presented for runs on the Intel Touchstone Delta computer.
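The data movement implied by the abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes the usual block scattered (block cyclic) rule that block (I, J) lives on processor (I mod P, J mod Q), and it assumes the transposed matrix is stored on the same P × Q template with the same block size. The names `owner` and `transpose_traffic` are hypothetical helpers introduced only for this illustration. The sketch only lists which blocks each processor would send to which other processor; the nonblocking sends themselves are not modeled.

```python
from collections import defaultdict


def owner(block_row, block_col, P, Q):
    """Processor (p, q) holding a block under the assumed block scattered rule."""
    return (block_row % P, block_col % Q)


def transpose_traffic(Mb, Nb, P, Q):
    """Group the blocks of an Mb x Nb block matrix by (source, destination) processor.

    Block (I, J) of A becomes block (J, I) of A^T, so it moves from
    owner(I, J) to owner(J, I). Assumes A^T uses the same template.
    """
    traffic = defaultdict(list)
    for I in range(Mb):
        for J in range(Nb):
            src = owner(I, J, P, Q)
            dst = owner(J, I, P, Q)
            traffic[(src, dst)].append((I, J))
    return dict(traffic)


if __name__ == "__main__":
    # 4 x 6 blocks on a 2 x 3 processor template (arbitrary example sizes).
    for (src, dst), blocks in sorted(transpose_traffic(4, 6, 2, 3).items()):
        print(src, "->", dst, blocks)
```

Each (source, destination) pair in the result corresponds to one message that a point-to-point scheme could post; because a processor generally has several destinations, posting those sends without blocking lets them overlap, which is the benefit the abstract attributes to nonblocking communication.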
Author Jaeyoung Choi
Dongarra, J.J.
Walker, D.W.
Author_xml – sequence: 1
  givenname: Jaeyoung
  surname: Choi
  fullname: Jaeyoung Choi
  organization: Math. Sci. Sect., Oak Ridge Nat. Lab., TN, USA
– sequence: 2
  givenname: J.J.
  surname: Dongarra
  fullname: Dongarra, J.J.
  organization: Math. Sci. Sect., Oak Ridge Nat. Lab., TN, USA
– sequence: 3
  givenname: D.W.
  surname: Walker
  fullname: Walker, D.W.
  organization: Math. Sci. Sect., Oak Ridge Nat. Lab., TN, USA
ContentType Conference Proceeding
DOI 10.1109/SPLC.1993.365559
Discipline Computer Science
EndPage 252
ISBN 0818649801
9780818649806
IsPeerReviewed false
IsScholarly false
Language English
PageCount 8
PublicationCentury 1900
PublicationDate 19930000
PublicationDateYYYYMMDD 1993-01-01
PublicationDate_xml – year: 1993
  text: 19930000
PublicationDecade 1990
PublicationTitle Proceedings of the Scalable Parallel Libraries Conference, October 6-8, 1993, Mississippi State, Mississippi
PublicationTitleAbbrev SPLC
PublicationYear 1993
Publisher IEEE Comput. Soc. Press
Publisher_xml – name: IEEE Comput. Soc. Press
StartPage 245
SubjectTerms Application software
Computer architecture
Concurrent computing
Distributed computing
Laboratories
Lifting equipment
Linear algebra
Matrix decomposition
Packaging
Scattering
Title Parallel matrix transpose algorithms on distributed memory concurrent computers
URI https://ieeexplore.ieee.org/document/365559