Parallel matrix transpose algorithms on distributed memory concurrent computers

This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. We assume that the matrix is distributed over a P × Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide app...

Bibliographic details
Published in:	Proceedings of the Scalable Parallel Libraries Conference, October 6-8, 1993, Mississippi State, Mississippi, pp. 245-252
Main authors:	Jaeyoung Choi; Dongarra, J.J.; Walker, D.W.
Format:	Conference paper
Language:	English
Published:	IEEE Comput. Soc. Press 1993
Subjects:	Application software; Computer architecture; Concurrent computing; Distributed computing; Laboratories; Lifting equipment; Linear algebra; Matrix decomposition; Packaging; Scattering
ISBN:	0818649801, 9780818649806

Abstract	This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. We assume that the matrix is distributed over a P × Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide applicability. The algorithms make use of non-blocking, point-to-point communication between processors. The use of non-blocking communication allows a processor to overlap the messages that it sends to different processors, thereby avoiding unnecessary synchronization. Combined with the matrix multiplication routine, C = A · B, the algorithms are used to compute parallel multiplications of transposed matrices, C = A^T · B^T, in the PUMMA package. Details of the parallel implementation of the algorithms are given, and results are presented for runs on the Intel Touchstone Delta computer.
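Illustration (not part of the catalogued record): a minimal sketch of how a block scattered distribution determines which processes exchange blocks during a transpose. It assumes the usual block scattered (block-cyclic) rule, in which block (I, J) lives on process (I mod P, J mod Q), and it assumes that the transposed matrix uses the same P × Q template and block size. The function names `owner` and `transpose_messages` are hypothetical, and the code only plans the messages; it does not reproduce the paper's algorithms or its non-blocking communication schedule.

```python
from collections import defaultdict


def owner(block_row, block_col, P, Q):
    """Process coordinates that own a block under a block scattered layout.

    Assumption: block (I, J) is stored on process (I mod P, J mod Q).
    """
    return (block_row % P, block_col % Q)


def transpose_messages(Mb, Nb, P, Q):
    """Plan the block traffic for transposing an Mb x Nb grid of blocks.

    Block (I, J) of A becomes block (J, I) of A^T, so it moves from
    owner(I, J) to owner(J, I). Blocks are grouped by (source, destination)
    so that each pair of processes could exchange a single message.
    """
    sends = defaultdict(list)
    for I in range(Mb):
        for J in range(Nb):
            src = owner(I, J, P, Q)
            dst = owner(J, I, P, Q)
            sends[(src, dst)].append((I, J))
    return dict(sends)


# Example: a 4 x 4 grid of blocks on a 2 x 3 process template.
msgs = transpose_messages(4, 4, 2, 3)
for (src, dst), blocks in sorted(msgs.items()):
    print(src, "->", dst, blocks)
```

Each (source, destination) group in this plan is a candidate for one point-to-point message. In a real implementation a process could post all of its sends without waiting for each to complete, which is the kind of overlap the abstract attributes to non-blocking communication.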
Author	Jaeyoung Choi; Dongarra, J.J.; Walker, D.W.
Author affiliation	Math. Sci. Sect., Oak Ridge Nat. Lab., TN, USA (all three authors)
DOI	10.1109/SPLC.1993.365559
Discipline	Computer Science
PageCount	8
URI	https://ieeexplore.ieee.org/document/365559