An efficient parallel-processing method for transposing large matrices in place

We have developed an efficient algorithm for transposing large matrices in place. The algorithm is efficient because data are accessed either sequentially in blocks or randomly within blocks small enough to fit in cache, and because the same indexing calculations are shared among identical procedure...

Bibliographic Details
Published in:	IEEE transactions on image processing Vol. 8; no. 9; pp. 1265 - 1275
Main Author:	Portnoff, M.R.
Format:	Journal Article
Language:	English
Published:	New York, NY: IEEE (Institute of Electrical and Electronics Engineers), 01.09.1999
Subjects:	Algorithms; Applied sciences; Computer architecture; Concurrent computing; Displays; Exact sciences and technology; Gates; Image processing; Indexing; Information, signal and communications theory; Mathematical analysis; Matrices; Matrix converters; Matrix methods; Microprocessors; Parallel processing; Random access memory; Read-write memory; Remote sensing; Signal processing; Telecommunications and information theory; Fast algorithm; Imaging; Synthetic aperture radar
ISSN:	1057-7149

Abstract	We have developed an efficient algorithm for transposing large matrices in place. The algorithm is efficient because data are accessed either sequentially in blocks or randomly within blocks small enough to fit in cache, and because the same indexing calculations are shared among identical procedures operating on independent subsets of the data. This inherent parallelism makes the method well suited for a multiprocessor computing environment. The algorithm is easy to implement because the same two procedures are applied to the data in various groupings to carry out the complete transpose operation. Using only a single processor, we have demonstrated nearly an order-of-magnitude increase in speed over the previously published algorithm by Cate and Twigg (1977) for transposing a large rectangular matrix in place. With multiple processors operating in parallel, the processing speed increases almost linearly with the number of processors. A simplified version of the algorithm for square matrices is presented, as well as an extension for matrices large enough to require virtual memory.
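The abstract's central idea, random access confined to blocks small enough to fit in cache, can be illustrated for the square-matrix case. The following is a minimal sketch of a generic cache-blocked in-place transpose, not the two procedures of the paper, which this record does not describe. The function name `transpose_square_in_place`, the row-major flat-list layout, and the default tile size of 32 are all assumptions made for illustration.

```python
def transpose_square_in_place(a, n, block=32):
    """Transpose an n x n matrix stored row-major in the flat list `a`, in place.

    Generic blocked transpose (illustrative only): the matrix is walked
    tile by tile so that each swap touches at most two small tiles.
    """
    for bi in range(0, n, block):
        i_end = min(bi + block, n)

        # Diagonal tile: swap each element above the diagonal with its mirror.
        for i in range(bi, i_end):
            for j in range(i + 1, i_end):
                a[i * n + j], a[j * n + i] = a[j * n + i], a[i * n + j]

        # Off-diagonal tiles to the right: swap tile (bi, bj) with tile (bj, bi).
        for bj in range(i_end, n, block):
            j_end = min(bj + block, n)
            for i in range(bi, i_end):
                for j in range(bj, j_end):
                    a[i * n + j], a[j * n + i] = a[j * n + i], a[i * n + j]


# Usage: a 5 x 5 matrix holding 0..24, transposed with 2 x 2 tiles.
m = list(range(25))
transpose_square_in_place(m, 5, block=2)
```

Each pair of mirror tiles is disjoint from every other pair, so in principle the tile pairs could be handed to separate workers; that matches the abstract's remark about identical procedures operating on independent subsets of the data, although the scheduling used in the paper may differ.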
Author	Portnoff, M.R.
Author_xml	– sequence: 1 givenname: M.R. surname: Portnoff fullname: Portnoff, M.R. organization: Lawrence Livermore Nat. Lab., CA, USA
BackLink	http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=1923250$$DView record in Pascal Francis https://www.ncbi.nlm.nih.gov/pubmed/18267543$$D View this record in MEDLINE/PubMed
CODEN	IIPRE4
CitedBy_id	crossref_primary_10_1109_TSP_2013_2245656 crossref_primary_10_1007_s11265_012_0721_3 crossref_primary_10_1109_LSP_2009_2016836
Cites_doi	10.1007/978-1-4613-1333-5 10.1145/355719.355729 10.1109/83.210874
ContentType	Journal Article
Copyright	1999 INIST-CNRS
Copyright_xml	– notice: 1999 INIST-CNRS
DOI	10.1109/83.784438
DatabaseName	IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Pascal-Francis PubMed Computer and Information Systems Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional MEDLINE - Academic Electronics & Communications Abstracts ANTE: Abstracts in New Technology & Engineering Engineering Research Database
Discipline	Applied Sciences Engineering
EndPage	1275
ExternalDocumentID	18267543 1923250 10_1109_83_784438 784438
Genre	Journal Article
ISICitedReferencesCount	11
ISSN	1057-7149
IsPeerReviewed	true
IsScholarly	true
Issue	9
Keywords	Parallel processing; Image processing; Fast algorithm; Imaging; Synthetic aperture radar
Language	English
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html CC BY 4.0
PMID	18267543
PageCount	11
PublicationCentury	1900
PublicationDate	1999-09-01
PublicationDateYYYYMMDD	1999-09-01
PublicationDate_xml	– month: 09 year: 1999 text: 1999-09-01 day: 01
PublicationDecade	1990
PublicationPlace	New York, NY
PublicationPlace_xml	– name: New York, NY – name: United States
PublicationTitle	IEEE transactions on image processing
PublicationTitleAbbrev	TIP
PublicationTitleAlternate	IEEE Trans Image Process
PublicationYear	1999
Publisher	IEEE Institute of Electrical and Electronics Engineers
Publisher_xml	– name: IEEE – name: Institute of Electrical and Electronics Engineers
References	[1] doi: 10.1145/355719.355729; [2] Claerbout (1985), Imaging the Earth's Interior; [3] Curlander (1991), Synthetic Aperture Radar; [4] doi: 10.1007/978-1-4613-1333-5; [5] Birkhoff (1965), A Survey of Modern Algebra; [6] Herstein (1964), Topics in Algebra; [7] doi: 10.1109/83.210874; [8] Robbins (1996), Practical UNIX Programming: A Guide to Concurrency, Communication, and Multithreading; [9] Lewis (1996), Threads Primer; [10] Kleiman (1996), Programming with Threads
StartPage	1265
Title	An efficient parallel-processing method for transposing large matrices in place
URI	https://ieeexplore.ieee.org/document/784438 https://www.ncbi.nlm.nih.gov/pubmed/18267543 https://www.proquest.com/docview/26938940 https://www.proquest.com/docview/28405577 https://www.proquest.com/docview/734188188 https://www.proquest.com/docview/963888705
Volume	8