An efficient parallel-processing method for transposing large matrices in place

Bibliographic Details
Published in: IEEE Transactions on Image Processing, Vol. 8, no. 9, pp. 1265-1275
Main Author: Portnoff, M.R.
Format: Journal Article
Language: English
Published: New York, NY: IEEE (Institute of Electrical and Electronics Engineers), 01.09.1999
Subjects:
ISSN: 1057-7149
Abstract We have developed an efficient algorithm for transposing large matrices in place. The algorithm is efficient because data are accessed either sequentially in blocks or randomly within blocks small enough to fit in cache, and because the same indexing calculations are shared among identical procedures operating on independent subsets of the data. This inherent parallelism makes the method well suited for a multiprocessor computing environment. The algorithm is easy to implement because the same two procedures are applied to the data in various groupings to carry out the complete transpose operation. Using only a single processor, we have demonstrated nearly an order of magnitude increase in speed over the previously published algorithm by Cate and Twigg (1977) for transposing a large rectangular matrix in place. With multiple processors operating in parallel, the processing speed increases almost linearly with the number of processors. A simplified version of the algorithm for square matrices is presented as well as an extension for matrices large enough to require virtual memory.
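The abstract contrasts cache-friendly blocked access with the scattered access of earlier in-place methods. The following Python sketch illustrates that contrast in general terms only; it is not the algorithm from the paper, whose two procedures are not described in this record. The function names, the flat row-major list layout, and the `block` parameter are all assumptions made for illustration. The first function transposes a square matrix tile by tile; the second is a textbook cycle-following transpose for rectangular matrices, shown as an example of the random-access pattern (it also uses a `visited` list, so it is not strictly in place).

```python
def transpose_square_blocked(a, n, block=32):
    """Transpose an n x n row-major matrix held in flat list `a`, in place.

    Illustrative sketch: elements are swapped tile by tile so that each
    pair of tiles being exchanged can stay in cache. `block` is a
    hypothetical tuning parameter, not a value taken from the paper.
    """
    for bi in range(0, n, block):
        i_end = min(bi + block, n)
        for bj in range(bi, n, block):
            j_end = min(bj + block, n)
            # Each (bi, bj) tile pair touches a disjoint set of elements,
            # so the pairs could in principle be given to separate workers.
            for i in range(bi, i_end):
                # On a diagonal tile, swap only the strict upper triangle.
                j_start = i + 1 if bi == bj else bj
                for j in range(j_start, j_end):
                    p, q = i * n + j, j * n + i
                    a[p], a[q] = a[q], a[p]


def transpose_rect_cycles(a, rows, cols):
    """Transpose a rows x cols row-major matrix in flat list `a` by
    following permutation cycles.

    Generic textbook approach, shown only to illustrate scattered memory
    access. The `visited` list costs one flag per element, which a true
    in-place method would avoid.
    """
    size = rows * cols
    visited = [False] * size
    # The first and last elements never move, so they are skipped.
    for start in range(1, size - 1):
        if visited[start]:
            continue
        k = start
        val = a[start]
        while True:
            # The element at (r, c) = (k // cols, k % cols) moves to
            # index c * rows + r in the transposed (cols x rows) layout.
            dest = (k % cols) * rows + k // cols
            a[dest], val = val, a[dest]
            visited[dest] = True
            k = dest
            if k == start:
                break
```

As a usage example under the same assumptions, calling `transpose_rect_cycles(m, 2, 3)` on `m = [1, 2, 3, 4, 5, 6]` leaves `m` as `[1, 4, 2, 5, 3, 6]`, the row-major form of the 3 x 2 transpose.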
Author Portnoff, M.R.
Author_xml – sequence: 1
  givenname: M.R.
  surname: Portnoff
  fullname: Portnoff, M.R.
  organization: Lawrence Livermore Nat. Lab., CA, USA
BackLink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=1923250$$DView record in Pascal Francis
https://www.ncbi.nlm.nih.gov/pubmed/18267543$$D View this record in MEDLINE/PubMed
CODEN IIPRE4
CitedBy_id crossref_primary_10_1109_TSP_2013_2245656
crossref_primary_10_1007_s11265_012_0721_3
crossref_primary_10_1109_LSP_2009_2016836
Cites_doi 10.1007/978-1-4613-1333-5
10.1145/355719.355729
10.1109/83.210874
ContentType Journal Article
Copyright 1999 INIST-CNRS
Copyright_xml – notice: 1999 INIST-CNRS
DBID RIA
RIE
AAYXX
CITATION
IQODW
NPM
7SC
8FD
JQ2
L7M
L~C
L~D
7X8
7SP
F28
FR3
DOI 10.1109/83.784438
DatabaseName IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Pascal-Francis
PubMed
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
MEDLINE - Academic
Electronics & Communications Abstracts
ANTE: Abstracts in New Technology & Engineering
Engineering Research Database
DatabaseTitle CrossRef
PubMed
Computer and Information Systems Abstracts
Technology Research Database
Computer and Information Systems Abstracts – Academic
Advanced Technologies Database with Aerospace
ProQuest Computer Science Collection
Computer and Information Systems Abstracts Professional
MEDLINE - Academic
Electronics & Communications Abstracts
Engineering Research Database
ANTE: Abstracts in New Technology & Engineering
DatabaseTitleList Technology Research Database
Computer and Information Systems Abstracts

PubMed
Computer and Information Systems Abstracts
MEDLINE - Academic
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
– sequence: 3
  dbid: 7X8
  name: MEDLINE - Academic
  url: https://search.proquest.com/medline
  sourceTypes: Aggregation Database
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
Engineering
EndPage 1275
ExternalDocumentID 18267543
1923250
10_1109_83_784438
784438
Genre Journal Article
ID FETCH-LOGICAL-c424t-d6cb11249e8e716a1e2b68cdf8b8c6d903acffd49c016d8d849d9d9a24be7cbd3
IEDL.DBID RIE
ISICitedReferencesCount 11
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000082282900010&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1057-7149
IngestDate Sun Sep 28 02:09:00 EDT 2025
Sat Sep 27 23:39:58 EDT 2025
Sun Nov 09 12:57:06 EST 2025
Thu Oct 02 03:21:41 EDT 2025
Mon Jul 21 05:33:07 EDT 2025
Mon Jul 21 09:15:16 EDT 2025
Sat Nov 29 06:27:21 EST 2025
Tue Nov 18 21:08:01 EST 2025
Tue Aug 26 21:00:25 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 9
Keywords Parallel processing
Image processing
Fast algorithm
Imaging
Synthetic aperture radar
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c424t-d6cb11249e8e716a1e2b68cdf8b8c6d903acffd49c016d8d849d9d9a24be7cbd3
Notes ObjectType-Article-2
SourceType-Scholarly Journals-1
ObjectType-Feature-1
content type line 23
ObjectType-Article-1
ObjectType-Feature-2
PMID 18267543
PQID 26938940
PQPubID 23500
PageCount 11
ParticipantIDs pascalfrancis_primary_1923250
proquest_miscellaneous_734188188
crossref_citationtrail_10_1109_83_784438
crossref_primary_10_1109_83_784438
proquest_miscellaneous_963888705
proquest_miscellaneous_28405577
pubmed_primary_18267543
ieee_primary_784438
proquest_miscellaneous_26938940
PublicationCentury 1900
PublicationDate 1999-09-01
PublicationDateYYYYMMDD 1999-09-01
PublicationDate_xml – month: 09
  year: 1999
  text: 1999-09-01
  day: 01
PublicationDecade 1990
PublicationPlace New York, NY
PublicationPlace_xml – name: New York, NY
– name: United States
PublicationTitle IEEE transactions on image processing
PublicationTitleAbbrev TIP
PublicationTitleAlternate IEEE Trans Image Process
PublicationYear 1999
Publisher IEEE
Institute of Electrical and Electronics Engineers
Publisher_xml – name: IEEE
– name: Institute of Electrical and Electronics Engineers
References ref7
birkhoff (ref5) 1965
ref4
herstein (ref6) 1964
kleiman (ref10) 1996
curlander (ref3) 1991
robbins (ref8) 1996
lewis (ref9) 1996
ref1
claerbout (ref2) 1985
References_xml – ident: ref4
  doi: 10.1007/978-1-4613-1333-5
– year: 1985
  ident: ref2
  publication-title: Imaging the Earth's Interior
– year: 1996
  ident: ref10
  publication-title: Programming with Threads
– year: 1965
  ident: ref5
  publication-title: A Survey of Modern Algebra
– year: 1964
  ident: ref6
  publication-title: Topics in Algebra
– year: 1996
  ident: ref8
  publication-title: Practical UNIX Programming: A Guide to Concurrency, Communication, and Multithreading
– ident: ref1
  doi: 10.1145/355719.355729
– year: 1991
  ident: ref3
  publication-title: Synthetic Aperture Radar
– ident: ref7
  doi: 10.1109/83.210874
– year: 1996
  ident: ref9
  publication-title: Threads Primer
SSID ssj0014516
Score 1.6668777
SourceID proquest
pubmed
pascalfrancis
crossref
ieee
SourceType Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 1265
SubjectTerms Algorithms
Applied sciences
Computer architecture
Concurrent computing
Displays
Exact sciences and technology
Gates
Image processing
Indexing
Information, signal and communications theory
Mathematical analysis
Matrices
Matrix converters
Matrix methods
Microprocessors
Parallel processing
Random access memory
Read-write memory
Remote sensing
Signal processing
Telecommunications and information theory
Title An efficient parallel-processing method for transposing large matrices in place
URI https://ieeexplore.ieee.org/document/784438
https://www.ncbi.nlm.nih.gov/pubmed/18267543
https://www.proquest.com/docview/26938940
https://www.proquest.com/docview/28405577
https://www.proquest.com/docview/734188188
https://www.proquest.com/docview/963888705
Volume 8
WOSCitedRecordID wos000082282900010&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D