An efficient parallel-processing method for transposing large matrices in place
| Published in: | IEEE Transactions on Image Processing, Vol. 8, no. 9, pp. 1265-1275 |
|---|---|
| Main Author: | Portnoff, M.R. |
| Format: | Journal Article |
| Language: | English |
| Published: | New York, NY; IEEE (Institute of Electrical and Electronics Engineers); 01.09.1999 |
| Subjects: | |
| ISSN: | 1057-7149 |
| Abstract | We have developed an efficient algorithm for transposing large matrices in place. The algorithm is efficient because data are accessed either sequentially in blocks or randomly within blocks small enough to fit in cache, and because the same indexing calculations are shared among identical procedures operating on independent subsets of the data. This inherent parallelism makes the method well suited for a multiprocessor computing environment. The algorithm is easy to implement because the same two procedures are applied to the data in various groupings to carry out the complete transpose operation. Using only a single processor, we have demonstrated nearly an order-of-magnitude increase in speed over the previously published algorithm by Cate and Twigg (1977) for transposing a large rectangular matrix in place. With multiple processors operating in parallel, the processing speed increases almost linearly with the number of processors. A simplified version of the algorithm for square matrices is presented, as well as an extension for matrices large enough to require virtual memory. |
|---|---|
| Author | Portnoff, M.R. |
| Author_xml | – sequence: 1 givenname: M.R. surname: Portnoff fullname: Portnoff, M.R. organization: Lawrence Livermore Nat. Lab., CA, USA |
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=1923250 (Pascal Francis record); https://www.ncbi.nlm.nih.gov/pubmed/18267543 (MEDLINE/PubMed record) |
| CODEN | IIPRE4 |
| CitedBy_id | crossref_primary_10_1109_TSP_2013_2245656 crossref_primary_10_1007_s11265_012_0721_3 crossref_primary_10_1109_LSP_2009_2016836 |
| Cites_doi | 10.1007/978-1-4613-1333-5 10.1145/355719.355729 10.1109/83.210874 |
| ContentType | Journal Article |
| Copyright | 1999 INIST-CNRS |
| Copyright_xml | – notice: 1999 INIST-CNRS |
| DOI | 10.1109/83.784438 |
| DatabaseName | IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Pascal-Francis PubMed Computer and Information Systems Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional MEDLINE - Academic Electronics & Communications Abstracts ANTE: Abstracts in New Technology & Engineering Engineering Research Database |
| DatabaseTitle | CrossRef PubMed Computer and Information Systems Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Advanced Technologies Database with Aerospace ProQuest Computer Science Collection Computer and Information Systems Abstracts Professional MEDLINE - Academic Electronics & Communications Abstracts Engineering Research Database ANTE: Abstracts in New Technology & Engineering |
| DatabaseTitleList | Technology Research Database Computer and Information Systems Abstracts PubMed Computer and Information Systems Abstracts MEDLINE - Academic |
| Database_xml | – sequence: 1 dbid: NPM name: PubMed url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher – sequence: 3 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| Discipline | Applied Sciences Engineering |
| EndPage | 1275 |
| ExternalDocumentID | 18267543 1923250 10_1109_83_784438 784438 |
| Genre | Journal Article |
| ISICitedReferencesCount | 11 |
| ISSN | 1057-7149 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 9 |
| Keywords | Parallel processing; Image processing; Fast algorithm; Imaging; Synthetic aperture radar |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html CC BY 4.0 |
| PMID | 18267543 |
| PQID | 26938940 |
| PQPubID | 23500 |
| PageCount | 11 |
| ParticipantIDs | pascalfrancis_primary_1923250 proquest_miscellaneous_734188188 crossref_citationtrail_10_1109_83_784438 crossref_primary_10_1109_83_784438 proquest_miscellaneous_963888705 proquest_miscellaneous_28405577 pubmed_primary_18267543 ieee_primary_784438 proquest_miscellaneous_26938940 |
| PublicationCentury | 1900 |
| PublicationDate | 1999-09-01 |
| PublicationDateYYYYMMDD | 1999-09-01 |
| PublicationDate_xml | – month: 09 year: 1999 text: 1999-09-01 day: 01 |
| PublicationDecade | 1990 |
| PublicationPlace | New York, NY |
| PublicationPlace_xml | – name: New York, NY – name: United States |
| PublicationTitle | IEEE transactions on image processing |
| PublicationTitleAbbrev | TIP |
| PublicationTitleAlternate | IEEE Trans Image Process |
| PublicationYear | 1999 |
| Publisher | IEEE Institute of Electrical and Electronics Engineers |
| Publisher_xml | – name: IEEE – name: Institute of Electrical and Electronics Engineers |
| References | [1] doi:10.1145/355719.355729; [2] Claerbout, Imaging the Earth's Interior, 1985; [3] Curlander, Synthetic Aperture Radar, 1991; [4] doi:10.1007/978-1-4613-1333-5; [5] Birkhoff, A Survey of Modern Algebra, 1965; [6] Herstein, Topics in Algebra, 1964; [7] doi:10.1109/83.210874; [8] Robbins, Practical UNIX Programming: A Guide to Concurrency, Communication, and Multithreading, 1996; [9] Lewis, Threads Primer, 1996; [10] Kleiman, Programming with Threads, 1996 |
| SourceID | proquest pubmed pascalfrancis crossref ieee |
| SourceType | Aggregation Database Index Database Enrichment Source Publisher |
| StartPage | 1265 |
| SubjectTerms | Algorithms; Applied sciences; Computer architecture; Concurrent computing; Displays; Exact sciences and technology; Gates; Image processing; Indexing; Information, signal and communications theory; Mathematical analysis; Matrices; Matrix converters; Matrix methods; Microprocessors; Parallel processing; Random access memory; Read-write memory; Remote sensing; Signal processing; Telecommunications and information theory |
| Title | An efficient parallel-processing method for transposing large matrices in place |
| URI | https://ieeexplore.ieee.org/document/784438 https://www.ncbi.nlm.nih.gov/pubmed/18267543 https://www.proquest.com/docview/26938940 https://www.proquest.com/docview/28405577 https://www.proquest.com/docview/734188188 https://www.proquest.com/docview/963888705 |
| Volume | 8 |
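
For readers unfamiliar with the problem the abstract describes, the following is a minimal sketch of transposing a rectangular matrix in place by following permutation cycles. It is a generic textbook technique, not the blocked, cache-aware parallel algorithm of the paper, and the function name `transpose_in_place`, its parameters, and the `visited` flag array are assumptions introduced here for illustration only. The matrix is assumed to be stored in row-major order in a flat Python list.

```python
def transpose_in_place(a, rows, cols):
    """Transpose a rows x cols row-major matrix held in the flat list `a`.

    Illustrative sketch only: it moves elements along permutation cycles
    and uses one boolean flag per element to remember finished cycles,
    so it is not a constant-extra-memory method.
    """
    size = rows * cols
    if len(a) != size:
        raise ValueError("buffer length must equal rows * cols")
    if size < 2:
        return a

    last = size - 1
    visited = [False] * size

    # Indices 0 and `last` never move, so only the interior is scanned.
    for start in range(1, last):
        if visited[start]:
            continue
        i = start
        carried = a[start]
        while True:
            # The element at row-major index i = r*cols + c belongs at
            # c*rows + r, which equals (i * rows) mod (size - 1).
            dest = (i * rows) % last
            a[dest], carried = carried, a[dest]
            visited[dest] = True
            i = dest
            if i == start:
                break
    return a
```

Each step of the inner loop jumps to an unrelated memory location, which gives poor cache behavior on large matrices. That access pattern is the cost the abstract says the published method avoids, by working sequentially in blocks or randomly only within cache-sized blocks.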