Efficient algorithms for computing rank‐revealing factorizations on a GPU
| Published in: | Numerical linear algebra with applications Vol. 30; no. 6 |
|---|---|
| Main Authors: | Heavner, Nathan; Chen, Chao; Gopal, Abinand; Martinsson, Per‐Gunnar |
| Format: | Journal Article |
| Language: | English |
| Published: | Oxford: Wiley Subscription Services, Inc, 01.12.2023 |
| Subjects: | Algorithms; Computation; Factorization; Matrices (mathematics); Singular value decomposition |
| ISSN: | 1070-5325, 1099-1506 |
| Abstract | Standard rank‐revealing factorizations such as the singular value decomposition (SVD) and column pivoted QR factorization are challenging to implement efficiently on a GPU. A major difficulty in this regard is the inability of standard algorithms to cast most operations in terms of the Level‐3 BLAS. This article presents two alternative algorithms for computing a rank‐revealing factorization of the form $\mathbf{A}=\mathbf{U}\mathbf{T}\mathbf{V}^{\ast}$, where $\mathbf{U}$ and $\mathbf{V}$ are orthogonal and $\mathbf{T}$ is trapezoidal (or triangular if $\mathbf{A}$ is square). Both algorithms use randomized projection techniques to cast most of the flops in terms of matrix‐matrix multiplication, which is exceptionally efficient on the GPU. Numerical experiments illustrate that these algorithms achieve significant acceleration over finely tuned GPU implementations of the SVD while providing low rank approximation errors close to those of the SVD. |
|---|---|
| Author | Heavner, Nathan; Chen, Chao; Gopal, Abinand; Martinsson, Per‐Gunnar |
| Author_xml | 1. Nathan Heavner, Department of Applied Mathematics, University of Colorado at Boulder, Boulder, Colorado, USA; 2. Chao Chen (ORCID 0000-0002-5385-3651), Oden Institute, University of Texas at Austin, Austin, Texas, USA; 3. Abinand Gopal, Department of Mathematics, Yale University, New Haven, Connecticut, USA; 4. Per‐Gunnar Martinsson, Oden Institute & Department of Mathematics, University of Texas at Austin, Austin, Texas, USA |
| ContentType | Journal Article |
| Copyright | 2023 John Wiley & Sons, Ltd. |
| DOI | 10.1002/nla.2515 |
| Discipline | Mathematics |
| EISSN | 1099-1506 |
| ISSN | 1070-5325 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Language | English |
| PublicationDateYYYYMMDD | 2023-12-01 |
| PublicationPlace | Oxford |
| PublicationTitle | Numerical linear algebra with applications |
| PublicationYear | 2023 |
| Publisher | Wiley Subscription Services, Inc |
| SubjectTerms | Algorithms; Computation; Factorization; Matrices (mathematics); Singular value decomposition |
| Title | Efficient algorithms for computing rank‐revealing factorizations on a GPU |
| URI | https://www.proquest.com/docview/2885374837 |
| Volume | 30 |
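
The factorization described in the abstract, $\mathbf{A}=\mathbf{U}\mathbf{T}\mathbf{V}^{\ast}$ with orthogonal $\mathbf{U}$, $\mathbf{V}$ and triangular or trapezoidal $\mathbf{T}$, can be illustrated with a minimal NumPy sketch. This is an illustration under stated assumptions and not the article's implementation: the record does not spell out the two algorithms, so the function name `randomized_utv`, the power parameter `q`, the restriction to real matrices, and the CPU-only NumPy setting are all hypothetical choices. The sketch builds $\mathbf{V}$ from a randomized power iteration and then takes an unpivoted QR factorization of $\mathbf{A}\mathbf{V}$, so the bulk of the arithmetic is matrix-matrix multiplication.

```python
import numpy as np


def randomized_utv(A, q=1, seed=0):
    """Sketch of a randomized factorization A = U @ T @ V.T (real A only).

    Hypothetical helper for illustration; not taken from the article.
    q is the number of power-iteration steps applied to a Gaussian matrix.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape

    # Random n-by-n test matrix; its column space is rotated toward the
    # dominant right singular directions of A by the power iteration.
    Y = rng.standard_normal((n, n))
    for _ in range(q):
        Y = A.T @ (A @ Y)          # two matrix-matrix products per step
        Y, _ = np.linalg.qr(Y)     # re-orthonormalize for numerical stability

    # V is a square orthogonal matrix.
    V, _ = np.linalg.qr(Y)

    # Unpivoted QR of A @ V gives orthonormal U and upper triangular
    # (or upper trapezoidal when m < n) T, so that A = U @ T @ V.T.
    U, T = np.linalg.qr(A @ V)
    return U, T, V


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Test matrix with rapidly decaying column scales (an arbitrary choice).
    A = rng.standard_normal((60, 40)) @ np.diag(0.5 ** np.arange(40))
    U, T, V = randomized_utv(A, q=2)
    print("reconstruction error:", np.linalg.norm(A - U @ T @ V.T))
```

Because $\mathbf{V}$ is square and orthogonal, the product `U @ T @ V.T` reproduces `A` up to rounding for any `q`; the power iteration only influences how well the leading block of `T` captures the dominant singular values. On a GPU, the same steps would presumably be expressed with a GPU array library, which is outside the scope of this sketch.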