Achieving Efficient QR Factorization by Algorithm-Architecture Co-design of Householder Transformation
Householder Transformation (HT) is a prime building block of widely used numerical linear algebra primitives such as QR factorization. Despite years of intense research on HT, there exists a scope to expose higher Instruction Level Parallelism in HT through algorithmic transforms. In this paper, we...
| Published in: | 2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID), pp. 98-103 |
|---|---|
| Main authors: | Merchant, Farhad; Vatwani, Tarun; Chattopadhyay, Anupam; Raha, Soumyendu; Nandy, S. K.; Narayan, Ranjani |
| Format: | Conference Proceeding, Journal Article |
| Language: | English |
| Published: | IEEE, 2016-01-01 |
| Subjects: | |
| ISSN: | 2380-6923 |
| Online access: | Get full text |
| Abstract | Householder Transformation (HT) is a prime building block of widely used numerical linear algebra primitives such as QR factorization. Despite years of intense research on HT, there is still scope to expose higher Instruction-Level Parallelism (ILP) in HT through algorithmic transforms. In this paper, we propose several novel algorithmic transformations of HT that expose higher ILP. Our propositions are backed by theoretical proofs and a series of experiments on commercial general-purpose processors. Finally, we show that algorithm-architecture co-design leads to the most efficient realization of HT. A detailed experimental study with architectural modifications is presented for a commercial Coarse-Grained Reconfigurable Architecture (CGRA). Benchmarking against some recent HT implementations shows a 30-40% improvement in performance. |
|---|---|
| Author | Nandy, S. K.; Narayan, Ranjani; Vatwani, Tarun; Chattopadhyay, Anupam; Merchant, Farhad; Raha, Soumyendu |
| Author_xml | 1. Merchant, Farhad (farhad@cadl.iisc.ernet.in; CADLab, Indian Inst. of Sci., Bangalore, India); 2. Vatwani, Tarun (tarun@cadl.iisc.ernet.in); 3. Chattopadhyay, Anupam (anupam@ntu.edu.sg; Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore); 4. Raha, Soumyendu (raha@serc.iisc.ernet.in; Sci. Comput. Lab., Indian Inst. of Sci., Bangalore, India); 5. Nandy, S. K. (nandy@serc.iisc.ernet.in; CADLab, Indian Inst. of Sci., Bangalore, India); 6. Narayan, Ranjani (ranjani.narayan@morphingmachines.com) |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding Journal Article |
| DBID | 6IE 6IL CBEJK RIE RIL 7SP 8FD L7M |
| DOI | 10.1109/VLSID.2016.109 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE/IET Electronic Library (IEL) (UW System Shared); IEEE Proceedings Order Plans (POP All) 1998-Present; Electronics & Communications Abstracts; Technology Research Database; Advanced Technologies Database with Aerospace |
| DatabaseTitle | Technology Research Database; Advanced Technologies Database with Aerospace; Electronics & Communications Abstracts |
| DatabaseTitleList | Technology Research Database |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE/IET Electronic Library (IEL) (UW System Shared) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISBN | 9781467387002 1467387002 |
| EISSN | 2380-6923 |
| EndPage | 103 |
| ExternalDocumentID | 7434934 |
| Genre | orig-research |
| GroupedDBID | 23M 29R 6IE 6IF 6IH 6IK 6IL 6IN AAJGR AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI M43 OCL RIE RIL RNS 7SP 8FD L7M |
| ID | FETCH-LOGICAL-i208t-8c9f26e362ae28a92904d9094db0fae817ddcca30ebe5cb2fb4e1a19cacd4e333 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 9 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000386981600031&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Thu Jul 10 22:46:12 EDT 2025 Wed Aug 27 02:04:49 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i208t-8c9f26e362ae28a92904d9094db0fae817ddcca30ebe5cb2fb4e1a19cacd4e333 |
| Notes | ObjectType-Article-2 SourceType-Scholarly Journals-1 ObjectType-Conference-1 ObjectType-Feature-3 content type line 23 SourceType-Conference Papers & Proceedings-2 |
| PQID | 1815990645 |
| PQPubID | 23500 |
| PageCount | 6 |
| ParticipantIDs | proquest_miscellaneous_1815990645 ieee_primary_7434934 |
| PublicationCentury | 2000 |
| PublicationDate | 2016-01-01 |
| PublicationDateYYYYMMDD | 2016-01-01 |
| PublicationDate_xml | – month: 01 year: 2016 text: 2016-01-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationTitle | 2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID) |
| PublicationTitleAbbrev | ICVD |
| PublicationYear | 2016 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0008507 ssj0001772259 |
| Score | 1.9991778 |
| SourceID | proquest ieee |
| SourceType | Aggregation Database Publisher |
| StartPage | 98 |
| SubjectTerms | Algorithm design and analysis; Algorithms; Co-design; computation; Computer architecture; Conferences; Exposure; Factorization; Heat treatment; Householder transformations; Linear algebra; numerical linear algebra; Optimization; Parallel processing; parallelism; reconfigurable architectures; Software algorithms; Transformations (mathematics); Transforms |
| Title | Achieving Efficient QR Factorization by Algorithm-Architecture Co-design of Householder Transformation |
| URI | https://ieeexplore.ieee.org/document/7434934 https://www.proquest.com/docview/1815990645 |
| WOSCitedRecordID | wos000386981600031&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
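
For readers unfamiliar with the primitive named in the abstract, the following is a minimal sketch of the classical (textbook) Householder QR factorization. It is not the modified transformation proposed in the paper, whose details are not given in this record. The function name `householder_qr` and the list-of-rows matrix representation are assumptions made for illustration only.

```python
import math


def householder_qr(a):
    """Textbook Householder QR of an m x n matrix (m >= n) given as a list of rows.

    Returns (q, r): q is m x m orthogonal, r is m x n upper triangular, a = q * r.
    Illustrative sketch only; not the algorithm proposed in the paper.
    """
    m, n = len(a), len(a[0])
    r = [[float(t) for t in row] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]

    for k in range(min(m - 1, n)):
        # Column k from the diagonal downwards.
        x = [r[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # nothing to annihilate in this column

        # Householder vector v = x + sign(x0) * ||x|| * e1 (sign chosen to avoid cancellation).
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])
        vtv = sum(t * t for t in v)

        # R <- H R with H = I - 2 v v^T / (v^T v), acting on rows k..m-1.
        for j in range(n):
            s = 2.0 * sum(v[i] * r[k + i][j] for i in range(len(v))) / vtv
            for i in range(len(v)):
                r[k + i][j] -= s * v[i]

        # Q <- Q H, acting on columns k..m-1, so that Q = H_1 H_2 ... H_k.
        for i in range(m):
            s = 2.0 * sum(q[i][k + j] * v[j] for j in range(len(v))) / vtv
            for j in range(len(v)):
                q[i][k + j] -= s * v[j]

    return q, r
```

Each reflection applies a dot product followed by a scaled vector update, column by column. The paper's stated aim is to expose more instruction-level parallelism in these operations, but this sketch makes no attempt to reproduce how the paper does so.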