Achieving Efficient QR Factorization by Algorithm-Architecture Co-design of Householder Transformation
| Published in: | 2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID), pp. 98-103 |
|---|---|
| Main authors: | , , , , , |
| Format: | Conference paper; Journal Article |
| Language: | English |
| Publication details: | IEEE, 01.01.2016 |
| Subjects: | |
| ISSN: | 2380-6923 |
| Summary: | Householder Transformation (HT) is a prime building block of widely used numerical linear algebra primitives such as QR factorization. Despite years of intense research on HT, there remains scope to expose higher Instruction-Level Parallelism (ILP) in HT through algorithmic transforms. In this paper, we propose several novel algorithmic transformations in HT to expose higher ILP. Our propositions are backed by theoretical proofs and a series of experiments using commercial general-purpose processors. Finally, we show that algorithm-architecture co-design leads to the most efficient realization of HT. A detailed experimental study with architectural modifications is presented for a commercial CGRA. Benchmarking against some recent HT implementations shows a 30-40% improvement in performance. |
|---|---|
| DOI: | 10.1109/VLSID.2016.109 |
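
For orientation, the summary's statement that HT is a building block of QR factorization can be illustrated with a minimal sketch of the classical textbook Householder QR. This is not the modified algorithm proposed in the paper, whose details are not given in this record. The function name `householder_qr` and the list-of-rows matrix representation are assumptions made for illustration only.

```python
import math


def householder_qr(a):
    """Textbook Householder QR (illustrative sketch, not the paper's method).

    `a` is an m x n matrix given as a list of rows, with m >= n.
    Returns (q, r) such that a = q * r, where q is m x m orthogonal
    and r is m x n upper triangular.
    """
    m, n = len(a), len(a[0])
    r = [[float(t) for t in row] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]

    for k in range(min(m - 1, n)):
        # Column k from the diagonal downwards.
        x = [r[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # column is already zero, nothing to eliminate

        # Householder vector v = x + sign(x0) * ||x|| * e1.
        # Choosing this sign avoids cancellation in v[0].
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])
        vtv = sum(t * t for t in v)

        # R <- H R, with H = I - 2 v v^T / (v^T v), acting on rows k..m-1.
        for j in range(n):
            s = sum(v[i] * r[k + i][j] for i in range(len(v)))
            f = 2.0 * s / vtv
            for i in range(len(v)):
                r[k + i][j] -= f * v[i]

        # Q <- Q H, acting on columns k..m-1, so that Q = H_1 H_2 ... H_k.
        for i in range(m):
            s = sum(q[i][k + j] * v[j] for j in range(len(v)))
            f = 2.0 * s / vtv
            for j in range(len(v)):
                q[i][k + j] -= f * v[j]

    return q, r
```

Each reflection zeroes one column below the diagonal and depends on the result of the previous one. That sequential dependency is the kind of constraint the summary's mention of exposing more instruction-level parallelism refers to, although how the paper restructures the computation is not stated here.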