Achieving Efficient QR Factorization by Algorithm-Architecture Co-design of Householder Transformation

Householder Transformation (HT) is a prime building block of widely used numerical linear algebra primitives such as QR factorization. Despite years of intense research on HT, there exists a scope to expose higher Instruction Level Parallelism in HT through algorithmic transforms. In this paper, we...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID) S. 98 - 103
Hauptverfasser: Merchant, Farhad, Vatwani, Tarun, Chattopadhyay, Anupam, Raha, Soumyendu, Nandy, S. K., Narayan, Ranjani
Format: Tagungsbericht Journal Article
Sprache:Englisch
Veröffentlicht: IEEE 01.01.2016
Schlagworte:
ISSN:2380-6923
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Householder Transformation (HT) is a prime building block of widely used numerical linear algebra primitives such as QR factorization. Despite years of intense research on HT, there exists a scope to expose higher Instruction Level Parallelism in HT through algorithmic transforms. In this paper, we propose several novel algorithmic transformations in HT to expose higher Instruction-Level Parallelism. Our propositions are backed by theoretical proofs and a series of experiments using commercial general-purpose processors. Finally, we show that algorithm-architecture co-design leads to the most efficient realization of HT. A detailed experimental study with architectural modifications is presented for a commercial CGRA. The benchmarking results with some of the recent HT implementations show 30-40% improvement in performance.
Bibliographie:ObjectType-Article-2
SourceType-Scholarly Journals-1
ObjectType-Conference-1
ObjectType-Feature-3
content type line 23
SourceType-Conference Papers & Proceedings-2
ISSN:2380-6923
DOI:10.1109/VLSID.2016.109