Achieving Efficient QR Factorization by Algorithm-Architecture Co-design of Householder Transformation

Householder Transformation (HT) is a prime building block of widely used numerical linear algebra primitives such as QR factorization. Despite years of intense research on HT, there exists a scope to expose higher Instruction Level Parallelism in HT through algorithmic transforms. In this paper, we...

Bibliographic details
Published in: 2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID), pp. 98-103
Main authors: Merchant, Farhad; Vatwani, Tarun; Chattopadhyay, Anupam; Raha, Soumyendu; Nandy, S. K.; Narayan, Ranjani
Format: Conference paper, Journal Article
Language: English
Published: IEEE, 2016-01-01
ISSN: 2380-6923
Abstract: Householder Transformation (HT) is a prime building block of widely used numerical linear algebra primitives such as QR factorization. Despite years of intense research on HT, there exists a scope to expose higher Instruction Level Parallelism in HT through algorithmic transforms. In this paper, we propose several novel algorithmic transformations in HT to expose higher Instruction-Level Parallelism. Our propositions are backed by theoretical proofs and a series of experiments using commercial general-purpose processors. Finally, we show that algorithm-architecture co-design leads to the most efficient realization of HT. A detailed experimental study with architectural modifications is presented for a commercial CGRA. The benchmarking results with some of the recent HT implementations show 30-40% improvement in performance.
Author details:
1. Merchant, Farhad (farhad@cadl.iisc.ernet.in), CADLab, Indian Inst. of Sci., Bangalore, India
2. Vatwani, Tarun (tarun@cadl.iisc.ernet.in)
3. Chattopadhyay, Anupam (anupam@ntu.edu.sg), Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
4. Raha, Soumyendu (raha@serc.iisc.ernet.in), Sci. Comput. Lab., Indian Inst. of Sci., Bangalore, India
5. Nandy, S. K. (nandy@serc.iisc.ernet.in), CADLab, Indian Inst. of Sci., Bangalore, India
6. Narayan, Ranjani (ranjani.narayan@morphingmachines.com)
CODEN: IEEPAD
DOI: 10.1109/VLSID.2016.109
Discipline: Engineering
EISBN: 9781467387002, 1467387002
EISSN: 2380-6923
External document ID: 7434934
Genre: orig-research
ISI cited references count: 9
Peer reviewed: false
Scholarly: true
Page count: 6
Publication title abbreviation: ICVD
Subject terms: Algorithm design and analysis; Algorithms; Co-design; computation; Computer architecture; Conferences; Exposure; Factorization; Heat treatment; Householder transformations; Linear algebra; numerical linear algebra; Optimization; Parallel processing; parallelism; reconfigurable architectures; Software algorithms; Transformations (mathematics); Transforms
Title Achieving Efficient QR Factorization by Algorithm-Architecture Co-design of Householder Transformation
URI https://ieeexplore.ieee.org/document/7434934
https://www.proquest.com/docview/1815990645
WOSCitedRecordID wos000386981600031&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1JT8JAFJ4A8aAXFzDiljHxaKHL0M4cCUIwIQQVCTcynXkjJNoaFhP_va9DBRK9eGsPbTrvTef73k7IrfAUxK7EP82wyGForjnScHAMixXCF5jIziEb9aJ-n4_HYlAgd5taGACwyWdQyy5tLF-napW5yuqIdkwErEiKURSua7W2_hSkiZbK56cwR6KTN2n0XFEf9Z4f7rNMrrBmkw_tKJVf568Flc7h_z7niFS21Xl0sMGdY1KA5IQc7DQWLBPTVNMZZL4C2rY9IvBF9PGJdux0nbz0ksZftPn2ivfL6bvT3Iko0FbqaJvaQVNDu-lqAVmUCuZ0uMNz06RCXjrtYavr5BMVnJnv8qXDlTB-CAhaEnwukRq5TAu08HTsGgnci7RGlQYuqrahYt_EDDzpCSWVZhAEwSkpJWkCZ4T62gjtxxzNIbTwQsNNDBGuXHpShEyxKilnEpt8rJtmTHJhVcnNj8gnuJGz6IRMANcxQarRQGgMWeP870cvyH6mv7X_45KUlvMVXJE99bmcLebXdjd8A92KuR4
linkProvider IEEE
linkToHtml http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV3JTsMwEB2xScCFXewYiSOBLG5iH6vSqohSsRTELXLsMVSCBJUWib9n4gaoBBduySFRPOP4vdkBjmSgMfMV_WmWJx4nc81TVqBneaYJvtAmbg7ZfSfpdsXDg7yaguPvWhhEdMlneFJeuli-KfSodJWdEtpxGfFpmK1xHvrjaq0fjwoRRUfmq3NYENWp2jQGvjy979yen5W5XPGJSz90w1R-ncAOVlpL__ugZVj_qc9jV9_IswJTmK_C4kRrwTWwdf3Ux9JbwJquSwS9iF3fsJabr1MVX7Lsg9WfH-l--PTi1SdiCqxReMYld7DCsnYxesMyToUD1ptgukW-DnetZq_R9qqZCl4_9MXQE1raMEaCLYWhUESOfG4k2Xgm861CESTGkFIjn5Rb01loM46BCqRW2nCMomgDZvIix01gobHShJkgg4hsvNgKm2FCK1eBkjHXfAvWSomlr-O2GWklrC04_BJ5Slu5jE-oHGkdKZGNGoFjzGvbfz96APPt3mUn7Zx3L3ZgodTl2BuyCzPDwQj3YE6_D_tvg323Mz4Bpii8ZQ
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2016+29th+International+Conference+on+VLSI+Design+and+2016+15th+International+Conference+on+Embedded+Systems+%28VLSID%29&rft.atitle=Achieving+Efficient+QR+Factorization+by+Algorithm-Architecture+Co-design+of+Householder+Transformation&rft.au=Merchant%2C+Farhad&rft.au=Vatwani%2C+Tarun&rft.au=Chattopadhyay%2C+Anupam&rft.au=Raha%2C+Soumyendu&rft.date=2016-01-01&rft.pub=IEEE&rft.eissn=2380-6923&rft.spage=98&rft.epage=103&rft_id=info:doi/10.1109%2FVLSID.2016.109&rft.externalDocID=7434934