Scalable parallel tridiagonal algorithms with diagonal pivoting and their optimization for many-core architectures

Bibliographic Details
Title: Scalable parallel tridiagonal algorithms with diagonal pivoting and their optimization for many-core architectures
Authors: Chang, Li-Wen
Contributors: Hwu, Wen-Mei W.
Publication Year: 2014
Collection: University of Illinois at Urbana-Champaign: IDEALS (Illinois Digital Environment for Access to Learning and Scholarship)
Subject Terms: Tridiagonal Solver, SPIKE algorithm, Linear Recurrence, Cyclic Reduction, Diagonal Pivoting, Graphics Processing Unit (GPU) Computing, General Purpose computation on Graphics Processing Units (GPGPU), Many-core
Description: Tridiagonal solvers are important building blocks for a wide range of scientific applications that are commonly performance-sensitive. Recently, many-core architectures, such as GPUs, have become ubiquitous targets for these applications. Therefore, a high-performance general-purpose GPU tridiagonal solver becomes critical. However, no existing GPU tridiagonal solver provides solution quality comparable to that of the most common general-purpose CPU tridiagonal solvers, such as Matlab or Intel MKL, because none of them performs pivoting. Meanwhile, conventional pivoting algorithms are sequential and not applicable to GPUs. In this thesis, we propose three scalable tridiagonal algorithms with diagonal pivoting that give better solution quality than the state-of-the-art GPU tridiagonal solvers. A SPIKE-Diagonal Pivoting algorithm efficiently partitions the workloads of a tridiagonal solver and provides pivoting in each partition. A Parallel Diagonal Pivoting algorithm transforms the conventional diagonal pivoting algorithm into a parallelizable form that can be solved by high-performance parallel linear recurrence solvers. An Adaptive R-Cyclic Reduction algorithm introduces pivoting into the conventional R-Cyclic Reduction family, which commonly suffers from limited solution quality for lack of an applicable pivoting scheme. Our proposed algorithms provide solution quality comparable to that of CPU tridiagonal solvers, such as Matlab or Intel MKL, without compromising the high throughput GPUs provide.
Document Type: text
Language: English
Relation: http://hdl.handle.net/2142/50588
Availability: http://hdl.handle.net/2142/50588
Rights: Copyright 2014 Li-Wen Chang
Accession Number: edsbas.F565DEAD
Database: BASE