Compensated summation and dot product algorithms for floating-point vectors on parallel architectures: Error bounds, implementation and application in the Krylov subspace methods
| Published in: | Journal of Computational and Applied Mathematics, Vol. 414, p. 114434 |
|---|---|
| Main Authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 01.11.2022 |
| ISSN: | 0377-0427, 1879-1778 |
| Summary: | The aim of the paper is to improve parallel algorithms that achieve higher precision in floating-point reduction-type operations while working within the basic floating-point type. Compensated parallel variants of summation and dot product operations for floating-point vectors (level 1 BLAS operations) are considered. The methods are based on the work of Rump, Ogita and Oishi. Parallel implementations in both block and pairwise reduction variants are examined. Analytical error bounds are obtained for real- and complex-valued vectors represented by floating-point numbers according to the IEEE 754 (IEC 60559) standard, for all variants of the parallel algorithms. The algorithms are written in C++ Compute Unified Device Architecture (CUDA) for Graphics Processing Units (GPUs), and their accuracy is tested for different vector sizes and condition numbers. The suggested compensated variant is compared with a multiple-precision library for GPUs in terms of efficiency. The designed algorithms are tested in Krylov subspace methods with preconditioners originating from several challenging computational problems. It is shown that the compensated variant of the algorithms accelerates convergence and yields more accurate results even when the matrix operations are performed in the base precision. |
|---|---|
| DOI: | 10.1016/j.cam.2022.114434 |
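
For readers unfamiliar with the compensated approach the summary refers to: it builds on error-free transformations, chiefly Knuth's TwoSum and an FMA-based TwoProduct, which Ogita, Rump and Oishi combine into their Sum2 and Dot2 algorithms. The sketch below is a minimal sequential C++ illustration of that idea, not the paper's parallel CUDA implementation; the function names (`two_sum`, `compensated_dot`, etc.) are ours, and the code assumes strict IEEE 754 semantics (no `-ffast-math`, which would let the compiler reassociate the error terms away).

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Error-free transformation (Knuth's TwoSum): computes s and e
// such that a + b == s + e exactly in IEEE 754 arithmetic.
static void two_sum(double a, double b, double& s, double& e) {
    s = a + b;
    double t = s - a;
    e = (a - (s - t)) + (b - t);
}

// Error-free product via FMA (TwoProduct): a * b == p + e exactly.
static void two_prod(double a, double b, double& p, double& e) {
    p = a * b;
    e = std::fma(a, b, -p);  // exact rounding error of the product
}

// Compensated summation in the spirit of Ogita-Rump-Oishi Sum2:
// accumulate the exact per-step rounding errors in c and add them
// back to the result once at the end.
double compensated_sum(const std::vector<double>& x) {
    double s = 0.0, c = 0.0;
    for (double xi : x) {
        double e;
        two_sum(s, xi, s, e);
        c += e;
    }
    return s + c;
}

// Compensated dot product (Dot2-style): compensate both the
// individual products and the running summation.
double compensated_dot(const std::vector<double>& x,
                       const std::vector<double>& y) {
    double s = 0.0, c = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        double p, ep, es;
        two_prod(x[i], y[i], p, ep);
        two_sum(s, p, s, es);
        c += ep + es;
    }
    return s + c;
}

int main() {
    // Ill-conditioned sum: large magnitudes cancel, small terms are at risk.
    std::vector<double> x = {1e16, 1.0, -1e16, 1.0};
    std::printf("naive:       %.17g\n", x[0] + x[1] + x[2] + x[3]);  // rounds one 1.0 away, yields 1
    std::printf("compensated: %.17g\n", compensated_sum(x));         // recovers 2
    return 0;
}
```

Roughly speaking, the parallel block and pairwise variants studied in the paper apply the same error-free transformations within each block or reduction step and then combine the partial sums together with their partial error terms across threads; the analytical bounds quoted in the summary quantify the accuracy of that combination.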