Improving the performance of classical linear algebra iterative methods via hybrid parallelism

Bibliographic details
Published in: Journal of Parallel and Distributed Computing, Volume 179, p. 104711
Main authors: Martinez-Ferrer, Pedro J.; Arslan, Tufan; Beltran, Vicenç
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 September 2023
ISSN: 0743-7315, 1096-0848
Description
Summary: We propose fork-join and task-based hybrid implementations of four classical linear algebra iterative methods (Jacobi, Gauss–Seidel, conjugate gradient and biconjugate gradient stabilized), as well as variations of them, on CPUs. These algorithms, which are ubiquitous in computational frameworks, are duly documented, and the corresponding source code is made publicly available for reproducibility. Both weak and strong scalability benchmarks are conducted to statistically analyse their relative efficiencies. The weak scalability results assert the superiority of a task-based hybrid parallelisation over MPI-only and fork-join hybrid implementations. Indeed, the task-based model achieves speedups up to 25% larger than those of its MPI-only counterpart, depending on the numerical method and the computational resources used. For strong scalability scenarios, hybrid methods based on tasks remain more efficient with moderate computational resources where data locality does not play an important role. Fork-join hybridisation often yields mixed results and hence does not seem to bring a competitive advantage over a much simpler MPI approach.
• Four classical linear algebra iterative methods are hybridised on CPUs.
• Implementations with MPI, fork-join, and task-based parallel models are compared.
• For weak scalability scenarios, tasks yield the best performance results.
• For strong scalability scenarios, tasks remain competitive with moderate resources.
• Fork-join hybrid methods often yield mixed results.
DOI: 10.1016/j.jpdc.2023.04.012
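
For readers unfamiliar with the methods named in the summary, the following is a minimal serial sketch of the simplest of the four, Jacobi iteration, written in Python. It is not the authors' implementation, which this record does not reproduce. The function name `jacobi`, the dense list-of-lists storage, the stopping criterion, and the sample system are all assumptions made for illustration.

```python
def jacobi(A, b, tol=1e-10, max_iter=1000):
    """Approximate the solution of A x = b by Jacobi iteration.

    Illustrative sketch only: A is a dense square matrix stored as a list
    of rows, and the stopping test (largest component change below tol)
    is an assumed choice, not one taken from the article.
    """
    n = len(b)
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        # Each new component depends only on the previous iterate x,
        # so the n updates are independent of one another.
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        # Stop once the largest component change falls below tol.
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new, iteration
        x = x_new
    return x, max_iter


# Hypothetical strictly diagonally dominant system, for which Jacobi converges.
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
solution, iterations = jacobi(A, b)
```

Because every component of the new iterate is computed from the previous iterate alone, the inner updates can be distributed across processes, threads, or tasks. Gauss–Seidel, by contrast, reuses freshly updated components within the same sweep, which introduces dependencies between updates.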