Search Results - "communication avoiding algorithm"
-
1
Parallel Communication-Avoiding Algorithm for Triangular Matrix Inversion on Homogeneous and Heterogeneous Platforms
ISSN: 0885-7458, 1573-7640
Published: Boston Springer US 01.08.2015
Published in International journal of parallel programming (01.08.2015)
“…We address in this paper the parallelization of a recursive algorithm for large scale triangular matrix inversion based on the ‘Divide and Conquer’ (D&C)…”
Journal Article -
2
An efficient randomized QLP algorithm for approximating the singular value decomposition
ISSN: 0020-0255, 1872-6291
Published: Elsevier Inc 01.11.2023
Published in Information sciences (01.11.2023)
“…The rank-revealing pivoted QLP decomposition approximates the computationally prohibitive singular value decomposition (SVD) via two consecutive column-pivoted…”
Journal Article -
3
Parallel Tall-and-Skinny QR Factorization Based on LU-CholeskyQR Algorithm
ISSN: 2168-9253
Published: IEEE 02.09.2025
Published in Proceedings / IEEE International Conference on Cluster Computing (02.09.2025)
“…We present optimal parallel QR factorization algorithms with reduced communication overhead. QR factorization is widely applied to solve various problems in…”
Conference Proceeding -
4
EA4RCA: Efficient AIE accelerator design framework for Regular Communication-Avoiding Algorithm
ISSN: 2331-8422
Published: Ithaca Cornell University Library, arXiv.org 09.07.2024
Published in arXiv.org (09.07.2024)
“…With the introduction of the Adaptive Intelligence Engine (AIE), the Versal Adaptive Compute Acceleration Platform (Versal ACAP) has garnered great attention…”
Paper -
5
A communication-avoiding implicit–explicit method for a free-surface ocean model
ISSN: 0021-9991, 1090-2716
Published: United States Elsevier Inc 15.01.2016
Published in Journal of computational physics (15.01.2016)
“…We examine a nonlinear elimination method for the free-surface ocean equations based on barotropic–baroclinic decomposition. The two dimensional scalar…”
Journal Article -
6
CholeskyQR2: a simple and communication-avoiding algorithm for computing a tall-skinny QR factorization on a large-scale parallel system
ISBN: 1479975621, 9781479975624
Published: Piscataway, NJ, USA IEEE Press 01.11.2014
Published in 2014 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (01.11.2014)
“…Designing communication-avoiding algorithms is crucial for high performance computing on a large-scale parallel system…”
Conference Proceeding -
7
Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters
ISSN: 2167-4337
Published: ACM 11.11.2023
Published in International Conference for High Performance Computing, Networking, Storage and Analysis (Online) (11.11.2023)
“…This paper presents a unified communication optimization framework for sparse triangular solve (SpTRSV) algorithms on CPU and GPU clusters. The framework…”
Conference Proceeding -
8
Efficient parallel CP decomposition with pairwise perturbation and multi-sweep dimension tree
ISSN: 1530-2075
Published: IEEE 01.05.2021
Published in Proceedings - IEEE International Parallel and Distributed Processing Symposium (01.05.2021)
“…The widely used alternating least squares (ALS) algorithm for the canonical polyadic (CP) tensor decomposition is dominated in cost by the matricized-tensor…”
Conference Proceeding -
9
Communication-Avoiding Cholesky-QR2 for Rectangular Matrices
ISSN: 1530-2075
Published: IEEE 01.05.2019
Published in Proceedings - IEEE International Parallel and Distributed Processing Symposium (01.05.2019)
“…Scalable QR factorization algorithms for solving least squares and eigenvalue problems are critical given the increasing parallelism within modern machines. We…”
Conference Proceeding -
10
A Communication-Avoiding 3D LU Factorization Algorithm for Sparse Matrices
ISSN: 1530-2075
Published: IEEE 01.05.2018
Published in 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (01.05.2018)
“…We propose a new algorithm to improve the strong scalability of right-looking sparse LU factorization on distributed memory systems. Our 3D sparse LU algorithm…”
Conference Proceeding -
11
A Sparse Direct Solver for Distributed Memory Xeon Phi-Accelerated Systems
ISSN: 1530-2075
Published: IEEE 01.05.2015
Published in Proceedings - IEEE International Parallel and Distributed Processing Symposium (01.05.2015)
“…This paper presents the first sparse direct solver for distributed memory systems comprising hybrid multicore CPU and Intel Xeon Phi co-processors. It builds…”
Conference Proceeding -
12
Communication Avoiding Algorithms: Analysis and Code Generation for Parallel Systems
ISSN: 1089-795X
Published: IEEE 01.10.2015
Published in 2015 International Conference on Parallel Architecture and Compilation (PACT) (01.10.2015)
“… The class of .5D communication-avoiding algorithms were developed to address this bottleneck…”
Conference Proceeding -
13
A massively parallel tensor contraction framework for coupled-cluster computations
ISSN: 0743-7315, 1096-0848
Published: Elsevier Inc 01.12.2014
Published in Journal of parallel and distributed computing (01.12.2014)
“…Precise calculation of molecular electronic wavefunctions by methods such as coupled-cluster requires the computation of tensor contractions, the cost of which…”
Journal Article -
14
A Class of Communication-avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines
ISSN: 1877-0509, 1877-0509
Published: Elsevier B.V 2012
Published in Procedia computer science (2012)
“…We study several solvers for the solution of general linear systems where the main objective is to reduce the communication overhead due to pivoting. We first…”
Journal Article -
15
Communication-Avoiding Algorithms for a High-Performance Hyperbolic PDE Engine
Published: ProQuest Dissertations & Theses 01.01.2020
“…The study of waves has always been an important subject of research. Earthquakes, for example, have a direct impact on the daily lives of millions of people…”
Dissertation -
16
Multiscale high-order/low-order (HOLO) algorithms and applications
ISSN: 0021-9991, 1090-2716
Published: Cambridge Elsevier Inc 01.02.2017
Published in Journal of computational physics (01.02.2017)
“…We review the state of the art in the formulation, implementation, and performance of so-called high-order/low-order (HOLO) algorithms for challenging…”
Journal Article -
17
Communication Avoiding Block Low-Rank Parallel Multifrontal Triangular Solve with Many Right-Hand Sides
ISSN: 0895-4798, 1095-7162
Published: Society for Industrial and Applied Mathematics 01.01.2024
Published in SIAM journal on matrix analysis and applications (01.01.2024)
“…Block low-rank (BLR) compression can significantly reduce the memory and time costs of parallel sparse direct solvers. In this paper, we investigate the…”
Journal Article -
18
Parallel Fast Multipole Method accelerated FFT on HPC clusters
ISSN: 0167-8191, 1872-7336
Published: Elsevier B.V 01.07.2021
Published in Parallel computing (01.07.2021)
“… In the past decade there has been a growing interest in communication-avoiding algorithms. The distributed memory Fast Fourier Transform is an important algorithm which suffers from major communication bottlenecks…”
Journal Article -
19
Communication avoiding algorithms
ISBN: 9781467362184, 1467362182
Published: IEEE 01.11.2012
Published in 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC) (01.11.2012)
“…This article consists of a collection of slides from the author's conference presentation. Some of the specific areas/topics discussed include: To redesign…”
Conference Proceeding -
20
Multiscale high-order/low-order (HOLO) algorithms and applications
ISSN: 0021-9991, 1090-2716
Published: United States Elsevier 11.11.2016
Published in Journal of computational physics (11.11.2016)
“…Here, we review the state of the art in the formulation, implementation, and performance of so-called high-order/low-order (HOLO) algorithms for challenging…”
Journal Article

