Search Results - "communication avoiding algorithms"
1
A massively parallel tensor contraction framework for coupled-cluster computations
ISSN: 0743-7315, 1096-0848
Published: Elsevier Inc 01.12.2014
Published in Journal of parallel and distributed computing (01.12.2014)
“…Precise calculation of molecular electronic wavefunctions by methods such as coupled-cluster requires the computation of tensor contractions, the cost of which…”
Journal Article -
2
Multiscale high-order/low-order (HOLO) algorithms and applications
ISSN: 0021-9991, 1090-2716
Published: Cambridge Elsevier Inc 01.02.2017
Published in Journal of computational physics (01.02.2017)
“…We review the state of the art in the formulation, implementation, and performance of so-called high-order/low-order (HOLO) algorithms for challenging…”
Journal Article -
3
Communication Avoiding Block Low-Rank Parallel Multifrontal Triangular Solve with Many Right-Hand Sides
ISSN: 0895-4798, 1095-7162
Published: Society for Industrial and Applied Mathematics 01.01.2024
Published in SIAM journal on matrix analysis and applications (01.01.2024)
“…Block low-rank (BLR) compression can significantly reduce the memory and time costs of parallel sparse direct solvers. In this paper, we investigate the…”
Journal Article -
4
Parallel Fast Multipole Method accelerated FFT on HPC clusters
ISSN: 0167-8191, 1872-7336
Published: Elsevier B.V 01.07.2021
Published in Parallel computing (01.07.2021)
“…With increasing sizes of distributed systems, there comes an increased risk of communication bottlenecks. In the past decade there has been a growing interest…”
Journal Article -
5
Multiscale high-order/low-order (HOLO) algorithms and applications
ISSN: 0021-9991, 1090-2716
Published: United States Elsevier 11.11.2016
Published in Journal of computational physics (11.11.2016)
“…Here, we review the state of the art in the formulation, implementation, and performance of so-called high-order/low-order (HOLO) algorithms for challenging…”
Journal Article -
6
Communication-Avoiding Recursive Aggregation
ISSN: 2168-9253
Published: IEEE 31.10.2023
Published in Proceedings / IEEE International Conference on Cluster Computing (31.10.2023)
“…Recursive aggregation has been of considerable interest due to its unifying a wide range of deductive-analytic workloads, including social-media mining and…”
Conference Proceeding -
7
Translational process: Mathematical software perspective
ISSN: 1877-7503, 1877-7511
Published: Netherlands Elsevier B.V 01.05.2021
Published in Journal of computational science (01.05.2021)
“…Each successive generation of computer architecture has brought new challenges to achieving high performance mathematical solvers, necessitating development…”
Journal Article -
8
Accelerating solutions of one-dimensional unsteady PDEs with GPU-based swept time–space decomposition
ISSN: 0021-9991, 1090-2716
Published: Cambridge Elsevier Inc 15.03.2018
Published in Journal of computational physics (15.03.2018)
“…• A GPU implementation of the swept time–space decomposition rule is presented. • Three versions of the scheme are considered. • The shared-memory implementation…”
Journal Article -
9
Reconstructing Householder vectors from Tall-Skinny QR
ISSN: 0743-7315, 1096-0848
Published: United States Elsevier Inc 01.11.2015
Published in Journal of parallel and distributed computing (01.11.2015)
“…The Tall-Skinny QR (TSQR) algorithm is more communication efficient than the standard Householder algorithm for QR decomposition of matrices with many more…”
Journal Article -
10
Applying the swept rule for solving explicit partial differential equations on heterogeneous computing systems
ISSN: 0920-8542, 1573-0484
Published: New York Springer US 01.02.2021
Published in The Journal of supercomputing (01.02.2021)
“…Applications that exploit the architectural details of high-performance computing (HPC) systems have become increasingly invaluable in academia and industry…”
Journal Article -
11
Reducing Communication in Graph Neural Network Training
ISSN: 2167-4329
Published: United States IEEE 01.11.2020
Published in International Conference for High Performance Computing, Networking, Storage and Analysis (Online) (01.11.2020)
“…Graph Neural Networks (GNNs) are powerful and flexible neural networks that use the naturally sparse connectivity information of the data. GNNs represent this…”
Conference Proceeding, Journal Article -
12
Communication-Avoiding Symmetric-Indefinite Factorization
ISSN: 0895-4798, 1095-7162
Published: United States SIAM 01.01.2014
Published in SIAM journal on matrix analysis and applications (01.01.2014)
“…We describe and analyze a novel symmetric triangular factorization algorithm. The algorithm is essentially a block version of Aasen's triangular…”
Journal Article -
13
A Class of Communication-avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines
ISSN: 1877-0509, 1877-0509
Published: Elsevier B.V 2012
Published in Procedia computer science (2012)
“…We study several solvers for the solution of general linear systems where the main objective is to reduce the communication overhead due to pivoting. We first…”
Journal Article -
14
Cyclops Tensor Framework: Reducing Communication and Eliminating Load Imbalance in Massively Parallel Contractions
ISBN: 146736066X, 9781467360661
ISSN: 1530-2075
Published: IEEE 01.05.2013
Published in 2013 IEEE 27th International Symposium on Parallel and Distributed Processing (01.05.2013)
“…Cyclops (cyclic-operations) Tensor Framework (CTF) is a distributed library for tensor contractions. CTF aims to scale high-dimensional tensor contractions…”
Conference Proceeding -
15
Distributed-Memory Sparse Kernels for Machine Learning
ISSN: 1530-2075
Published: IEEE 01.05.2022
Published in Proceedings - IEEE International Parallel and Distributed Processing Symposium (01.05.2022)
“…Sampled Dense Times Dense Matrix Multiplication (SDDMM) and Sparse Times Dense Matrix Multiplication (SpMM) appear in diverse settings, such as collaborative…”
Conference Proceeding -
16
Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication
ISSN: 1530-2075
Published: IEEE 01.05.2016
Published in Proceedings - IEEE International Parallel and Distributed Processing Symposium (01.05.2016)
“…Multiplication of a sparse matrix with a dense matrix is a building block of an increasing number of applications in many areas such as machine learning and…”
Conference Proceeding -
17
Event-Triggered Communication in Parallel Computing
Published: IEEE 01.11.2018
Published in 2018 IEEE/ACM 9th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (scalA) (01.11.2018)
“…Communication overhead in parallel systems can be a significant bottleneck in scaling up parallel computation. In this paper, we propose event-triggered…”
Conference Proceeding -
18
Write-Avoiding Algorithms
ISSN: 1530-2075
Published: IEEE 01.05.2016
Published in Proceedings - IEEE International Parallel and Distributed Processing Symposium (01.05.2016)
“…Communication, i.e., moving data between levels of a memory hierarchy or between processors over a network, is much more expensive (in time or energy) than…”
Conference Proceeding -
19
Minimizing Communication in All-Pairs Shortest Paths
ISBN: 146736066X, 9781467360661
ISSN: 1530-2075
Published: IEEE 01.05.2013
Published in 2013 IEEE 27th International Symposium on Parallel and Distributed Processing (01.05.2013)
“…We consider distributed memory algorithms for the all-pairs shortest paths (APSP) problem. Scaling the APSP problem to high concurrencies requires both…”
Conference Proceeding -
20
Recent Developments in Iterative Methods for Reducing Synchronization
ISSN: 2473-3636
Published: IEEE 01.11.2019
Published in International Symposium on Distributed Computing and Applications to Business, Engineering & Science (Online) (01.11.2019)
“…On modern parallel architectures, the cost of synchronization among processors can often dominate the cost of floating-point computation. Several modifications…”
Conference Proceeding

