Search Results - "communication-avoiding algorithms"

  1.

    A massively parallel tensor contraction framework for coupled-cluster computations by Solomonik, Edgar, Matthews, Devin, Hammond, Jeff R., Stanton, John F., Demmel, James

    ISSN: 0743-7315, 1096-0848
    Published: Elsevier Inc 01.12.2014
    “…Precise calculation of molecular electronic wavefunctions by methods such as coupled-cluster requires the computation of tensor contractions, the cost of which…”
    Journal Article
  2.

    Multiscale high-order/low-order (HOLO) algorithms and applications by Chacón, L., Chen, G., Knoll, D.A., Newman, C., Park, H., Taitano, W., Willert, J.A., Womeldorff, G.

    ISSN: 0021-9991, 1090-2716
    Published: Cambridge Elsevier Inc 01.02.2017
    Published in Journal of computational physics (01.02.2017)
    “…We review the state of the art in the formulation, implementation, and performance of so-called high-order/low-order (HOLO) algorithms for challenging…”
    Journal Article
  3.

    Communication Avoiding Block Low-Rank Parallel Multifrontal Triangular Solve with Many Right-Hand Sides by Amestoy, Patrick, Boiteau, Olivier, Buttari, Alfredo, Gerest, Matthieu, Jézéquel, Fabienne, L’Excellent, Jean-Yves, Mary, Theo

    ISSN: 0895-4798, 1095-7162
    Published: Society for Industrial and Applied Mathematics 01.01.2024
    “…Block low-rank (BLR) compression can significantly reduce the memory and time costs of parallel sparse direct solvers. In this paper, we investigate the…”
    Journal Article
  4.

    Parallel Fast Multipole Method accelerated FFT on HPC clusters by Mehta, Chahak, Karthi, Amarnath, Jetly, Vishrut, Chaudhury, Bhaskar

    ISSN: 0167-8191, 1872-7336
    Published: Elsevier B.V 01.07.2021
    Published in Parallel computing (01.07.2021)
    “…With increasing sizes of distributed systems, there comes an increased risk of communication bottlenecks. In the past decade there has been a growing interest…”
    Journal Article
  5.

    Multiscale high-order/low-order (HOLO) algorithms and applications by Chacon, Luis, Chen, Guangye, Knoll, Dana Alan, Newman, Christopher Kyle, Park, HyeongKae, Taitano, William, Willert, Jeff A., Womeldorff, Geoffrey Alan

    ISSN: 0021-9991, 1090-2716
    Published: United States Elsevier 11.11.2016
    Published in Journal of computational physics (11.11.2016)
    “…Here, we review the state of the art in the formulation, implementation, and performance of so-called high-order/low-order (HOLO) algorithms for challenging…”
    Journal Article
  6.

    Communication-Avoiding Recursive Aggregation by Sun, Yihao, Kumar, Sidharth, Gilray, Thomas, Micinski, Kristopher

    ISSN: 2168-9253
    Published: IEEE 31.10.2023
    “…Recursive aggregation has been of considerable interest due to its unifying a wide range of deductive-analytic workloads, including social-media mining and…”
    Conference Proceeding
  7.

    Translational process: Mathematical software perspective by Dongarra, Jack, Gates, Mark, Luszczek, Piotr, Tomov, Stanimire

    ISSN: 1877-7503, 1877-7511
    Published: Netherlands Elsevier B.V 01.05.2021
    Published in Journal of computational science (01.05.2021)
    “…Each successive generation of computer architecture has brought new challenges to achieving high performance mathematical solvers, necessitating development…”
    Journal Article
  8.

    Accelerating solutions of one-dimensional unsteady PDEs with GPU-based swept time–space decomposition by Magee, Daniel J., Niemeyer, Kyle E.

    ISSN: 0021-9991, 1090-2716
    Published: Cambridge Elsevier Inc 15.03.2018
    Published in Journal of computational physics (15.03.2018)
    “…• A GPU implementation of the swept time–space decomposition rule is presented. • Three versions of the scheme are considered. • The shared-memory implementation…”
    Journal Article
  9.

    Reconstructing Householder vectors from Tall-Skinny QR by Ballard, G., Demmel, J., Grigori, L., Jacquelin, M., Knight, N., Nguyen, H.D.

    ISSN: 0743-7315, 1096-0848
    Published: United States Elsevier Inc 01.11.2015
    “…The Tall-Skinny QR (TSQR) algorithm is more communication efficient than the standard Householder algorithm for QR decomposition of matrices with many more…”
    Journal Article
  10.

    Applying the swept rule for solving explicit partial differential equations on heterogeneous computing systems by Magee, Daniel J., Walker, Anthony S., Niemeyer, Kyle E.

    ISSN: 0920-8542, 1573-0484
    Published: New York Springer US 01.02.2021
    Published in The Journal of supercomputing (01.02.2021)
    “…Applications that exploit the architectural details of high-performance computing (HPC) systems have become increasingly invaluable in academia and industry…”
    Journal Article
  11.

    Reducing Communication in Graph Neural Network Training by Tripathy, Alok, Yelick, Katherine, Buluc, Aydin

    ISSN: 2167-4329
    Published: United States IEEE 01.11.2020
    “…Graph Neural Networks (GNNs) are powerful and flexible neural networks that use the naturally sparse connectivity information of the data. GNNs represent this…”
    Conference Proceeding Journal Article
  12.

    Communication-Avoiding Symmetric-Indefinite Factorization by Ballard, Grey, Becker, Dulceneia, Demmel, James, Dongarra, Jack, Druinsky, Alex, Peled, Inon, Schwartz, Oded, Toledo, Sivan, Yamazaki, Ichitaro

    ISSN: 0895-4798, 1095-7162
    Published: United States SIAM 01.01.2014
    “…We describe and analyze a novel symmetric triangular factorization algorithm. The algorithm is essentially a block version of Aasen's triangular…”
    Journal Article
  13.

    A Class of Communication-avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines by Baboulin, Marc, Donfack, Simplice, Dongarra, Jack, Grigori, Laura, Rémy, Adrien, Tomov, Stanimire

    ISSN: 1877-0509, 1877-0509
    Published: Elsevier B.V 2012
    Published in Procedia computer science (2012)
    “…We study several solvers for the solution of general linear systems where the main objective is to reduce the communication overhead due to pivoting. We first…”
    Journal Article
  14.

    Cyclops Tensor Framework: Reducing Communication and Eliminating Load Imbalance in Massively Parallel Contractions by Solomonik, Edgar, Matthews, Devin, Hammond, Jeff R., Demmel, James

    ISBN: 146736066X, 9781467360661
    ISSN: 1530-2075
    Published: IEEE 01.05.2013
    “…Cyclops (cyclic-operations) Tensor Framework (CTF) is a distributed library for tensor contractions. CTF aims to scale high-dimensional tensor contractions…”
    Conference Proceeding
  15.

    Distributed-Memory Sparse Kernels for Machine Learning by Bharadwaj, Vivek, Buluc, Aydin, Demmel, James

    ISSN: 1530-2075
    Published: IEEE 01.05.2022
    “…Sampled Dense Times Dense Matrix Multiplication (SDDMM) and Sparse Times Dense Matrix Multiplication (SpMM) appear in diverse settings, such as collaborative…”
    Conference Proceeding
  16.

    Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication by Koanantakool, Penporn, Azad, Ariful, Buluc, Aydin, Morozov, Dmitriy, Sang-Yun Oh, Oliker, Leonid, Yelick, Katherine

    ISSN: 1530-2075
    Published: IEEE 01.05.2016
    “…Multiplication of a sparse matrix with a dense matrix is a building block of an increasing number of applications in many areas such as machine learning and…”
    Conference Proceeding
  17.

    Event-Triggered Communication in Parallel Computing by Ghosh, Soumyadip, Saha, Kamal K., Gupta, Vijay, Tryggvason, Gretar

    Published: IEEE 01.11.2018
    “…Communication overhead in parallel systems can be a significant bottleneck in scaling up parallel computation. In this paper, we propose event-triggered…”
    Conference Proceeding
  18.

    Write-Avoiding Algorithms by Carson, Erin, Demmel, James, Grigori, Laura, Knight, Nicholas, Koanantakool, Penporn, Schwartz, Oded, Simhadri, Harsha Vardhan

    ISSN: 1530-2075
    Published: IEEE 01.05.2016
    “…Communication, i.e., moving data between levels of a memory hierarchy or between processors over a network, is much more expensive (in time or energy) than…”
    Conference Proceeding
  19.

    Minimizing Communication in All-Pairs Shortest Paths by Solomonik, Edgar, Buluç, Aydın, Demmel, James

    ISBN: 146736066X, 9781467360661
    ISSN: 1530-2075
    Published: IEEE 01.05.2013
    “…We consider distributed memory algorithms for the all-pairs shortest paths (APSP) problem. Scaling the APSP problem to high concurrencies requires both…”
    Conference Proceeding
  20.

    Recent Developments in Iterative Methods for Reducing Synchronization by Zou, Qinmeng, Magoules, Frederic

    ISSN: 2473-3636
    Published: IEEE 01.11.2019
    “…On modern parallel architectures, the cost of synchronization among processors can often dominate the cost of floating-point computation. Several modifications…”
    Conference Proceeding