Search results - distributed-memory parallelism
1
3D DFT by block tensor-matrix multiplication via a modified Cannon's algorithm: Implementation and scaling on distributed-memory clusters with fat tree networks
ISSN: 0743-7315
Published: Elsevier Inc, 01.11.2024
Published in: Journal of parallel and distributed computing (01.11.2024)
“… A known scalability bottleneck of the parallel 3D FFT is its use of all-to-all communications. Here, we present S3DFT, a library that circumvents this by using …”
Full text
Journal Article
2
A framework for exploiting task and data parallelism on distributed memory multicomputers
ISSN: 1045-9219
Published: IEEE, 01.11.1997
Published in: IEEE transactions on parallel and distributed systems (01.11.1997)
“… compiler and run-time support for distributed memory machines. In this paper, we explore a new compiler optimization for regular scientific applications-the simultaneous exploitation of task and data parallelism …”
Full text
Journal Article
3
Axially-deformed solution of the Skyrme-Hartree-Fock-Bogoliubov equations using the transformed harmonic oscillator basis (IV) hfbtho (v4.0): A new version of the program
ISSN: 0010-4655, 1879-2944
Published: United States: Elsevier B.V, 01.07.2022
Published in: Computer physics communications (01.07.2022)
“… We describe the new version 4.0 of the code hfbtho that solves the nuclear Hartree-Fock-Bogoliubov problem by using the deformed harmonic oscillator basis in …”
Full text
Journal Article
4
superB/NRPy: scalable, task-based numerical relativity for 3G gravitational wave science
ISSN: 0264-9381, 1361-6382
Published: IOP Publishing, 01.08.2025
Published in: Classical and quantum gravity (01.08.2025)
Full text
Journal Article
5
Iterators, Schedulers, and Distributed-memory Parallelism
ISSN: 0038-0644, 1097-024X
Published: New York: John Wiley & Sons, Ltd, 01.04.1996
Published in: Software, practice & experience (01.04.1996)
“… ’ for sequential and parallel query evaluation. Unfortunately, those earlier models have a severe drawback with respect to resource allocation in distributed‐memory systems …”
Full text
Journal Article
6
Massively parallel implementation and approaches to simulate quantum dynamics using Krylov subspace techniques
ISSN: 0010-4655, 1879-2944
Published: Elsevier B.V, 01.02.2019
Published in: Computer physics communications (01.02.2019)
“… We have developed an application and implemented parallel algorithms in order to provide a computational framework suitable for massively parallel …”
Full text
Journal Article
7
Leveraging HPC accelerator architectures with modern techniques — hydrologic modeling on GPUs with ParFlow
ISSN: 1420-0597, 1573-1499
Published: Cham: Springer International Publishing, 01.10.2021
Published in: Computational geosciences (01.10.2021)
“… Rapidly changing heterogeneous supercomputer architectures pose a great challenge to many scientific communities trying to leverage the latest technology in …”
Full text
Journal Article
8
MPI+X: task-based parallelisation and dynamic load balance of finite element assembly
ISSN: 1061-8562, 1029-0257
Published: Abingdon: Taylor & Francis, 16.03.2019
Published in: International journal of computational fluid dynamics (16.03.2019)
“… of the MPI partitions to compute element matrices and vectors and then of their assemblies. In a MPI+X hybrid parallelism context, X has consisted traditionally of loop …”
Full text
Journal Article
9
Parallelization of a distributed ecohydrological model
ISSN: 1364-8152, 1873-6726
Published: Oxford: Elsevier Ltd, 01.03.2018
Published in: Environmental modelling & software : with environment data news (01.03.2018)
“… High resolution simulations at a large scale are therefore computationally expensive and cause a run-time memory burden. Using distributed (MPI) and shared (OpenMP …”
Full text
Journal Article
10
A scalable scheduling scheme for functional parallelism on distributed memory multiprocessor systems
ISSN: 1045-9219
Published: Los Alamitos, CA: IEEE, 01.04.1995
Published in: IEEE transactions on parallel and distributed systems (01.04.1995)
“… and partially at run time. Assuming infinite number of processors, the compile time schedule is found using a new concept of the threshold of a task that quantifies a trade-off between the schedule-length and the degree of parallelism …”
Full text
Journal Article
11
A shared compilation stack for distributed-memory parallelism in stencil DSLs
ISSN: 2331-8422
Published: Ithaca: Cornell University Library, arXiv.org, 02.04.2024
Published in: arXiv.org (02.04.2024)
“… Domain Specific Languages (DSLs) increase programmer productivity and provide high performance. Their targeted abstractions allow scientists to express …”
Full text
Paper
12
A Robust Compile Time Method for Scheduling Task Parallelism on Distributed Memory Machines
ISSN: 0920-8542, 1573-0484
Published: 01.10.1998
Published in: The Journal of supercomputing (01.10.1998)
“… A compile time scheduling algorithm for a variable number of available processors is introduced and the impact of the change of computation and communication …”
Full text
Journal Article
13
On the Test Particle Monte-Carlo method to solve the steady state Boltzmann equation, the congruity of its results with experiments and its potential for shared memory parallelism
ISSN: 0021-9991, 1090-2716
Published: Cambridge: Elsevier Inc, 01.11.2021
Published in: Journal of computational physics (01.11.2021)
“… The Test Particle Monte Carlo is a known method to solve the steady state Boltzmann particle transport equation in rarefied gas systems. A description of the …”
Full text
Journal Article
14
High-Performance Sorting-Based k-mer Counting in Distributed Memory with Flexible Hybrid Parallelism
ISSN: 2331-8422
Published: Ithaca: Cornell University Library, arXiv.org, 10.07.2024
Published in: arXiv.org (10.07.2024)
“… Due to the growing volume of data, the scaling of the counting process is critical. In the literature, distributed memory software uses hash tables, which exhibit poor cache friendliness and consume excessive memory …”
Full text
Paper
15
CAPTURE: Memory-Centric Partitioning for Distributed DNN Training with Hybrid Parallelism
ISSN: 2640-0316
Published: IEEE, 18.12.2023
Published in: Proceedings - International Conference on High Performance Computing (18.12.2023)
“… Hybrid-parallel training approaches have emerged that combine pipelining with data and tensor parallelism to facilitate the training of large DL models on distributed hardware setups …”
Full text
Conference Proceeding
16
A study of shared-memory parallelism in a multifrontal solver
ISSN: 0167-8191, 1872-7336
Published: Elsevier B.V, 01.03.2014
Published in: Parallel computing (01.03.2014)
“… We introduce shared-memory parallelism in a parallel distributed-memory solver, targeting multi-core architectures …”
Full text
Journal Article
17
A robust compile time method for scheduling task parallelism on distributed memory machines
ISBN: 9780818676338, 0818676337
ISSN: 1089-795X
Published: IEEE, 1996
Published in: Proceedings of the 1996 Conference on Parallel Architectures and Compilation Technique (1996)
“… A desirable property of a compile time scheduling algorithm is robustness against the variations in the computation and communication costs so that the run …”
Full text
Conference Proceeding
18
Reservoir Echo State Network for Classification of Multivariate Time Series
ISSN: 2770-0135
Published: IEEE, 18.12.2023
Published in: Proceedings (IEEE International Conference on High Performance Computing Workshops) (18.12.2023)
“… It leverages both CPU-shared memory and parallel distributed memory architecture to efficiently capture reservoir state's optimal model space representation, addressing computational challenges in MTS analysis …”
Full text
Conference Proceeding
19
Automated MPI-X Code Generation for Scalable Finite-Difference Solvers
ISSN: 1530-2075
Published: IEEE, 03.06.2025
Published in: Proceedings - IEEE International Parallel and Distributed Processing Symposium (03.06.2025)
“… This paper introduces automated codegeneration techniques specifically tailored for distributed memory parallelism (DMP …”
Full text
Conference Proceeding
20
Scalable Adaptive PDE Solvers in Arbitrary Domains
ISSN: 2167-4337
Published: ACM, 14.11.2021
Published in: SC21: International Conference for High Performance Computing, Networking, Storage and Analysis (14.11.2021)
“… Efficiently and accurately simulating partial differential equations (PDEs) in and around arbitrarily defined geometries, especially with high levels of …”
Full text
Conference Proceeding

