Search results - Shared memory programming on NUMA

1

ARS: an adaptive runtime system for locality optimization by Tao, Jie, Schulz, Martin, Karl, Wolfgang

ISSN: 0167-739X, 1872-7115

Published: Elsevier B.V., 01.07.2003

Published in Future generation computer systems (01.07.2003)
“… Shared memory programs running on Non-Uniform Memory Access (NUMA) machines usually face inherent performance problems stemming from excessive remote memory accesses …”

Journal Article

2

Shared memory NUMA programming on I-WAY by Nieplocha, J., Harrison, R.J.

ISBN: 0818675829, 9780818675829

ISSN: 1082-8907

Published: IEEE, 1996

Published in High-Performance Distributed Computing, 5th International Symposium On: HPDC-5 (1996)
“… The performance of the Global Array shared-memory non-uniform memory-access programming model is explored on the I-WAY, wide-area network distributed supercomputer environment …”

Conference Proceeding

3

Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures by Catalán, Sandra, Igual, Francisco D., Herrero, José R., Rodríguez-Sánchez, Rafael, Quintana-Ortí, Enrique S.

ISSN: 0743-7315, 1096-0848

Published: Elsevier Inc, 01.05.2023

Published in Journal of parallel and distributed computing (01.05.2023)
“… We propose a methodology to address the programmability issues derived from the emergence of new-generation shared-memory NUMA architectures …”

Journal Article

4

Memory Access Behavior Analysis of NUMA‐Based Shared Memory Programs by Tao, Jie, Karl, Wolfgang, Schulz, Martin

ISSN: 1058-9244, 1875-919X

Published: 01.01.2002

Published in Scientific programming (01.01.2002)
“… Shared memory applications running transparently on top of NUMA architectures often face severe performance problems due to bad data locality and excessive remote memory accesses …”

Journal Article

5

Scalable task parallelism for NUMA: A uniform abstraction for coordinated scheduling and memory management by Drebes, Andi, Pop, Antoniu, Heydemann, Karine, Cohen, Albert, Drach, Nathalie

Published: ACM, 01.09.2016

Published in 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (01.09.2016)
“… Dynamic task-parallel programming models are popular on shared-memory systems, promising enhanced scalability, load balancing and locality …”

Conference Proceeding

6

Unfair Scheduling Patterns in NUMA Architectures by Ben-David, Naama, Scully, Ziv, Blelloch, Guy E.

ISSN: 2641-7936

Published: IEEE, 01.09.2019

Published in Proceedings / International Conference on Parallel Architectures and Compilation Techniques (01.09.2019)
“… This begs the question: what concurrent scheduling models are realistic? This issue is complicated by the intricacies of modern hardware, such as cache coherence protocols and non-uniform memory access (NUMA …”

Conference Proceeding

7

Mitigating the NUMA effect on task-based runtime systems by Maroñas, Marcos, Navarro, Antoni, Ayguadé, Eduard, Beltran, Vicenç

ISSN: 0920-8542, 1573-0484

Published: New York, Springer US, 01.09.2023

Published in The Journal of supercomputing (01.09.2023)
“… However, due to hardware restrictions, they adopt a NUMA approach, where each processor accesses local memory faster than remote memories …”

Journal Article

8

NUMASFP: NUMA-Aware Dynamic Service Function Chain Placement in Multi-Core Servers by Chintapalli, Venkatarami Reddy, Korrapati, Sai Balaram, Tamma, Bheemarjuna Reddy, A, Antony Franklin

ISSN: 2155-2509

Published: IEEE, 04.01.2022

Published in International Conference on Communication Systems and Networks (Online) (04.01.2022)
“… However, sophisticated servers follow non-uniform memory access (NUMA) architecture in which CPU cores are distributed across different NUMA nodes to enhance scalability …”

Conference Proceeding

9

Performance prediction and evaluation of parallel processing on a NUMA multiprocessor by Zhang, X., Qin, X.

ISSN: 0098-5589, 1939-3520

Published: New York, NY, IEEE, 01.10.1991

Published in IEEE transactions on software engineering (01.10.1991)
“… , where network contention and memory contention are considered. Performance measurements to support the models and analyses through several numerical examples have been done on the BBN GP1000, a NUMA shared-memory multiprocessor …”

Journal Article

10

PNS Lock: A Portable NUMA-Aware Lock with a Standard Interface by Gandham, Brahmaiah, Alapati, Praveen

Published: IEEE, 21.09.2024

Published in 2024 International Symposium on Parallel Computing and Distributed Systems (PCDS) (21.09.2024)
“… In shared memory programming, there is a need for synchronization primitives that are suitable for modern hardware to achieve high performance and reduce contention …”

Conference Proceeding

11

Empirical Installation of Linear Algebra Shared-Memory Subroutines for Auto-Tuning by Cámara, Jesús, Cuenca, Javier, Giménez, Domingo, García, Luis Pedro, Vidal, Antonio M.

ISSN: 0885-7458, 1573-7640

Published: Boston, Springer US, 01.06.2014

Published in International journal of parallel programming (01.06.2014)
“… Medium NUMA and large cc-NUMA systems are used in the experiments. This variety of routines, libraries and systems allows us to obtain general conclusions about the methodology to use for linear algebra shared-memory routines auto-tuning …”

Journal Article

12

Some useful strategies for unstructured edge-based solvers on shared memory machines by Aubry, R., Houzeaux, G., Vázquez, M., Cela, J. M.

ISSN: 0029-5981, 1097-0207

Published: Chichester, UK, John Wiley & Sons, Ltd, 04.02.2011

Published in International journal for numerical methods in engineering (04.02.2011)
“… Three strategies for shared memory parallel edge‐based solvers are proposed which guarantee that nodes belonging to one thread are not accessed by other threads for vertex …”

Journal Article

13

NestedMP: Enabling cache-aware thread mapping for nested parallel shared memory applications by He, Jiangzhou, Chen, Wenguang, Tang, Zhizhong

ISSN: 0167-8191, 1872-7336

Published: Elsevier B.V., 01.01.2016

Published in Parallel computing (01.01.2016)
“… Task-core mapping schemas for nested-parallel applications may affect performance. NestedMP allows programmers to declare number of threads for parallel …”

Journal Article

14

Analyzing the execution of sparse matrix-vector product on the Finisterrae SMP-NUMA system by Pichel, Juan C., Lorenzo, Juan A., Heras, Dora B., Cabaleiro, Jose C., Pena, Tomás F.

ISSN: 0920-8542, 1573-0484

Published: Boston, Springer US, 01.11.2011

Published in The Journal of supercomputing (01.11.2011)
“… In this paper, the sparse matrix-vector product (SpMV) is evaluated on the FinisTerrae SMP-NUMA supercomputer …”

Journal Article, Conference Proceeding

15

An OpenMP Compiler for Efficient Use of Distributed Scratchpad Memory in MPSoCs by Marongiu, A., Benini, L.

ISSN: 0018-9340, 1557-9956

Published: New York, IEEE, 01.02.2012

Published in IEEE transactions on computers (01.02.2012)
“… To efficiently exploit the advantages of low-latency high-bandwidth memory modules in the hierarchy, there is the need for programming models and/or language features that expose such architectural details …”

Journal Article

16

Online scalability characterization of data-parallel programs on many cores by Cho, Younghyun, Oh, Surim, Egger, Bernhard

Published: ACM, 01.09.2016

Published in 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (01.09.2016)
“… Reflecting the architecture of NUMA systems, contention is modeled at the last-level caches of the compute nodes and the memory nodes using a two-level queuing model to estimate the mean service time …”

Conference Proceeding

17

An evaluation of MPI and OpenMP paradigms in finite‐difference explicit methods for PDEs on shared‐memory multi‐ and manycore systems by Cabral, Frederico L., Gonzaga de Oliveira, Sanderson L., Osthoff, Carla, Costa, Gabriel P., Brandão, Diego N., Kischinhevsky, Mauricio

ISSN: 1532-0626, 1532-0634

Published: Hoboken, Wiley Subscription Services, Inc, 25.10.2020

Published in Concurrency and computation (25.10.2020)
“… ® Scalable Processor and the coprocessor Knights Landing. In this study, the performance of a hybrid parallel programming with message passing interface (MPI) and Open Multi‐Processing (OpenMP …”

Journal Article

18

Zippy: A Framework for Computation and Visualization on a GPU Cluster by Fan, Zhe, Qiu, Feng, Kaufman, Arie E.

ISSN: 0167-7055, 1467-8659

Published: Oxford, UK, Blackwell Publishing Ltd, 01.04.2008

Published in Computer graphics forum (01.04.2008)
“… ‐level parallelism hierarchy and a non‐uniform memory access (NUMA) model. Zippy preserves the advantages of both message passing and shared‐memory models …”

Journal Article

19

REPLICA MBTAC: multithreaded dual-mode processor by Forsell, Martti, Roivainen, Jussi, Leppänen, Ville

ISSN: 0920-8542, 1573-0484

Published: New York, Springer US, 01.05.2018

Published in The Journal of supercomputing (01.05.2018)
“… These include support for cost-efficient machine instruction-level synchronization and uniform shared global memory for enabling easy-to-program memory allocation of data structures and data movement …”

Journal Article

20

Decentralized lock-free distributed queue in MPI remote memory access model by Paznikov, Alexey A., Burachenko, Alexander V., Abuelsoud, Mohamed M.

ISSN: 2267-1242, 2555-0403

Published: Les Ulis, EDP Sciences, 01.01.2024

Published in E3S web of conferences (01.01.2024)
“… (concurrent, distributed) data structures. In shared-memory machines (such as SMP/NUMA systems …”

Journal Article, Conference Proceeding
