Search results - Computing methodologies > Distributed computing methodologies > Distributed programming languages

  1.

    MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems by Hsia, Samuel, Golden, Alicia, Acun, Bilge, Ardalani, Newsha, DeVito, Zachary, Wei, Gu-Yeon, Brooks, David, Wu, Carole-Jean

    Published: IEEE 29.06.2024
    “… Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high operational costs …”
    Full text
    Conference proceedings
  2.

    Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs by Wang, Pengyu, Li, Chao, Wang, Jing, Wang, Taolei, Zhang, Lu, Leng, Jingwen, Chen, Quan, Guo, Minyi

    Published: IEEE 01.09.2021
    “… Graph sampling and random walk operations, capturing the structural properties of graphs, are playing an important role today as we cannot directly adopt computing-intensive algorithms on large-scale graphs …”
    Full text
    Conference proceedings
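
Result 2's title names the alias method as the basis of its GPU graph sampling. For reference, a minimal single-process sketch of Walker's alias method over one vertex's edge weights (plain Python/NumPy; the paper's GPU-parallel data structures are not reproduced here) might look like this:

```python
import numpy as np

def build_alias_table(weights):
    """Walker's alias method: O(n) setup, O(1) per weighted sample."""
    w = np.asarray(weights, dtype=np.float64)
    n = len(w)
    prob = w * n / w.sum()            # scale so the average bucket mass is 1
    alias = np.zeros(n, dtype=np.int64)
    small = [i for i in range(n) if prob[i] < 1.0]
    large = [i for i in range(n) if prob[i] >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        alias[s] = l                   # bucket s tops up its deficit from l
        prob[l] -= 1.0 - prob[s]
        (small if prob[l] < 1.0 else large).append(l)
    return prob, alias

def sample(prob, alias, rng):
    i = rng.integers(len(prob))        # pick a bucket uniformly
    return i if rng.random() < prob[i] else alias[i]

# Draw weighted neighbors of one vertex, e.g. one step of a biased random walk.
rng = np.random.default_rng(0)
prob, alias = build_alias_table([0.1, 0.4, 0.2, 0.3])
steps = [sample(prob, alias, rng) for _ in range(5)]
```
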
  3.

    Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics by Dathathri, Roshan, Gill, Gurbinder, Hoang, Loc, Jatala, Vishwesh, Pingali, Keshav, Nandivada, V. Krishna, Dang, Hoang-Vu, Snir, Marc

    ISSN: 2641-7936
    Published: IEEE 01.09.2019
    “… Distributed graph analytics systems for CPUs, like D-Galois and Gemini, and for GPUs, like D-IrGL and Lux, use a bulk-synchronous parallel (BSP …”
    Full text
    Conference proceedings
  4.

    AdaGL: Adaptive Learning for Agile Distributed Training of Gigantic GNNs by Zhang, Ruisi, Javaheripi, Mojan, Ghodsi, Zahra, Bleiweiss, Amit, Koushanfar, Farinaz

    Published: IEEE 09.07.2023
    “… Distributed GNN training on contemporary massive and densely connected graphs requires information aggregation from all neighboring nodes, which leads to an explosion of inter-server communications …”
    Full text
    Conference proceedings
  5.

    Auto-parallelizing stateful distributed streaming applications by Schneider, Scott, Hirzel, Martin, Gedik, Bugra, Wu, Kun-Lung

    Published: ACM 01.09.2012
    “… They are comprised of operator graphs that produce and consume data tuples. The streaming programming model naturally exposes task and pipeline parallelism, enabling it to exploit parallel systems of all kinds, including large clusters …”
    Full text
    Conference proceedings
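
Result 5's excerpt describes streaming programs as operator graphs that produce and consume tuples, with task and pipeline parallelism exposed naturally. A toy two-stage pipeline, with threads and queues standing in for distributed operators (all names illustrative), sketches that structure:

```python
import threading
import queue

SENTINEL = object()

def source(out_q):
    # Produce tuples; in a streaming system this would be an ingest operator.
    for i in range(10):
        out_q.put(("reading", i))
    out_q.put(SENTINEL)

def transform(in_q, out_q):
    # Runs concurrently with the source: this overlap is pipeline parallelism.
    while (item := in_q.get()) is not SENTINEL:
        tag, value = item
        out_q.put((tag, value * value))
    out_q.put(SENTINEL)

q1, q2 = queue.Queue(), queue.Queue()
threads = [threading.Thread(target=source, args=(q1,)),
           threading.Thread(target=transform, args=(q1, q2))]
for t in threads:
    t.start()

results = []
while (item := q2.get()) is not SENTINEL:   # sink operator
    results.append(item)
for t in threads:
    t.join()
```
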
  6.

    Optimizing Distributed ML Communication with Fused Computation-Collective Operations by Punniyamurthy, Kishore, Hamidouche, Khaled, Beckmann, Bradford M.

    Published: IEEE 17.11.2024
    “… Machine learning models are distributed across multiple nodes using numerous parallelism strategies …”
    Full text
    Conference proceedings
  7.

    Legate Sparse: Distributed Sparse Computing in Python by Yadav, Rohan, Lee, Wonchan, Elibol, Melih, Patti, Taylor Lee, Papadakis, Manolis, Garland, Michael, Aiken, Alex, Kjolstad, Fredrik, Bauer, Michael

    ISSN: 2167-4337
    Published: ACM 11.11.2023
    “… The standard implementation of SciPy is restricted to a single CPU and cannot take advantage of modern distributed and accelerated computing resources …”
    Full text
    Conference proceedings
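
Result 7's excerpt notes that stock SciPy runs on a single CPU; Legate Sparse targets exactly this kind of program. Below is a small scipy.sparse sketch of the style of code such a drop-in layer aims to distribute (the Legate import path is not given in this listing, so plain SciPy is shown):

```python
import numpy as np
import scipy.sparse as sparse
# A Legate-style drop-in aims to let code like this run unchanged on a
# distributed, accelerated machine by swapping the sparse module import
# (module name assumed, not taken from this listing).

# Build a 1-D Poisson-like tridiagonal operator and run a few Jacobi sweeps.
n = 1000
A = sparse.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x = np.zeros(n)
d = A.diagonal()
for _ in range(50):
    x = x + (b - A @ x) / d   # the SpMV is what a distributed backend would parallelize
```
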
  8.

    Introduction to Parallel and Distributed Programming using N-Body Simulations by Van Craen, Alexander, Breyer, Marcel, Pfluger, Dirk

    Published: IEEE 17.11.2024
    “… This paper describes how we use n-body simulations as an interesting and visually compelling way to teach efficient, parallel, and distributed programming …”
    Full text
    Conference proceedings
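
Result 8 builds its course around n-body simulations. For orientation, the serial direct-summation kernel that parallel and distributed versions then partition can be sketched in NumPy as follows (parameter values are illustrative):

```python
import numpy as np

def nbody_step(pos, vel, mass, dt=1e-3, eps=1e-3):
    """One O(n^2) direct-summation step of a gravitational n-body system.
    Parallel versions partition the pairwise force sum across cores or ranks."""
    diff = pos[np.newaxis, :, :] - pos[:, np.newaxis, :]   # (n, n, 3) displacements
    dist3 = (np.sum(diff**2, axis=-1) + eps**2) ** 1.5     # softened |r|^3
    acc = np.sum(mass[np.newaxis, :, np.newaxis] * diff / dist3[:, :, np.newaxis], axis=1)
    vel = vel + dt * acc
    return pos + dt * vel, vel

rng = np.random.default_rng(0)
n = 256
pos, vel = rng.normal(size=(n, 3)), np.zeros((n, 3))
mass = np.full(n, 1.0 / n)
for _ in range(10):
    pos, vel = nbody_step(pos, vel, mass)
```
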
  9.

    Legate NumPy: Accelerated and Distributed Array Computing by Bauer, Michael, Garland, Michael

    ISSN: 2167-4337
    Published: ACM 17.11.2019
    “… Legate works by translating NumPy programs to the Legion programming model and then leverages the scalability of the Legion runtime system to distribute data and computations across an arbitrary sized machine. Compared to similar programs written in the distributed Dask array library in Python, Legate achieves speed-ups of up to 10X on 1280 CPUs and 100X on 256 GPUs …”
    Full text
    Conference proceedings
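
Result 9's excerpt states that Legate translates ordinary NumPy programs to the Legion programming model. The kind of untouched array code this targets is plain NumPy such as the stencil below; per the excerpt, Legate distributes programs of this shape across the machine (the drop-in import name is not quoted in this listing, so standard NumPy is shown):

```python
import numpy as np
# Per the excerpt, array code like this is what Legate translates into
# Legion tasks and partitions across an arbitrarily sized machine; with
# Legate one would swap this import for its NumPy-compatible module.

def jacobi_step(grid):
    # 4-neighbor stencil average over the interior of a 2-D grid.
    out = grid.copy()
    out[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                              grid[1:-1, :-2] + grid[1:-1, 2:])
    return out

grid = np.zeros((1024, 1024))
grid[0, :] = 1.0                  # fixed boundary condition on one edge
for _ in range(100):
    grid = jacobi_step(grid)
```
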
  10.

    Managing Workflow Malleability in Urgent Computing for Earthquake Alerts by Ejarque, Jorge, Monterrubio-Velasco, Marisol, Bhihe, Cedric, Pienkowska, Marta, De La Puente, Josep, Badia, Rosa M.

    Published: IEEE 17.11.2024
    “… UCIS4EQ is an urgent computing platform that estimates ground shaking based on high-performance parallel 3D simulations …”
    Full text
    Conference proceedings
  11.

    Enabling Low-Overhead HT-HPC Workflows at Extreme Scale using GNU Parallel by Maheshwari, Ketan, Arndt, William, Karimi, Ahmad Maroof, Yin, Junqi, Suter, Frederic, Johnson, Seth, Da Silva, Rafael Ferreira

    Published: IEEE 17.11.2024
    “… GNU Parallel is a versatile and powerful tool for process parallelization widely used in scientific computing …”
    Full text
    Conference proceedings
  12.

    A Sparsity-Aware Distributed-Memory Algorithm for Sparse-Sparse Matrix Multiplication by Hong, Yuxi, Buluc, Aydin

    Published: IEEE 17.11.2024
    “… Distributed-memory parallel algorithms for SpGEMM have mainly focused on sparsity-oblivious approaches that use 2D and 3D partitioning …”
    Full text
    Conference proceedings
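
Result 12 contrasts sparsity-aware SpGEMM with the common sparsity-oblivious 2D-partitioned baseline. The following serial sketch simulates a 2D process grid in one process, with scipy.sparse handling the local block multiplies, to illustrate what 2D partitioning means here (grid size and matrices are illustrative):

```python
import numpy as np
import scipy.sparse as sparse

def spgemm_2d(A, B, grid=2):
    """Sparsity-oblivious 2D-partitioned SpGEMM, simulated serially:
    C[i,j] = sum_k A[i,k] @ B[k,j] over a grid x grid process mesh."""
    n = A.shape[0]
    cuts = np.linspace(0, n, grid + 1, dtype=int)
    blk = lambda M, i, j: M[cuts[i]:cuts[i+1], cuts[j]:cuts[j+1]]
    rows = []
    for i in range(grid):
        row = []
        for j in range(grid):
            acc = blk(A, i, 0) @ blk(B, 0, j)
            for k in range(1, grid):        # SUMMA-style reduction over k
                acc = acc + blk(A, i, k) @ blk(B, k, j)
            row.append(acc)
        rows.append(row)
    return sparse.bmat(rows, format="csr")  # reassemble the block result

A = sparse.random(400, 400, density=0.01, format="csr", random_state=0)
B = sparse.random(400, 400, density=0.01, format="csr", random_state=1)
err = abs(spgemm_2d(A, B) - A @ B).sum()
assert err < 1e-9
```
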
  13.

    Towards an Optimized Heterogeneous Distributed Task Scheduler in OpenMP Cluster by Neveu, Remy, Ceccato, Rodrigo, Leite, Gustavo, Araujo, Guido, Diaz, Jose M. Monsalve, Yviquel, Herve

    Published: IEEE 17.11.2024
    “… This paper addresses the challenges of optimizing task scheduling for a distributed, task-based execution model in OpenMP for cluster computing environments …”
    Full text
    Conference proceedings
  14.

    CUDASTF: Bridging the Gap Between CUDA and Task Parallelism by Augonnet, Cedric, Alexandrescu, Andrei, Sidelnik, Albert, Garland, Michael

    Published: IEEE 17.11.2024
    “… Organizing computation as asynchronous tasks with data-driven dependencies is a simple and efficient model for single- and multi-GPU programs. Sequential Task …”
    Full text
    Conference proceedings
  15.

    Scalable and Consistent Graph Neural Networks for Distributed Mesh-based Data-driven Modeling by Barwey, Shivam, Balin, Riccardo, Lusch, Bethany, Patel, Saumil, Balakrishnan, Ramesh, Pal, Pinaki, Maulik, Romit, Vishwanath, Venkatram

    Published: IEEE 17.11.2024
    “… This work develops a distributed graph neural network (GNN) methodology for mesh-based modeling applications using a consistent neural message passing layer …”
    Full text
    Conference proceedings
  16.

    Accelerating Communications in Federated Applications with Transparent Object Proxies by Pauloski, J. Gregory, Hayot-Sasson, Valerie, Ward, Logan, Hudson, Nathaniel, Sabino, Charlie, Baughman, Matt, Chard, Kyle, Foster, Ian

    ISSN: 2167-4337
    Published: ACM 11.11.2023
    “… Here, we overcome this obstacle with a new programming paradigm that decouples control flow from data flow by extending the pass-by-reference model to distributed applications …”
    Full text
    Conference proceedings
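
Result 16's excerpt describes decoupling control flow from data flow by extending pass-by-reference to distributed applications. The following is a generic, self-contained illustration of the transparent-proxy idea, not the paper's library or its API: the proxy is cheap to create and pass around, and the underlying object is materialized only on first use.

```python
class Proxy:
    """Generic transparent proxy: resolve the target lazily on first use.
    In a distributed setting, `factory` would fetch the object from a
    remote object store instead of computing it locally."""

    def __init__(self, factory):
        self._factory = factory
        self._target = None

    def _resolve(self):
        if self._target is None:
            self._target = self._factory()   # materialize on first access
        return self._target

    def __getattr__(self, name):
        # Only called for attributes the proxy itself lacks: forward them.
        return getattr(self._resolve(), name)

    def __getitem__(self, key):
        return self._resolve()[key]


def expensive_fetch():
    # Stand-in for pulling a large object out of a remote store.
    print("materializing the data now")
    return {"weights": [0.1, 0.2, 0.7]}


ref = Proxy(expensive_fetch)   # cheap to create, cheap to pass between tasks
print(ref["weights"])          # the object is fetched only here, on first use
```
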
  17.

    Distributed-Memory Parallel Algorithms for Sparse Matrix and Sparse Tall-and-Skinny Matrix Multiplication by Ranawaka, Isuru, Hussain, Md Taufique, Block, Charles, Gerogiannis, Gerasimos, Torrellas, Josep, Azad, Ariful

    Published: IEEE 17.11.2024
    “… Unfortunately, popular distributed algorithms like sparse SUMMA deliver suboptimal performance for TS-SpGEMM …”
    Full text
    Conference proceedings
  18.

    Reshaping High Energy Physics Applications for Near-Interactive Execution Using TaskVine by Sly-Delgado, Barry, Tovar, Ben, Zhou, Jin, Thain, Douglas

    Published: IEEE 17.11.2024
    “… High energy physics experiments produce petabytes of data annually that must be reduced to gain insight into the laws of nature. Early-stage reduction executes …”
    Full text
    Conference proceedings
  19.

    Enhance the Strong Scaling of LAMMPS on Fugaku by Li, Jianxiong, Zhao, Tong, Guo, Zhuoqiang, Shi, Shunchen, Liu, Lijun, Tan, Guangming, Jia, Weile, Yuan, Guojun, Wang, Zhan

    ISSN: 2167-4337
    Published: ACM 11.11.2023
    “… Physical phenomenon such as protein folding requires simulation up to microseconds of physical time, which directly corresponds to the strong scaling of …”
    Full text
    Conference proceedings
  20.

    MPI Progress For All by Zhou, Hui, Latham, Robert, Raffenetti, Ken, Guo, Yanfei, Thakur, Rajeev

    Published: IEEE 17.11.2024
    “… The opaque nature of MPI progress poses significant challenges in advancing MPI within modern high-performance computing practices …”
    Full text
    Conference proceedings
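
Result 20 concerns MPI progress: nonblocking operations may only advance while the application is inside the MPI library. A minimal mpi4py sketch of the common workaround, polling the request so the transfer keeps moving between slices of local work, illustrates the issue (run with two ranks; buffer size and tag are arbitrary):

```python
# Example launch: mpirun -n 2 python progress_demo.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
buf = np.arange(1_000_000, dtype=np.float64)

if rank == 0:
    req = comm.Isend(buf, dest=1, tag=7)
    # Without a progress thread, the transfer may only advance inside MPI
    # calls, so we poll Test() while interleaving local work.
    while not req.Test():
        pass  # ... do a slice of local computation here ...
elif rank == 1:
    recv = np.empty_like(buf)
    comm.Recv(recv, source=0, tag=7)
    assert recv[-1] == buf[-1]
```
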