Search results - Computing methodologies → Distributed computing methodologies

  1. MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems by Hsia, Samuel; Golden, Alicia; Acun, Bilge; Ardalani, Newsha; DeVito, Zachary; Wei, Gu-Yeon; Brooks, David; Wu, Carole-Jean

    Published: IEEE 29.06.2024
    “… Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high operational costs …”
    Full text
    Conference Proceeding
  2. MMDFL: Multi-Model-based Decentralized Federated Learning for Resource-Constrained AIoT Systems by Yan, Dengke; Yang, Yanxin; Hu, Ming; Fu, Xin; Chen, Mingsong

    Published: IEEE 22.06.2025
    “… However, DFL still faces three major challenges, i.e., limited computing power and network bandwidth of resource-constrained devices, non-Independent and Identically Distributed (non-IID …”
    Full text
    Conference Proceeding
  3. Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs by Wang, Pengyu; Li, Chao; Wang, Jing; Wang, Taolei; Zhang, Lu; Leng, Jingwen; Chen, Quan; Guo, Minyi

    Published: IEEE 01.09.2021
    “… Graph sampling and random walk operations, capturing the structural properties of graphs, are playing an important role today as we cannot directly adopt computing-intensive algorithms on large-scale graphs …”
    Full text
    Conference Proceeding
  4. PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models by Lee, Yunjae; Kim, Hyeseong; Rhu, Minsoo

    Published: IEEE 29.06.2024
    “… Training recommendation systems (RecSys) faces several challenges as it requires the "data preprocessing" stage to preprocess an ample amount of raw data and …”
    Full text
    Conference Proceeding
  5. Centralized Training and Decentralized Control through the Actor-Critic Paradigm for Highly Optimized Multicores by Dietrich, Benedikt; Khdr, Heba; Henkel, Jorg

    Published: IEEE 22.06.2025
    “… While distributed, neural-network-based resource controllers represent the state of the art for their ability to cope with the ever-expanding decision space, such approaches suffer from several …”
    Full text
    Conference Proceeding
  6. DeepScaler: Holistic Autoscaling for Microservices Based on Spatiotemporal GNN with Adaptive Graph Learning by Meng, Chunyang; Song, Shijie; Tong, Haogang; Pan, Maolin; Yu, Yang

    ISSN: 2643-1572
    Published: IEEE 11.09.2023
    “… Autoscaling functions provide the foundation for achieving elasticity in the modern cloud computing paradigm …”
    Full text
    Conference Proceeding
  7. NDFT: Accelerating Density Functional Theory Calculations via Hardware/Software Co-Design on Near-Data Computing System by Jiang, Qingcai; Tu, Buxin; Hao, Xiaoyu; Chen, Junshi; An, Hong

    Published: IEEE 22.06.2025
    “… Linear-response time-dependent Density Functional Theory (LR-TDDFT) is a widely used method for accurately predicting the excited-state properties of physical …”
    Full text
    Conference Proceeding
  8. Derm: SLA-aware Resource Management for Highly Dynamic Microservices by Chen, Liao; Luo, Shutian; Lin, Chenyu; Mo, Zizhao; Xu, Huanle; Ye, Kejiang; Xu, Chengzhong

    Published: IEEE 29.06.2024
    “… Ensuring efficient resource allocation while providing service level agreement (SLA) guarantees for end-to-end (E2E) latency is crucial for microservice …”
    Full text
    Conference Proceeding
  9. Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics by Dathathri, Roshan; Gill, Gurbinder; Hoang, Loc; Jatala, Vishwesh; Pingali, Keshav; Nandivada, V. Krishna; Dang, Hoang-Vu; Snir, Marc

    ISSN: 2641-7936
    Published: IEEE 01.09.2019
    “… Distributed graph analytics systems for CPUs, like D-Galois and Gemini, and for GPUs, like D-IrGL and Lux, use a bulk-synchronous parallel (BSP …”
    Full text
    Conference Proceeding
  10. HADFL: Heterogeneity-aware Decentralized Federated Learning Framework by Cao, Jing; Lian, Zirui; Liu, Weihong; Zhu, Zongwei; Ji, Cheng

    Published: IEEE 05.12.2021
    “… Federated learning (FL) supports training models on geographically distributed devices …”
    Full text
    Conference Proceeding
  11. AdaGL: Adaptive Learning for Agile Distributed Training of Gigantic GNNs by Zhang, Ruisi; Javaheripi, Mojan; Ghodsi, Zahra; Bleiweiss, Amit; Koushanfar, Farinaz

    Published: IEEE 09.07.2023
    “… Distributed GNN training on contemporary massive and densely connected graphs requires information aggregation from all neighboring nodes, which leads to an explosion of inter-server communications …”
    Full text
    Conference Proceeding
  12. Submodularity of Distributed Join Computation by Li, Rundong; Riedewald, Mirek; Deng, Xinyan

    ISSN: 0730-8078
    Published: United States 01.06.2018
    “… We study distributed equi-join computation in the presence of join-attribute skew, which causes load imbalance …”
    More details
    Journal Article
  13. DS-GL: Advancing Graph Learning via Harnessing Nature's Power within Scalable Dynamical Systems by Song, Ruibing; Wu, Chunshu; Liu, Chuan; Li, Ang; Huang, Michael; Geng, Tony Tong

    Published: IEEE 29.06.2024
    “… With the rapid digitization of the world, an increasing number of real-world applications are turning to non-Euclidean data, modeled as graphs. Due to their …”
    Full text
    Conference Proceeding
  14. Invited: Waving the Double-Edged Sword: Building Resilient CAVs with Edge and Cloud Computing by Liu, Xiangguo; Luo, Yunpeng; Goeckner, Anthony; Chakraborty, Trishna; Jiao, Ruochen; Wang, Ningfei; Wang, Yixuan; Sato, Takami; Chen, Qi Alfred; Zhu, Qi

    Published: IEEE 09.07.2023
    “… for better situation awareness and more intelligent decision making. On the other hand, the more distributed computing process and the wireless nature of V2X …”
    Full text
    Conference Proceeding
  15. MyML: User-Driven Machine Learning by Goyal, Vidushi; Bertacco, Valeria; Das, Reetuparna

    Published: IEEE 05.12.2021
    “… Machine learning (ML) on resource-constrained edge devices is expensive and often requires offloading computation to the cloud, which may compromise the …”
    Full text
    Conference Proceeding
  16. Personalized Heterogeneity-aware Federated Search Towards Better Accuracy and Energy Efficiency by Yang, Zhao; Sun, Qingshuang

    ISSN: 1558-2434
    Published: ACM 29.10.2022
    “… Federated learning (FL), a new distributed technology, allows us to train the global model on the edge and embedded devices without local data sharing …”
    Full text
    Conference Proceeding
  17. Optimizing Distributed ML Communication with Fused Computation-Collective Operations by Punniyamurthy, Kishore; Hamidouche, Khaled; Beckmann, Bradford M.

    Published: IEEE 17.11.2024
    “… Machine learning models are distributed across multiple nodes using numerous parallelism strategies …”
    Full text
    Conference Proceeding
  18. FedAT: A High-Performance and Communication-Efficient Federated Learning System with Asynchronous Tiers by Chai, Zheng; Chen, Yujing; Anwar, Ali; Zhao, Liang; Cheng, Yue; Rangwala, Huzefa

    ISSN: 2167-4337
    Published: ACM 14.11.2021
    “… Federated learning (FL) involves training a model over massive distributed devices, while keeping the training data localized and private …”
    Full text
    Conference Proceeding
  19. Revamping Sampling-Based PGO with Context-Sensitivity and Pseudo-instrumentation by He, Wenlei; Yu, Hongtao; Wang, Lei; Oh, Taewook

    ISSN: 2643-2838
    Published: IEEE 02.03.2024
    “… The ever increasing scale of modern data center demands more effective optimizations, as even a small percentage of performance improvement can result in a …”
    Full text
    Conference Proceeding
  20. COCA: Generative Root Cause Analysis for Distributed Systems with Code Knowledge by Li, Yichen; Wu, Yulun; Liu, Jinyang; Jiang, Zhihan; Chen, Zhuangbin; Yu, Guangba; Lyu, Michael R.

    ISSN: 1558-1225
    Published: IEEE 26.04.2025
    “… Runtime failures are commonplace in modern distributed systems. When such issues arise, users often turn to platforms such as Github or JIRA to report them and request assistance …”
    Full text
    Conference Proceeding