Search Results - Computing methodologies > Distributed computing methodologies

  1.

    MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems by Hsia, Samuel, Golden, Alicia, Acun, Bilge, Ardalani, Newsha, DeVito, Zachary, Wei, Gu-Yeon, Brooks, David, Wu, Carole-Jean

    Published: IEEE 29.06.2024
    “…Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high operational costs…”
    Conference Proceeding
  2.

    MMDFL: Multi-Model-based Decentralized Federated Learning for Resource-Constrained AIoT Systems by Yan, Dengke, Yang, Yanxin, Hu, Ming, Fu, Xin, Chen, Mingsong

    Published: IEEE 22.06.2025
    “… However, DFL still faces three major challenges, i.e., limited computing power and network bandwidth of resource-constrained devices, non-Independent and Identically Distributed (non-IID…”
    Conference Proceeding
  3.

    Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs by Wang, Pengyu, Li, Chao, Wang, Jing, Wang, Taolei, Zhang, Lu, Leng, Jingwen, Chen, Quan, Guo, Minyi

    Published: IEEE 01.09.2021
    “…Graph sampling and random walk operations, capturing the structural properties of graphs, are playing an important role today as we cannot directly adopt computing-intensive algorithms on large-scale graphs…”
    Conference Proceeding
  4.

    PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models by Lee, Yunjae, Kim, Hyeseong, Rhu, Minsoo

    Published: IEEE 29.06.2024
    “…Training recommendation systems (RecSys) faces several challenges as it requires the "data preprocessing" stage to preprocess an ample amount of raw data and…”
    Conference Proceeding
  5.

    Centralized Training and Decentralized Control through the Actor-Critic Paradigm for Highly Optimized Multicores by Dietrich, Benedikt, Khdr, Heba, Henkel, Jorg

    Published: IEEE 22.06.2025
    “…While distributed, neural-network-based resource controllers represent the state of the art for their ability to cope with the ever-expanding decision space, such approaches suffer from several…”
    Conference Proceeding
  6.

    DeepScaler: Holistic Autoscaling for Microservices Based on Spatiotemporal GNN with Adaptive Graph Learning by Meng, Chunyang, Song, Shijie, Tong, Haogang, Pan, Maolin, Yu, Yang

    ISSN: 2643-1572
    Published: IEEE 11.09.2023
    “…Autoscaling functions provide the foundation for achieving elasticity in the modern cloud computing paradigm…”
    Conference Proceeding
  7.

    NDFT: Accelerating Density Functional Theory Calculations via Hardware/Software Co-Design on Near-Data Computing System by Jiang, Qingcai, Tu, Buxin, Hao, Xiaoyu, Chen, Junshi, An, Hong

    Published: IEEE 22.06.2025
    “…Linear-response time-dependent Density Functional Theory (LR-TDDFT) is a widely used method for accurately predicting the excited-state properties of physical…”
    Conference Proceeding
  8.

    Derm: SLA-aware Resource Management for Highly Dynamic Microservices by Chen, Liao, Luo, Shutian, Lin, Chenyu, Mo, Zizhao, Xu, Huanle, Ye, Kejiang, Xu, Chengzhong

    Published: IEEE 29.06.2024
    “…Ensuring efficient resource allocation while providing service level agreement (SLA) guarantees for end-to-end (E2E) latency is crucial for microservice…”
    Conference Proceeding
  9.

    HADFL: Heterogeneity-aware Decentralized Federated Learning Framework by Cao, Jing, Lian, Zirui, Liu, Weihong, Zhu, Zongwei, Ji, Cheng

    Published: IEEE 05.12.2021
    “…Federated learning (FL) supports training models on geographically distributed devices…”
    Conference Proceeding
  10.

    Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics by Dathathri, Roshan, Gill, Gurbinder, Hoang, Loc, Jatala, Vishwesh, Pingali, Keshav, Nandivada, V. Krishna, Dang, Hoang-Vu, Snir, Marc

    ISSN: 2641-7936
    Published: IEEE 01.09.2019
    “…Distributed graph analytics systems for CPUs, like D-Galois and Gemini, and for GPUs, like D-IrGL and Lux, use a bulk-synchronous parallel (BSP…”
    Conference Proceeding
  11.

    AdaGL: Adaptive Learning for Agile Distributed Training of Gigantic GNNs by Zhang, Ruisi, Javaheripi, Mojan, Ghodsi, Zahra, Bleiweiss, Amit, Koushanfar, Farinaz

    Published: IEEE 09.07.2023
    “…Distributed GNN training on contemporary massive and densely connected graphs requires information aggregation from all neighboring nodes, which leads to an explosion of inter-server communications…”
    Conference Proceeding
  12.

    Submodularity of Distributed Join Computation by Li, Rundong, Riedewald, Mirek, Deng, Xinyan

    ISSN: 0730-8078
    Published: United States 01.06.2018
    “…We study distributed equi-join computation in the presence of join-attribute skew, which causes load imbalance…”
    Journal Article
  13.

    DS-GL: Advancing Graph Learning via Harnessing Nature's Power within Scalable Dynamical Systems by Song, Ruibing, Wu, Chunshu, Liu, Chuan, Li, Ang, Huang, Michael, Geng, Tony Tong

    Published: IEEE 29.06.2024
    “…With the rapid digitization of the world, an increasing number of real-world applications are turning to non-Euclidean data, modeled as graphs. Due to their…”
    Conference Proceeding
  14.

    Invited: Waving the Double-Edged Sword: Building Resilient CAVs with Edge and Cloud Computing by Liu, Xiangguo, Luo, Yunpeng, Goeckner, Anthony, Chakraborty, Trishna, Jiao, Ruochen, Wang, Ningfei, Wang, Yixuan, Sato, Takami, Chen, Qi Alfred, Zhu, Qi

    Published: IEEE 09.07.2023
    “… for better situation awareness and more intelligent decision making. On the other hand, the more distributed computing process and the wireless nature of V2X…”
    Conference Proceeding
  15.

    MyML: User-Driven Machine Learning by Goyal, Vidushi, Bertacco, Valeria, Das, Reetuparna

    Published: IEEE 05.12.2021
    “…Machine learning (ML) on resource-constrained edge devices is expensive and often requires offloading computation to the cloud, which may compromise the…”
    Conference Proceeding
  16.

    Personalized Heterogeneity-aware Federated Search Towards Better Accuracy and Energy Efficiency by Yang, Zhao, Sun, Qingshuang

    ISSN: 1558-2434
    Published: ACM 29.10.2022
    “…Federated learning (FL), a new distributed technology, allows us to train the global model on the edge and embedded devices without local data sharing…”
    Conference Proceeding
  17.

    Optimizing Distributed ML Communication with Fused Computation-Collective Operations by Punniyamurthy, Kishore, Hamidouche, Khaled, Beckmann, Bradford M.

    Published: IEEE 17.11.2024
    “…Machine learning models are distributed across multiple nodes using numerous parallelism strategies…”
    Conference Proceeding
  18.

    FedAT: A High-Performance and Communication-Efficient Federated Learning System with Asynchronous Tiers by Chai, Zheng, Chen, Yujing, Anwar, Ali, Zhao, Liang, Cheng, Yue, Rangwala, Huzefa

    ISSN: 2167-4337
    Published: ACM 14.11.2021
    “…Federated learning (FL) involves training a model over massive distributed devices, while keeping the training data localized and private…”
    Conference Proceeding
  19.

    Revamping Sampling-Based PGO with Context-Sensitivity and Pseudo-instrumentation by He, Wenlei, Yu, Hongtao, Wang, Lei, Oh, Taewook

    ISSN: 2643-2838
    Published: IEEE 02.03.2024
    “…The ever increasing scale of modern data center demands more effective optimizations, as even a small percentage of performance improvement can result in a…”
    Conference Proceeding
  20.

    COCA: Generative Root Cause Analysis for Distributed Systems with Code Knowledge by Li, Yichen, Wu, Yulun, Liu, Jinyang, Jiang, Zhihan, Chen, Zhuangbin, Yu, Guangba, Lyu, Michael R.

    ISSN: 1558-1225
    Published: IEEE 26.04.2025
    “…Runtime failures are commonplace in modern distributed systems. When such issues arise, users often turn to platforms such as Github or JIRA to report them and request assistance…”
    Conference Proceeding