Search Results - Computing methodologies Distributed computing methodologies
1
MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems
Published: IEEE 29.06.2024
Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high operational costs…”
Conference Proceeding
2
MMDFL: Multi-Model-based Decentralized Federated Learning for Resource-Constrained AIoT Systems
Published: IEEE 22.06.2025
Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“… However, DFL still faces three major challenges, i.e., limited computing power and network bandwidth of resource-constrained devices, non-Independent and Identically Distributed (non-IID…”
Conference Proceeding
3
Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs
Published: IEEE 01.09.2021
Published in 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (01.09.2021)
“…Graph sampling and random walk operations, capturing the structural properties of graphs, are playing an important role today as we cannot directly adopt computing-intensive algorithms on large-scale graphs…”
Conference Proceeding
4
PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models
Published: IEEE 29.06.2024
Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Training recommendation systems (RecSys) faces several challenges as it requires the "data preprocessing" stage to preprocess an ample amount of raw data and…”
Conference Proceeding
5
Centralized Training and Decentralized Control through the Actor-Critic Paradigm for Highly Optimized Multicores
Published: IEEE 22.06.2025
Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…While distributed, neural-network-based resource controllers represent the state of the art for their ability to cope with the ever-expanding decision space, such approaches suffer from several…”
Conference Proceeding
6
DeepScaler: Holistic Autoscaling for Microservices Based on Spatiotemporal GNN with Adaptive Graph Learning
ISSN: 2643-1572
Published: IEEE 11.09.2023
Published in IEEE/ACM International Conference on Automated Software Engineering : [proceedings] (11.09.2023)
“…Autoscaling functions provide the foundation for achieving elasticity in the modern cloud computing paradigm…”
Conference Proceeding
7
NDFT: Accelerating Density Functional Theory Calculations via Hardware/Software Co-Design on Near-Data Computing System
Published: IEEE 22.06.2025
Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Linear-response time-dependent Density Functional Theory (LR-TDDFT) is a widely used method for accurately predicting the excited-state properties of physical…”
Conference Proceeding
8
Derm: SLA-aware Resource Management for Highly Dynamic Microservices
Published: IEEE 29.06.2024
Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Ensuring efficient resource allocation while providing service level agreement (SLA) guarantees for end-to-end (E2E) latency is crucial for microservice…”
Conference Proceeding
9
HADFL: Heterogeneity-aware Decentralized Federated Learning Framework
Published: IEEE 05.12.2021
Published in 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021)
“…Federated learning (FL) supports training models on geographically distributed devices…”
Conference Proceeding
10
Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics
ISSN: 2641-7936
Published: IEEE 01.09.2019
Published in Proceedings / International Conference on Parallel Architectures and Compilation Techniques (01.09.2019)
“…Distributed graph analytics systems for CPUs, like D-Galois and Gemini, and for GPUs, like D-IrGL and Lux, use a bulk-synchronous parallel (BSP…”
Conference Proceeding
11
AdaGL: Adaptive Learning for Agile Distributed Training of Gigantic GNNs
Published: IEEE 09.07.2023
Published in 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…Distributed GNN training on contemporary massive and densely connected graphs requires information aggregation from all neighboring nodes, which leads to an explosion of inter-server communications…”
Conference Proceeding
12
Submodularity of Distributed Join Computation
ISSN: 0730-8078
Published: United States 01.06.2018
Published in Proceedings - ACM-SIGMOD International Conference on Management of Data (01.06.2018)
“…We study distributed equi-join computation in the presence of join-attribute skew, which causes load imbalance…”
Journal Article
13
DS-GL: Advancing Graph Learning via Harnessing Nature's Power within Scalable Dynamical Systems
Published: IEEE 29.06.2024
Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…With the rapid digitization of the world, an increasing number of real-world applications are turning to non-Euclidean data, modeled as graphs. Due to their…”
Conference Proceeding
14
Invited: Waving the Double-Edged Sword: Building Resilient CAVs with Edge and Cloud Computing
Published: IEEE 09.07.2023
Published in 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“… for better situation awareness and more intelligent decision making. On the other hand, the more distributed computing process and the wireless nature of V2X…”
Conference Proceeding
15
MyML: User-Driven Machine Learning
Published: IEEE 05.12.2021
Published in 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021)
“…Machine learning (ML) on resource-constrained edge devices is expensive and often requires offloading computation to the cloud, which may compromise the…”
Conference Proceeding
16
Personalized Heterogeneity-aware Federated Search Towards Better Accuracy and Energy Efficiency
ISSN: 1558-2434
Published: ACM 29.10.2022
Published in 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (29.10.2022)
“…Federated learning (FL), a new distributed technology, allows us to train the global model on the edge and embedded devices without local data sharing…”
Conference Proceeding
17
Optimizing Distributed ML Communication with Fused Computation-Collective Operations
Published: IEEE 17.11.2024
Published in SC24: International Conference for High Performance Computing, Networking, Storage and Analysis (17.11.2024)
“…Machine learning models are distributed across multiple nodes using numerous parallelism strategies…”
Conference Proceeding
18
FedAT: A High-Performance and Communication-Efficient Federated Learning System with Asynchronous Tiers
ISSN: 2167-4337
Published: ACM 14.11.2021
Published in SC21: International Conference for High Performance Computing, Networking, Storage and Analysis (14.11.2021)
“…Federated learning (FL) involves training a model over massive distributed devices, while keeping the training data localized and private…”
Conference Proceeding
19
Revamping Sampling-Based PGO with Context-Sensitivity and Pseudo-instrumentation
ISSN: 2643-2838
Published: IEEE 02.03.2024
Published in Proceedings / International Symposium on Code Generation and Optimization (02.03.2024)
“…The ever increasing scale of modern data center demands more effective optimizations, as even a small percentage of performance improvement can result in a…”
Conference Proceeding
20
COCA: Generative Root Cause Analysis for Distributed Systems with Code Knowledge
ISSN: 1558-1225
Published: IEEE 26.04.2025
Published in Proceedings / International Conference on Software Engineering (26.04.2025)
“…Runtime failures are commonplace in modern distributed systems. When such issues arise, users often turn to platforms such as Github or JIRA to report them and request assistance…”
Conference Proceeding