Search results - Computing methodologies → Distributed computing methodologies
-
1
MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems
Published: IEEE, 2024-06-29
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (2024-06-29)
“… Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high operational costs …”
Full text
Conference proceeding -
2
MMDFL: Multi-Model-based Decentralized Federated Learning for Resource-Constrained AIoT Systems
Published: IEEE, 2025-06-22
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (2025-06-22)
“… However, DFL still faces three major challenges, i.e., limited computing power and network bandwidth of resource-constrained devices, non-Independent and Identically Distributed (non-IID …”
Full text
Conference proceeding -
3
Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs
Published: IEEE, 2021-09-01
Published in: 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (2021-09-01)
“… Graph sampling and random walk operations, capturing the structural properties of graphs, are playing an important role today as we cannot directly adopt computing-intensive algorithms on large-scale graphs …”
Full text
Conference proceeding -
4
PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models
Published: IEEE, 2024-06-29
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (2024-06-29)
“… Training recommendation systems (RecSys) faces several challenges as it requires the "data preprocessing" stage to preprocess an ample amount of raw data and …”
Full text
Conference proceeding -
5
Centralized Training and Decentralized Control through the Actor-Critic Paradigm for Highly Optimized Multicores
Published: IEEE, 2025-06-22
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (2025-06-22)
“… While distributed, neural-network-based resource controllers represent the state of the art for their ability to cope with the ever-expanding decision space, such approaches suffer from several …”
Full text
Conference proceeding -
6
DeepScaler: Holistic Autoscaling for Microservices Based on Spatiotemporal GNN with Adaptive Graph Learning
ISSN: 2643-1572
Published: IEEE, 2023-09-11
Published in: IEEE/ACM International Conference on Automated Software Engineering : [proceedings] (2023-09-11)
“… Autoscaling functions provide the foundation for achieving elasticity in the modern cloud computing paradigm …”
Full text
Conference proceeding -
7
NDFT: Accelerating Density Functional Theory Calculations via Hardware/Software Co-Design on Near-Data Computing System
Published: IEEE, 2025-06-22
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (2025-06-22)
“… Linear-response time-dependent Density Functional Theory (LR-TDDFT) is a widely used method for accurately predicting the excited-state properties of physical …”
Full text
Conference proceeding -
8
Derm: SLA-aware Resource Management for Highly Dynamic Microservices
Published: IEEE, 2024-06-29
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (2024-06-29)
“… Ensuring efficient resource allocation while providing service level agreement (SLA) guarantees for end-to-end (E2E) latency is crucial for microservice …”
Full text
Conference proceeding -
9
Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics
ISSN: 2641-7936
Published: IEEE, 2019-09-01
Published in: Proceedings / International Conference on Parallel Architectures and Compilation Techniques (2019-09-01)
“… Distributed graph analytics systems for CPUs, like D-Galois and Gemini, and for GPUs, like D-IrGL and Lux, use a bulk-synchronous parallel (BSP …”
Full text
Conference proceeding -
10
HADFL: Heterogeneity-aware Decentralized Federated Learning Framework
Published: IEEE, 2021-12-05
Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (2021-12-05)
“… Federated learning (FL) supports training models on geographically distributed devices …”
Full text
Conference proceeding -
11
AdaGL: Adaptive Learning for Agile Distributed Training of Gigantic GNNs
Published: IEEE, 2023-07-09
Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (2023-07-09)
“… Distributed GNN training on contemporary massive and densely connected graphs requires information aggregation from all neighboring nodes, which leads to an explosion of inter-server communications …”
Full text
Conference proceeding -
12
Submodularity of Distributed Join Computation
ISSN: 0730-8078
Published: United States, 2018-06-01
Published in: Proceedings - ACM-SIGMOD International Conference on Management of Data (2018-06-01)
“… We study distributed equi-join computation in the presence of join-attribute skew, which causes load imbalance …”
More details
Journal article -
13
DS-GL: Advancing Graph Learning via Harnessing Nature's Power within Scalable Dynamical Systems
Published: IEEE, 2024-06-29
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (2024-06-29)
“… With the rapid digitization of the world, an increasing number of real-world applications are turning to non-Euclidean data, modeled as graphs. Due to their …”
Full text
Conference proceeding -
14
Invited: Waving the Double-Edged Sword: Building Resilient CAVs with Edge and Cloud Computing
Published: IEEE, 2023-07-09
Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (2023-07-09)
“… for better situation awareness and more intelligent decision making. On the other hand, the more distributed computing process and the wireless nature of V2X …”
Full text
Conference proceeding -
15
MyML: User-Driven Machine Learning
Published: IEEE, 2021-12-05
Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (2021-12-05)
“… Machine learning (ML) on resource-constrained edge devices is expensive and often requires offloading computation to the cloud, which may compromise the …”
Full text
Conference proceeding -
16
Personalized Heterogeneity-aware Federated Search Towards Better Accuracy and Energy Efficiency
ISSN: 1558-2434
Published: ACM, 2022-10-29
Published in: 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (2022-10-29)
“… Federated learning (FL), a new distributed technology, allows us to train the global model on the edge and embedded devices without local data sharing …”
Full text
Conference proceeding -
17
Optimizing Distributed ML Communication with Fused Computation-Collective Operations
Published: IEEE, 2024-11-17
Published in: SC24: International Conference for High Performance Computing, Networking, Storage and Analysis (2024-11-17)
“… Machine learning models are distributed across multiple nodes using numerous parallelism strategies …”
Full text
Conference proceeding -
18
FedAT: A High-Performance and Communication-Efficient Federated Learning System with Asynchronous Tiers
ISSN: 2167-4337
Published: ACM, 2021-11-14
Published in: SC21: International Conference for High Performance Computing, Networking, Storage and Analysis (2021-11-14)
“… Federated learning (FL) involves training a model over massive distributed devices, while keeping the training data localized and private …”
Full text
Conference proceeding -
19
Revamping Sampling-Based PGO with Context-Sensitivity and Pseudo-instrumentation
ISSN: 2643-2838
Published: IEEE, 2024-03-02
Published in: Proceedings / International Symposium on Code Generation and Optimization (2024-03-02)
“… The ever increasing scale of modern data center demands more effective optimizations, as even a small percentage of performance improvement can result in a …”
Full text
Conference proceeding -
20
COCA: Generative Root Cause Analysis for Distributed Systems with Code Knowledge
ISSN: 1558-1225
Published: IEEE, 2025-04-26
Published in: Proceedings / International Conference on Software Engineering (2025-04-26)
“… Runtime failures are commonplace in modern distributed systems. When such issues arise, users often turn to platforms such as Github or JIRA to report them and request assistance …”
Full text
Conference proceeding