CoCaR: Enabling Efficient Dynamic DNN-Based Model Caching and Request Routing in MEC
| Published in: | Annual Joint Conference of the IEEE Computer and Communications Societies, pp. 1-10 |
|---|---|
| Main authors: | , , , , , |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 19 May 2025 |
| Subjects: | |
| ISSN: | 2641-9874 |
| Online access: | Full text |
| Summary: | Mobile edge computing (MEC) can pre-cache deep neural networks (DNNs) near end-users, providing low-latency services and improving users' quality of experience (QoE). However, caching all DNN models at edge servers with limited capacity is difficult, and the impact of model loading time on QoE is underexplored. Hence, we introduce dynamic DNNs in edge scenarios, disassembling a complete DNN model into interrelated submodels for more fine-grained and flexible model caching and request routing solutions. This in turn raises the pressing issue of jointly deciding request routing and submodel caching for dynamic DNNs to balance model inference precision and loading latency for QoE optimization. In this paper, we study the joint dynamic model caching and request routing problem in MEC networks, aiming to maximize user request inference precision under constraints on server resources, latency, and model loading time. To tackle this problem, we propose CoCaR, an algorithm based on linear programming and random rounding that leverages dynamic DNNs to optimize caching and routing schemes, achieving near-optimal performance. Simulation results show that the proposed CoCaR achieves significant performance improvements compared to state-of-the-art baselines. |
|---|---|
| DOI: | 10.1109/INFOCOM55648.2025.11044457 |
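
The summary names linear programming and random rounding as the basis of CoCaR but gives no algorithmic detail. As a generic illustration of the rounding idea only, and not the paper's algorithm, the following minimal sketch assumes that a fractional LP solution for a single edge server is already available. The function `randomized_round`, its parameters, and the capacity repair step are all hypothetical choices made for this example.

```python
import random


def randomized_round(fractional, sizes, capacity, rng):
    """Round a fractional caching vector for one server to a 0/1 decision.

    fractional[m]: LP-relaxed value in [0, 1] for caching submodel m
                   (assumed to come from an LP solver; not computed here).
    sizes[m]:      memory footprint of submodel m (hypothetical units).
    capacity:      memory budget of the server (same units).
    rng:           a random.Random instance, passed in for reproducibility.
    """
    # Cache submodel m independently with probability fractional[m].
    chosen = [m for m, x in enumerate(fractional) if rng.random() < x]

    # Repair step (an assumption of this sketch): if the rounded set
    # exceeds the capacity, drop the submodels with the smallest
    # fractional values until the constraint holds.
    chosen.sort(key=lambda m: fractional[m], reverse=True)
    while sum(sizes[m] for m in chosen) > capacity:
        chosen.pop()

    return sorted(chosen)


# Hypothetical example: four submodels of size 2, server capacity 4.
rng = random.Random(0)
decision = randomized_round([1.0, 0.0, 0.5, 0.9], [2, 2, 2, 2], 4, rng)
```

Here `decision` is the list of submodel indices cached on the server. A submodel with fractional value 0.0 is never selected, and the returned set always fits within the capacity because of the repair step.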