CoCaR: Enabling Efficient Dynamic DNN-Based Model Caching and Request Routing in MEC

Bibliographic Details
Published in: Annual Joint Conference of the IEEE Computer and Communications Societies, pp. 1-10
Main Authors: Qiu, Shuting; Dong, Fang; Tan, Siyu; Shen, Dian; Zhou, Ruiting; Fan, Qilin
Format: Conference Proceeding
Language: English
Published: IEEE 19.05.2025
ISSN: 2641-9874
Description
Summary: Mobile edge computing (MEC) can pre-cache deep neural networks (DNNs) near end-users, providing low-latency services and improving users' quality of experience (QoE). However, caching all DNN models at edge servers with limited capacity is difficult, and the impact of model loading time on QoE is underexplored. Hence, we introduce dynamic DNNs in edge scenarios, disassembling a complete DNN model into interrelated submodels for more fine-grained and flexible model caching and request routing. This in turn raises the pressing issue of jointly deciding request routing and submodel caching for dynamic DNNs so as to balance model inference precision against loading latency for QoE optimization. In this paper, we study the joint dynamic model caching and request routing problem in MEC networks, aiming to maximize user request inference precision under constraints on server resources, latency, and model loading time. To tackle this problem, we propose CoCaR, an algorithm based on linear programming and random rounding that leverages dynamic DNNs to optimize caching and routing schemes, achieving near-optimal performance. Simulation results show that the proposed CoCaR achieves significant performance improvements compared to state-of-the-art baselines.
DOI: 10.1109/INFOCOM55648.2025.11044457
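
Illustrative note: the summary describes CoCaR as linear programming followed by random rounding, but this record does not give the formulation. The Python sketch below shows only a generic rounding step and is not the authors' algorithm. It assumes a fractional LP solution is already available; the function names (`round_caching`, `route_requests`), the single-server capacity check, the retry rule, and all numeric values are hypothetical.

```python
import random

def round_caching(frac_cache, sizes, capacity, rng, max_tries=100):
    """Round fractional caching values x[m] in [0, 1] to a set of cached submodels.

    Each submodel m is kept independently with probability frac_cache[m].
    The draw is repeated until the cached set fits within `capacity`.
    """
    for _ in range(max_tries):
        cached = {m for m, x in frac_cache.items() if rng.random() < x}
        if sum(sizes[m] for m in cached) <= capacity:
            return cached
    return set()  # fallback: cache nothing rather than violate capacity

def route_requests(frac_route, cached, rng):
    """Route each request to one cached submodel.

    A submodel is sampled in proportion to its fractional routing weight,
    restricted to submodels that were actually cached. A request with no
    cached candidate is left unrouted (None).
    """
    routing = {}
    for req, weights in frac_route.items():
        options = [(m, w) for m, w in weights.items() if m in cached and w > 0]
        if not options:
            routing[req] = None
            continue
        models, probs = zip(*options)
        routing[req] = rng.choices(models, weights=probs, k=1)[0]
    return routing

# Hypothetical fractional solution, as an LP relaxation might produce.
rng = random.Random(0)
frac_cache = {"small": 0.9, "medium": 0.6, "large": 0.3}
sizes = {"small": 1, "medium": 2, "large": 4}
frac_route = {"r1": {"small": 0.5, "large": 0.5}, "r2": {"medium": 1.0}}

cached = round_caching(frac_cache, sizes, capacity=4, rng=rng)
routing = route_requests(frac_route, cached, rng)
print(cached, routing)
```

Redrawing until the capacity constraint holds is one simple way to keep the rounded solution feasible. A full treatment would also have to account for latency and model loading time, which the summary lists as constraints.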