Quantization Compensator Network: Server-Side Feature Reconstruction in Partitioned IoT Systems
Saved in:
| Title: | Quantization Compensator Network: Server-Side Feature Reconstruction in Partitioned IoT Systems |
|---|---|
| Authors: | Sánchez Leal, Isaac; Berg, Oscar Artur Bernd; Krug, Silvia; Saqib, Eiraj; Shallari, Irida; Jantsch, Axel; O'Nils, Mattias (1969); Nordström, Tomas |
| Source: | IEEE Access. 13:186488-186508 |
| Keywords: | 1-bit quantization, accuracy recovery, deep learning, deep vision, edge computing, feature map reconstruction, Internet of Things (IoT), QCNets, quantization compensation networks, server-side reconstruction, system partitioning, tiny ML |
| Abstract: | With the growing number of IoT devices generating data at the edge, there is a rising demand to run machine learning (ML) models directly on these resource-constrained nodes. To overcome hardware limitations, a common approach is to partition the model between the node and a more capable edge or cloud server. However, this introduces a communication bottleneck, especially for transmitting intermediate feature maps. Extreme quantization, such as 1-bit quantization, drastically reduces communication cost but causes significant accuracy degradation. Existing solutions like full-model retraining offer limited recovery, while methods such as autoencoders shift computational burden to the IoT node. In this work, we propose the Quantization Compensator Network (QCNet)—a lightweight, server-side module that reconstructs high-fidelity feature maps directly from 1-bit quantized data. QCNet is used alongside fine-tuning of the server-side model and introduces no additional computation on the IoT node. We evaluate QCNet across diverse vision models (ResNet50, ViT-B/16, ConvNeXt Tiny, and YOLOv3 Tiny) and tasks (classification, detection), showing that it consistently outperforms standard dequantization, autoencoder-based, and Quantization-Aware Training (QAT) approaches. Remarkably, QCNet achieves accuracy close to—or even surpassing—that of the original unpartitioned models, while maintaining a favorable accuracy–latency trade-off. QCNet offers a practical and efficient solution for enabling accurate distributed intelligence on communication- and compute-limited IoT platforms. |
| File description: | electronic |
| Access URL: | https://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-55987 https://doi.org/10.1109/ACCESS.2025.3627072 |
| Database: | SwePub |
| ISSN: | 2169-3536 |
| DOI: | 10.1109/ACCESS.2025.3627072 |
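
The abstract above describes a node/server partition in which the IoT node transmits 1-bit quantized intermediate feature maps and a lightweight, server-side compensator reconstructs real-valued features before the rest of the model runs. The PyTorch sketch below illustrates that data flow only; the cut point (after ResNet50's first residual stage), the mean-threshold binarization, and the small residual convolutional compensator are assumptions chosen for illustration, not the QCNet architecture or training procedure from the paper.

```python
# Illustrative sketch of node/server partitioning with 1-bit feature-map
# quantization and a server-side compensator. Cut point, threshold, and
# compensator layout are assumptions, not the paper's exact QCNet design.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

backbone = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

# Node-side head: stem + first residual stage (assumed cut point).
node_head = nn.Sequential(
    backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool, backbone.layer1
)
# Server-side tail: remaining stages + classifier.
server_tail = nn.Sequential(
    backbone.layer2, backbone.layer3, backbone.layer4,
    backbone.avgpool, nn.Flatten(1), backbone.fc
)

def quantize_1bit(fmap: torch.Tensor) -> torch.Tensor:
    """Node side: binarize the feature map around its mean (1 bit per element)."""
    return (fmap > fmap.mean()).to(torch.uint8)  # bits to transmit (packable)

class Compensator(nn.Module):
    """Server-side module mapping 1-bit features back to real-valued ones.
    A few 3x3 convolutions with a residual path; the layout is illustrative."""
    def __init__(self, channels: int = 256, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, bits: torch.Tensor) -> torch.Tensor:
        x = bits.float()          # naive dequantization to {0, 1}
        return x + self.body(x)   # learned correction on top of it

qcnet = Compensator(channels=256)  # ResNet50's first stage outputs 256 channels

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)       # stand-in input image
    feat = node_head(image)                   # computed on the IoT node
    bits = quantize_1bit(feat)                # 1 bit per element sent to the server
    logits = server_tail(qcnet(bits))         # server: compensate, then finish
print(logits.shape)                           # torch.Size([1, 1000])
```

In the approach the abstract describes, the compensator is used alongside fine-tuning of the server-side model while the node-side portion is left unchanged, so no additional computation lands on the IoT node; the sketch above omits that training step.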