Quantization Compensator Network: Server-Side Feature Reconstruction in Partitioned IoT Systems
Saved in:
| Title: | Quantization Compensator Network: Server-Side Feature Reconstruction in Partitioned IoT Systems |
|---|---|
| Authors: | Sánchez Leal, Isaac; Berg, Oscar Artur Bernd; Krug, Silvia; Saqib, Eiraj; Shallari, Irida; Jantsch, Axel; O'Nils, Mattias (1969); Nordström, Tomas |
| Source: | IEEE Access. 13:186488-186508 |
| Keywords: | 1-bit quantization, accuracy recovery, deep learning, deep vision, edge computing, feature map reconstruction, Internet of Things (IoT), QCNets, quantization compensation networks, server-side reconstruction, system partitioning, tiny ML |
| Abstract: | With the growing number of IoT devices generating data at the edge, there is a rising demand to run machine learning (ML) models directly on these resource-constrained nodes. To overcome hardware limitations, a common approach is to partition the model between the node and a more capable edge or cloud server. However, this introduces a communication bottleneck, especially for transmitting intermediate feature maps. Extreme quantization, such as 1-bit quantization, drastically reduces communication cost but causes significant accuracy degradation. Existing solutions like full-model retraining offer limited recovery, while methods such as autoencoders shift computational burden to the IoT node. In this work, we propose the Quantization Compensator Network (QCNet)—a lightweight, server-side module that reconstructs high-fidelity feature maps directly from 1-bit quantized data. QCNet is used alongside fine-tuning of the server-side model and introduces no additional computation on the IoT node. We evaluate QCNet across diverse vision models (ResNet50, ViT-B/16, ConvNeXt Tiny, and YOLOv3 Tiny) and tasks (classification, detection), showing that it consistently outperforms standard dequantization, autoencoder-based, and Quantization-Aware Training (QAT) approaches. Remarkably, QCNet achieves accuracy close to—or even surpassing—that of the original unpartitioned models, while maintaining a favorable accuracy–latency trade-off. QCNet offers a practical and efficient solution for enabling accurate distributed intelligence on communication- and compute-limited IoT platforms. |
| File description: | electronic |
| Access URL: | https://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-55987 https://doi.org/10.1109/ACCESS.2025.3627072 |
| Database: | SwePub |
| ISSN: | 2169-3536 |
| DOI: | 10.1109/ACCESS.2025.3627072 |
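
The abstract above describes a node/server partition in which the IoT node transmits 1-bit quantized intermediate feature maps and a lightweight, server-side compensator reconstructs real-valued features before the rest of the model runs. The PyTorch sketch below illustrates that data flow only; the cut point (after ResNet50's first residual stage), the mean-threshold binarization, and the small residual convolutional compensator are assumptions chosen for illustration, not the QCNet architecture or training procedure from the paper.

```python
# Illustrative sketch of node/server partitioning with 1-bit feature-map
# quantization and a server-side compensator. Cut point, threshold, and
# compensator layout are assumptions, not the paper's exact QCNet design.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

backbone = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

# Node-side head: stem + first residual stage (assumed cut point).
node_head = nn.Sequential(
    backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool, backbone.layer1
)
# Server-side tail: remaining stages + classifier.
server_tail = nn.Sequential(
    backbone.layer2, backbone.layer3, backbone.layer4,
    backbone.avgpool, nn.Flatten(1), backbone.fc
)

def quantize_1bit(fmap: torch.Tensor) -> torch.Tensor:
    """Node side: binarize the feature map around its mean (1 bit per element)."""
    return (fmap > fmap.mean()).to(torch.uint8)  # bits to transmit (packable)

class Compensator(nn.Module):
    """Server-side module mapping 1-bit features back to real-valued ones.
    A few 3x3 convolutions with a residual path; the layout is illustrative."""
    def __init__(self, channels: int = 256, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, bits: torch.Tensor) -> torch.Tensor:
        x = bits.float()          # naive dequantization to {0, 1}
        return x + self.body(x)   # learned correction on top of it

qcnet = Compensator(channels=256)  # ResNet50's first stage outputs 256 channels

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)       # stand-in input image
    feat = node_head(image)                   # computed on the IoT node
    bits = quantize_1bit(feat)                # 1 bit per element sent to the server
    logits = server_tail(qcnet(bits))         # server: compensate, then finish
print(logits.shape)                           # torch.Size([1, 1000])
```

In the approach the abstract describes, the compensator is used alongside fine-tuning of the server-side model while the node-side portion is left unchanged, so no additional computation lands on the IoT node; the sketch above omits that training step.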