Image-specific Bit Allocation Optimization for Multiscale Feature Coding for Machines



Bibliographic Details
Published in: 2025 International Symposium on Machine Learning and Media Computing (MLMC), pp. 1-5
Main Authors: Liu, Junle; Zhang, Yun; Huang, Qinhao; Xu, Long
Format: Conference Proceedings
Language: English
Published: IEEE, 28.07.2025
Subjects:
Online Access: Full text
Description
Summary: As machines increasingly consume visual content in place of humans, developing compression methods tailored to machine vision models is critical. In this work, to minimize the coding bitrate while maintaining machine vision task accuracy, we propose an image-specific bit allocation (ISBA) optimization for multiscale feature coding for machines, in which an image-specific task loss-rate model characterizes the relationship between task accuracy degradation and compression bitrate for each individual image. Based on the ISBA, adaptive weights are assigned to the multiscale features according to their instance-specific importance to the machine vision tasks. The proposed task loss-rate model then maps task accuracy to bitrate for each image. By formulating and solving an optimization problem, our method allocates distinct compression qualities to different feature scales, yielding a more efficient encoding strategy. Experiments show that, when combined with Efficient Learned Image Compression (ELIC), the ISBA achieves more effective compression than the MFIBA method, with average bitrate savings of 16.843% for object detection, 15.825% for instance segmentation, and 16.843% for keypoint detection compared to the ELIC anchor, validating its generalizability across diverse machine vision tasks.
DOI:10.1109/MLMC65154.2025.11189897
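To make the allocation idea in the abstract concrete, the following Python sketch illustrates one plausible reading of it; it is not the authors' implementation. The number of scales, the per-scale rates, the importance weights, and the exponential task loss-rate model are all illustrative assumptions, and the per-scale quality assignment is found by exhaustive search over a few discrete quality levels rather than by the paper's actual optimization.

# Hypothetical sketch: image-specific bit allocation across multiscale features.
# All models, parameters, and names are illustrative assumptions, not the paper's method.
import itertools
import math

NUM_SCALES = 4                    # assumed number of multiscale feature levels
QUALITY_LEVELS = [0, 1, 2, 3]     # assumed discrete coder quality indices

# Assumed per-scale bitrate (bits per pixel) for each quality level.
rate = {s: [0.05 * (q + 1) * (s + 1) for q in QUALITY_LEVELS] for s in range(NUM_SCALES)}

# Assumed image-specific importance weights for each scale (standing in for the
# instance-specific importance described in the abstract).
weights = [0.4, 0.3, 0.2, 0.1]

def task_loss(scale: int, quality: int) -> float:
    """Hypothetical task loss-rate model: task loss decays exponentially with rate."""
    return math.exp(-3.0 * rate[scale][quality])

def allocate(max_weighted_loss: float):
    """Pick one quality level per scale minimizing total rate under a task-loss budget."""
    best = None
    for combo in itertools.product(QUALITY_LEVELS, repeat=NUM_SCALES):
        total_rate = sum(rate[s][q] for s, q in enumerate(combo))
        weighted_loss = sum(w * task_loss(s, q) for s, (w, q) in enumerate(zip(weights, combo)))
        if weighted_loss <= max_weighted_loss and (best is None or total_rate < best[0]):
            best = (total_rate, combo)
    return best

print(allocate(max_weighted_loss=0.6))

In this toy form, tightening the loss budget or raising a scale's weight pushes more bits toward that scale, which is the qualitative behavior the abstract attributes to the ISBA.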