Image-specific Bit Allocation Optimization for Multiscale Feature Coding for Machines
| Published in: | 2025 International Symposium on Machine Learning and Media Computing (MLMC) pp. 1 - 5 |
|---|---|
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 28.07.2025 |
| Summary: | As machines increasingly consume visual content instead of humans, developing compression methods tailored for machine vision models is critical. In this work, to minimize the coding bitrate while maintaining machine vision task accuracy, we propose an image-specific bit allocation (ISBA) optimization for multiscale feature coding for machines, in which an image-specific task loss-rate model characterizes the relationship between task accuracy degradation and compression bitrate for each individual image. Under ISBA, adaptive weights are assigned to multiscale features according to their instance-specific importance to machine vision tasks. The proposed task loss-rate model then maps task accuracy to bitrate for each image. By formulating and solving an optimization problem, our method allocates distinct compression qualities to different feature scales, yielding a more efficient encoding strategy. Experiments show that, when combined with Efficient Learned Image Compression (ELIC), ISBA compresses more effectively than the MFIBA method and achieves average bitrate savings of 16.843% for object detection, 15.825% for instance segmentation, and 16.843% for keypoint detection compared to the anchor ELIC baseline, validating its generalizability across diverse machine vision tasks. |
|---|---|
| DOI: | 10.1109/MLMC65154.2025.11189897 |
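
The summary describes choosing a separate compression quality for each feature scale so that bitrate is minimized while task accuracy is maintained. The record does not give the paper's task loss-rate model or its solver, so the following is only a minimal sketch of that kind of allocation problem under stated assumptions: the per-scale rate and loss tables, the importance weights, the loss budget, and the function name `allocate_qualities` are all hypothetical, and a plain exhaustive search stands in for whatever optimization the authors actually use.

```python
from itertools import product


def allocate_qualities(rate, loss, weights, loss_budget):
    """Pick one quality index per feature scale (illustrative only).

    rate[s][q]    hypothetical bits spent on scale s at quality level q
    loss[s][q]    hypothetical task-loss contribution of scale s at level q
    weights[s]    hypothetical image-specific importance weight of scale s
    loss_budget   hypothetical upper bound on the weighted task loss

    Returns (choice, total_rate), or (None, inf) if no choice fits the budget.
    """
    best_choice, best_rate = None, float("inf")
    # Exhaustive search: acceptable for a handful of scales and quality levels.
    for choice in product(*(range(len(levels)) for levels in rate)):
        weighted_loss = sum(
            weights[s] * loss[s][q] for s, q in enumerate(choice)
        )
        if weighted_loss > loss_budget:
            continue  # this combination degrades the task too much
        total_rate = sum(rate[s][q] for s, q in enumerate(choice))
        if total_rate < best_rate:
            best_choice, best_rate = choice, total_rate
    return best_choice, best_rate


# Made-up numbers for three feature scales with three quality levels each.
example_rate = [[100, 200, 400], [50, 100, 200], [20, 40, 80]]
example_loss = [[0.30, 0.10, 0.02], [0.20, 0.08, 0.01], [0.10, 0.05, 0.01]]
example_weights = [0.6, 0.3, 0.1]

choice, bits = allocate_qualities(
    example_rate, example_loss, example_weights, loss_budget=0.10
)
print(choice, bits)
```

With these made-up tables, the heavily weighted first scale cannot use its lowest quality without exceeding the budget, while the lightly weighted third scale can, which is the intuition behind giving different scales different qualities per image.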