Image-specific Bit Allocation Optimization for Multiscale Feature Coding for Machines

Bibliographic details
Published in:	2025 International Symposium on Machine Learning and Media Computing (MLMC), pp. 1-5
Main authors:	Liu, Junle; Zhang, Yun; Huang, Qinhao; Xu, Long
Format:	Conference paper
Language:	English
Published:	IEEE, 28 July 2025
Subjects:	Accuracy; Adaptation models; bit allocation; Bit rate; Degradation; feature coding; Image coding; Image coding for machines; Instance segmentation; keypoint detection; Machine vision; Object detection; Optimization; Visualization

Description
Summary:	As machines increasingly consume visual content instead of humans, developing compression methods tailored to machine vision models is critical. In this work, to minimize the coding bitrate while maintaining machine vision task accuracy, we propose an image-specific bit allocation (ISBA) optimization for multiscale feature coding for machines, in which an image-specific task loss-rate model characterizes the relationship between task accuracy degradation and compression bitrate for each individual image. Under ISBA, adaptive weights are assigned to multiscale features according to their instance-specific importance to machine vision tasks, and the task loss-rate model maps task accuracy to bitrate for each image. By formulating and solving an optimization problem, our method allocates distinct compression qualities to different feature scales, yielding a more efficient encoding strategy. Experiments show that, when combined with Efficient Learned Image Compression (ELIC), ISBA compresses more effectively than the MFIBA method and achieves average bitrate savings of 16.843% for object detection, 15.825% for instance segmentation, and 16.843% for keypoint detection compared to the anchor ELIC baseline, validating its generalizability across diverse machine vision tasks.
DOI:	10.1109/MLMC65154.2025.11189897
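
The per-image allocation described in the summary (choosing a distinct compression quality for each feature scale so that bitrate is minimized while task loss stays bounded) can be illustrated with a minimal sketch. This is not the authors' formulation or solver: the record does not give the form of the task loss-rate model, so the sketch assumes hypothetical per-scale lookup tables `rates` and `losses`, assumes that the weighted task loss adds up across scales, and swaps in a plain exhaustive search. The function name `allocate_qualities` and the parameter `max_loss` are likewise illustrative.

```python
from itertools import product


def allocate_qualities(rates, losses, max_loss):
    """Pick one quality index per feature scale for a single image.

    rates[s][q]  : hypothetical bits needed to code scale s at quality q
    losses[s][q] : hypothetical weighted task-loss contribution of scale s
                   at quality q (assumed additive across scales)
    max_loss     : tolerated total task loss for this image

    Returns (qualities, total_rate) for the cheapest feasible combination,
    or None when no combination satisfies the loss bound.
    """
    best = None
    # Exhaustive search over every combination of per-scale quality indices.
    for combo in product(*(range(len(r)) for r in rates)):
        total_loss = sum(losses[s][q] for s, q in enumerate(combo))
        if total_loss > max_loss:
            continue  # violates the task-loss constraint
        total_rate = sum(rates[s][q] for s, q in enumerate(combo))
        if best is None or total_rate < best[1]:
            best = (combo, total_rate)
    return best
```

With two scales and three quality levels, `allocate_qualities([[10, 20, 40], [5, 10, 20]], [[0.5, 0.2, 0.1], [0.3, 0.1, 0.05]], 0.35)` returns `((1, 1), 30)`: the middle quality for both scales is the cheapest combination that keeps the summed loss within the bound. Exhaustive search is only practical for a handful of scales and quality levels; it is used here for clarity rather than efficiency.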