Enhancing Image Coding for Machines with Compressed Feature Residuals

Bibliographic details
Published in: 2021 IEEE International Symposium on Multimedia (ISM), pp. 217-225
Main authors: Seppala, Joni; Zhang, Honglei; Le, Nam; Youvalari, Ramin G.; Cricri, Francesco; Tavakoli, Hamed Rezazadegan; Aksu, Emre; Hannuksela, Miska M.; Rahtu, Esa
Format: Conference paper
Language: English
Published: IEEE, 1 November 2021
Description
Abstract: As computer vision technologies have improved tremendously over the last decade, videos and images are often consumed by machines rather than by humans, who are the main target of traditional video codecs. In many use cases, although machines are the main consumers, human involvement is also required, or even mandatory. In this paper, we propose a novel image coding technique targeted at machines that maintains the capability for human consumption. Our proposed codec generates two bitstreams: the human bitstream, produced by a traditional codec and optimized for human consumption, and the machine bitstream, generated by an end-to-end learned neural network-based codec and optimized for machine tasks. Instead of working in the image domain, the proposed machine bitstream is derived from feature residuals: the difference between the features extracted from the input image and the features extracted from the reconstructed image generated by the traditional codec. With the help of the machine bitstream, we can significantly improve machine task performance in the low bitrate range. Our system outperforms the state-of-the-art traditional codec, Versatile Video Coding (VVC/H.266), achieving a Bjøntegaard delta bitrate of −40.5% on average for bitrates up to 0.07 BPP.
DOI: 10.1109/ISM52913.2021.00044
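
The feature-residual idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_traditional_codec` (coarse quantization) and `toy_feature_extractor` (2x2 average pooling) are hypothetical stand-ins for the traditional codec and the vision network, and the residual is left uncompressed here, whereas the paper codes it with a learned neural codec.

```python
import numpy as np

def toy_traditional_codec(image, step=32):
    """Hypothetical stand-in for a traditional codec: coarse uniform quantization."""
    return np.clip(np.round(image / step) * step, 0, 255)

def toy_feature_extractor(image):
    """Hypothetical stand-in for a vision backbone: 2x2 average pooling, scaled to [0, 1]."""
    h, w = image.shape
    pooled = image.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return pooled / 255.0

def feature_residual(image, step=32):
    """Return (residual, reconstructed-image features) for one grayscale image."""
    reconstructed = toy_traditional_codec(image, step)   # content of the "human bitstream"
    f_in = toy_feature_extractor(image)                  # features of the input image
    f_rec = toy_feature_extractor(reconstructed)         # features of the reconstruction
    return f_in - f_rec, f_rec                           # residual is what the machine bitstream would carry

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
residual, f_rec = feature_residual(image)

# Decoder side: reconstruction features plus the residual give back the input features
# (exactly, in this sketch, because the residual is not compressed).
restored = f_rec + residual
```

In this toy setting the residual is small relative to the features themselves, which hints at why coding the residual can cost fewer bits than coding the features directly; the actual gains reported in the abstract depend on the learned codec and task network used in the paper.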