Semantic-oriented learning-based image compression by Only-Train-Once quantized autoencoders

Bibliographic details
Published in: Signal, Image and Video Processing, Vol. 17, No. 1, pp. 285-293
Main authors: Sebai, D., Shah, A. Ulah
Format: Journal Article
Language: English
Published: London: Springer London, 1 February 2023
Springer Nature B.V
ISSN:1863-1703, 1863-1711
Description
Summary: Access to large training datasets, together with recent advances in computing power, has sparked interest in leveraging deep learning for image compression. Rate adaptation normally requires training and deploying a separate network for each rate, which is impractical and expensive in terms of memory cost and power consumption, especially for broad bitrate ranges. To deal with this limitation, variable-rate compression methods use the Lagrange multiplier to control the rate/distortion trade-off, so that the neural network does not have to be retrained for each rate. However, they do not optimize bit allocation for eye-catching foreground details, and they do not account for the different degrees of attention that the human eye pays to each area of the image. Other deep learning-based image compression approaches, which could outperform the above ones, therefore rely on additional information. In this paper, we present a loss-conditional autoencoder tailored to the specific task of semantic image understanding to achieve higher visual quality in lossy variable-rate compression. Our framework is a neural network-based scheme that automatically optimizes coding parameters with a multi-term perceptual loss function based on a semantic-important structural similarity (SSIM) index. To ensure rate adaptation, we suggest modulating the compression network on the bitwidth of its activations by quantizing them according to several bitwidth values. Experiments on the JPEG AI dataset show that our method achieves competitive performance and higher visual quality at the same compressed size when compared with conventional codecs and related work.
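The summary's idea of quantizing activations at several bitwidths can be illustrated with a small sketch. This record does not describe the paper's actual quantizer, so the code below is only a generic uniform quantizer under the assumption that activations lie in [0, 1]; the names `quantize_activations` and `max_abs_error` are hypothetical and chosen for illustration.

```python
def quantize_activations(values, bitwidth):
    """Uniformly quantize values to 2**bitwidth levels.

    Assumption (not from the paper): activations are clipped to [0, 1].
    """
    if bitwidth < 1:
        raise ValueError("bitwidth must be at least 1")
    levels = (1 << bitwidth) - 1  # highest integer code
    quantized = []
    for v in values:
        clipped = min(max(v, 0.0), 1.0)   # clip to the assumed range
        code = round(clipped * levels)    # integer code in 0..levels
        quantized.append(code / levels)   # map the code back to [0, 1]
    return quantized


def max_abs_error(values, bitwidth):
    """Largest reconstruction error after quantization at this bitwidth."""
    restored = quantize_activations(values, bitwidth)
    return max(
        abs(min(max(v, 0.0), 1.0) - r) for v, r in zip(values, restored)
    )


if __name__ == "__main__":
    activations = [0.0, 0.1, 0.4, 0.9, 1.0]
    for bits in (1, 2, 4, 8):
        print(bits, max_abs_error(activations, bits))
```

Fewer bits per activation mean a smaller code and a larger reconstruction error, which is the general trade-off a single network can exploit when it is run at several bitwidths.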
DOI:10.1007/s11760-022-02231-1