Semantic-oriented learning-based image compression by Only-Train-Once quantized autoencoders


Bibliographic details
Published in: Signal, Image and Video Processing, Vol. 17, No. 1, pp. 285-293
Main authors: Sebai, D., Shah, A. Ulah
Format: Journal Article
Language: English
Published: London: Springer London, 01.02.2023
Springer Nature B.V
ISSN: 1863-1703, 1863-1711
Online access: Full text
Description
Abstract: Access to large training datasets, together with recent advances in computing power, has sparked interest in leveraging deep learning for image compression. Such approaches require training and deploying a separate network for each rate, which is impractical and costly in terms of memory and power consumption, especially over broad bitrate ranges. To address this limitation, variable-rate compression methods use a Lagrange multiplier to control the rate/distortion trade-off so that the neural network need not be retrained for each rate. However, these methods do not optimize bit allocation for eye-catching foreground details, nor do they account for the varying degrees of attention the human eye pays to different areas of an image. Other deep learning-based image compression approaches, which can outperform the above methods, therefore rely on additional information. In this paper, we present a loss-conditional autoencoder tailored to the specific task of semantic image understanding to achieve higher visual quality in lossy variable-rate compression. Our framework is a neural network-based scheme able to automatically optimize coding parameters with a multi-term perceptual loss function based on a semantic-important Structural SIMilarity (SSIM) index. To ensure rate adaptation, we propose modulating the compression network on the bitwidth of its activations by quantizing them according to several bitwidth values. Experiments on the JPEG AI dataset show that our method achieves competitive and higher visual quality at the same compressed size compared to conventional codecs and related work.
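The rate-adaptation idea in the abstract, quantizing activations at several bitwidths so one network can serve several rates, can be illustrated with a minimal sketch. This is a generic uniform quantizer, not the authors' implementation; the function name and the min/max scaling scheme are illustrative assumptions:

```python
import numpy as np

def quantize_activations(x, bitwidth):
    """Illustrative uniform quantizer: map activations onto 2**bitwidth levels.

    Lower bitwidths coarsen the representation (fewer distinct values),
    which lowers the rate at the cost of distortion.
    """
    levels = 2 ** bitwidth - 1
    x_min, x_max = float(x.min()), float(x.max())
    if x_max == x_min:
        return x.copy()  # constant input: nothing to quantize
    scale = (x_max - x_min) / levels
    q = np.round((x - x_min) / scale)  # integer level indices
    return q * scale + x_min           # dequantized values

# Sweeping the bitwidth trades rate for fidelity on the same activations.
acts = np.linspace(0.0, 1.0, 9)
for b in (2, 4, 8):
    print(b, np.unique(quantize_activations(acts, b)).size)
```

In a variable-rate codec along these lines, the bitwidth would act as the conditioning knob: the same encoder weights are reused, and only the activation quantization changes per target rate.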
DOI:10.1007/s11760-022-02231-1