Ensemble Learning-Based Rate-Distortion Optimization for End-to-End Image Compression

Bibliographic details
Published in:	IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 3, pp. 1193-1207
Main authors:	Wang, Yefei; Liu, Dong; Ma, Siwei; Wu, Feng; Gao, Wen
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 1 March 2021; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Adaptation models; Coders; Coding; Decoding; Distortion; Encoding-Decoding; Ensemble learning; Entropy coding; Floating point arithmetic; Geometric transformation; Image coding; Image compression; Modal choice; Optimization; Rate-distortion; Rate-distortion optimization; Transforms
ISSN:	1051-8215, 1558-2205

Description
Summary:	End-to-end image compression using trained deep networks as encoding/decoding models has advanced substantially in recent years. Previous work is limited to a single encoding/decoding model, whereas we explore the use of multiple encoding/decoding models as an ensemble. We propose several methods to obtain multiple models. First, we adopt the boosting strategy to train multiple diverse networks as an ensemble. Second, we train an ensemble of multiple probability distribution models to reduce the distribution gap for efficient entropy coding. Third, we present a geometric transform-based self-ensemble method. The multiple models can be regarded as multiple coding modes, similar to those in non-deep video coding schemes. We further adopt block-level model/mode selection at the encoder side to pursue rate-distortion optimization, using hierarchical block partitioning to improve the adaptation ability. Compared with single-model end-to-end compression, our proposed method improves compression efficiency significantly, achieving a 21% BD-rate reduction on the Kodak dataset without increasing the decoding complexity. Alternatively, at the same compression efficiency, our method can use much simpler decoding models, with floating-point operations reduced by 70%.
DOI:	10.1109/TCSVT.2020.3000331
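
Illustrative note: the block-level model/mode selection mentioned in the summary can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the standard Lagrangian rate-distortion cost J = D + lambda * R (the record does not give the exact cost used in the paper), and it uses toy uniform quantizers as hypothetical stand-ins for the learned encoding/decoding models. All names (make_quantizer, select_modes, lam) are invented for illustration, and hierarchical block partitioning is not modeled.

```python
from typing import Callable, List, Tuple

Block = List[float]
# Hypothetical "codec": maps a block to (reconstruction, estimated bits).
Codec = Callable[[Block], Tuple[Block, float]]


def make_quantizer(step: float, bits_per_sample: float) -> Codec:
    """Toy stand-in for a learned encoder/decoder pair (assumption):
    uniform quantization with a fixed, made-up rate per sample."""
    def codec(block: Block) -> Tuple[Block, float]:
        recon = [round(x / step) * step for x in block]
        return recon, bits_per_sample * len(block)
    return codec


def mse(a: Block, b: Block) -> float:
    """Mean squared error between two blocks of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def select_modes(blocks: List[Block], codecs: List[Codec], lam: float) -> List[int]:
    """For each block, return the index of the codec minimizing J = D + lam * R."""
    choices = []
    for block in blocks:
        costs = []
        for codec in codecs:
            recon, bits = codec(block)
            costs.append(mse(block, recon) + lam * bits)
        # Index of the smallest cost; ties resolve to the earliest codec.
        choices.append(min(range(len(codecs)), key=costs.__getitem__))
    return choices


if __name__ == "__main__":
    blocks = [[0.0, 0.0, 0.0, 0.0], [0.3, 1.7, 2.2, 3.9]]
    codecs = [make_quantizer(4.0, 1.0),   # coarse and cheap
              make_quantizer(0.5, 4.0)]   # fine and expensive
    print(select_modes(blocks, codecs, lam=0.01))
```

In this toy setting, a small lam favors the finer quantizer on the textured block, while a large lam pushes every block toward the cheaper mode. The encoder would signal the chosen index per block so that the decoder runs only the selected model.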