A Robust Quality Enhancement Method Based on Joint Spatial-Temporal Priors for Video Coding


Detailed Bibliography
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 31, No. 6, pp. 2401-2414
Main authors: Meng, Xiandong; Deng, Xuan; Zhu, Shuyuan; Zhang, Xinfeng; Zeng, Bing
Format: Journal Article
Language: English
Publication details: New York: IEEE, 01.06.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1051-8215, 1558-2205
Description
Summary: Quality enhancement of HEVC-compressed videos has attracted considerable attention in recent years. In this article, we propose a robust multi-frame guided attention network (MGANet) to reconstruct high-quality frames from HEVC-compressed videos. In our network, we first use an advanced motion flow algorithm to estimate the motion information of the input frames, which guides the warping of adjacent frames. After performing this alignment, we find that large residuals still appear around the edges of moving objects in the warped frames. We therefore design a temporal encoder based on a bi-directional convolutional long short-term memory (ConvLSTM) with a residual structure to further capture the variations between the current frame and its adjacent warped frames. Finally, we feed the extracted temporal information and a partitioned average image (PAI) to a multi-scale guided encoder-decoder subnet that reconstructs high-quality frames. Each PAI is generated from the transform unit (TU) partitioning map, which can be extracted directly from the coded bit-stream, enabling our network to focus on TU boundaries while optimizing the global content. We present extensive experimental results demonstrating the robustness of our method, especially for high-bit-rate coding and large-motion scenes. Owing to its lightweight design, the proposed MGANet also achieves a very competitive inference time.
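The first step the abstract describes, warping an adjacent frame toward the current frame using an estimated per-pixel motion flow, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `warp_frame`, the nearest-neighbor sampling, and the toy inputs are illustrative assumptions (real pipelines typically use bilinear sampling of learned optical flow).

```python
import numpy as np

def warp_frame(frame, flow):
    """Backward-warp a grayscale frame by a per-pixel motion flow.

    frame: (H, W) array of intensities.
    flow:  (H, W, 2) array of (dx, dy) displacements pointing from each
           pixel of the current frame into the adjacent frame.
    Nearest-neighbor sampling with border clamping is used here for
    simplicity; bilinear interpolation is the common choice in practice.
    """
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

# Toy usage: a uniform one-pixel rightward flow shifts the frame left,
# with the right border clamped (replicated).
frame = np.arange(16, dtype=float).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
warped = warp_frame(frame, flow)
```

The residuals that remain after this alignment, largest at moving-object edges where the flow is unreliable, are what the ConvLSTM-based temporal encoder in the abstract is designed to model.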
DOI: 10.1109/TCSVT.2020.3019919