Video coding using a self-adaptive redundant dictionary consisting of spatial and temporal prediction candidates

Bibliographic details
Published in:	Proceedings (IEEE International Conference on Multimedia and Expo), pp. 1-6
Main authors:	Yuanyi Xue, Yao Wang
Format:	Conference paper
Language:	English
Publication details:	IEEE 01.07.2014
Subjects:	Dictionaries; Discrete cosine transforms; Encoding; Matching pursuit algorithms; Quantization (signal); Vectors
ISSN:	1945-7871

Description
Summary:	All standard video coders are based on the prediction-plus-transform representation of an image block, which predicts the current block using various intra- and inter-prediction modes and then represents the prediction error using a fixed orthonormal transform. We propose to directly represent a mean-removed block using a redundant dictionary consisting of all possible inter-prediction candidates with integer motion vectors (mean-removed). In general, the dictionary may also contain some intra-prediction candidates and some pre-designed fixed dictionary atoms. However, the simulation results reported in this paper are obtained using the inter-prediction candidates only. We determine the coefficients by minimizing the L0 norm of the coefficients subject to a constraint on the sparse approximation error. We show that using such a self-adaptive dictionary can lead to a very sparse representation, with significantly fewer non-zero coefficients than applying the DCT to the prediction error. We further propose a modified orthogonal matching pursuit (OMP) algorithm which orthonormalizes each newly chosen atom with respect to all previously chosen and orthonormalized atoms. Each image block is represented by the quantized coefficients corresponding to the orthonormalized atoms, to overcome the inefficiency associated with using non-orthonormal atoms. Each image block is represented by its mean, which is predictively coded, the indices of the chosen atoms, and the quantized coefficients. Each variable is coded based on its unconditional distribution. Simulation results show that the proposed coder can achieve significant gain over the H.264 coder (implemented using x264) and achieve similar performance compared to the HEVC reference encoder (HM).
DOI:	10.1109/ICME.2014.6890314
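
The modified OMP step described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `orthonormal_omp`, its parameters `max_atoms` and `tol`, and the stopping rule are assumptions made for illustration, and quantization, mean coding, and entropy coding are omitted. The sketch greedily picks the dictionary atom most correlated with the current residual, orthonormalizes it against the atoms already chosen (Gram-Schmidt), and stores the coefficient with respect to the orthonormalized atom.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_omp(block, atoms, max_atoms, tol):
    """Greedy sparse approximation with orthonormalized chosen atoms.

    block: mean-removed, vectorized image block (list of floats)
    atoms: candidate vectors, e.g. mean-removed prediction candidates
    max_atoms, tol: hypothetical stopping parameters (sparsity cap and
        residual-norm threshold), assumed here for illustration
    Returns (indices, coeffs, basis, residual).
    """
    residual = list(block)
    indices, coeffs, basis = [], [], []
    while len(indices) < max_atoms and math.sqrt(dot(residual, residual)) > tol:
        # Select the unused atom with the largest normalized correlation.
        best_idx, best_score = None, 0.0
        for i, atom in enumerate(atoms):
            norm = math.sqrt(dot(atom, atom))
            if i in indices or norm < 1e-12:
                continue
            score = abs(dot(residual, atom)) / norm
            if score > best_score:
                best_idx, best_score = i, score
        if best_idx is None:
            break  # no remaining atom reduces the residual
        # Orthonormalize the chosen atom against all previously chosen ones.
        q = list(atoms[best_idx])
        for b in basis:
            p = dot(q, b)
            q = [x - p * y for x, y in zip(q, b)]
        q_norm = math.sqrt(dot(q, q))
        if q_norm < 1e-12:
            break  # chosen atom lies in the span of the current basis
        q = [x / q_norm for x in q]
        # Coefficient with respect to the orthonormalized atom.
        c = dot(residual, q)
        indices.append(best_idx)
        coeffs.append(c)
        basis.append(q)
        residual = [r - c * x for r, x in zip(residual, q)]
    return indices, coeffs, basis, residual
```

Because the stored coefficients refer to orthonormal directions, the approximation is simply the sum of each coefficient times its basis vector, and in a full coder each coefficient could be quantized independently. A decoder would need to repeat the same orthonormalization from the transmitted atom indices.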