Video coding using a self-adaptive redundant dictionary consisting of spatial and temporal prediction candidates


Bibliographic details
Published in: Proceedings (IEEE International Conference on Multimedia and Expo), pp. 1-6
Main authors: Yuanyi Xue, Yao Wang
Format: Conference proceeding
Language: English
Published: IEEE, 1 July 2014
ISSN: 1945-7871
Online access: Full text
Description
Summary: All standard video coders are based on the prediction-plus-transform representation of an image block: the coder predicts the current block using various intra- and inter-prediction modes and then represents the prediction error using a fixed orthonormal transform. We propose to directly represent a mean-removed block using a redundant dictionary consisting of all possible mean-removed inter-prediction candidates with integer motion vectors. In general, the dictionary may also contain some intra-prediction candidates and some pre-designed fixed dictionary atoms. However, the simulation results reported in this paper are obtained using the inter-prediction candidates only. We determine the coefficients by minimizing their L0 norm subject to a constraint on the sparse approximation error. We show that such a self-adaptive dictionary can lead to a very sparse representation, with significantly fewer non-zero coefficients than applying the DCT to the prediction error. We further propose a modified orthogonal matching pursuit (OMP) algorithm that orthonormalizes each newly chosen atom with respect to all previously chosen and orthonormalized atoms. The quantized coefficients correspond to the orthonormalized atoms, which overcomes the inefficiency associated with using non-orthonormal atoms. Each image block is represented by its mean, which is predictively coded, the indices of the chosen atoms, and the quantized coefficients. Each variable is coded based on its unconditional distribution. Simulation results show that the proposed coder achieves a significant gain over the H.264 coder (implemented using x264) and performance similar to that of the HEVC reference encoder (HM).
DOI: 10.1109/ICME.2014.6890314
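
Illustration: The two ideas in the summary, a dictionary built from mean-removed integer-motion candidates and an OMP variant that orthonormalizes each newly chosen atom, can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`build_dictionary`, `orthonormal_omp`), the square search window, the unit-norm scaling of atoms, the correlation-based atom selection, and the stopping rule on squared error are all assumptions made for illustration. Quantization and entropy coding are omitted.

```python
import numpy as np


def build_dictionary(ref, top, left, size, search):
    """Collect mean-removed candidate blocks from the reference frame `ref`
    at every integer displacement within +/- `search` pixels (assumed window)."""
    atoms, mvs = [], []
    h, w = ref.shape
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > h or x + size > w:
                continue  # candidate falls outside the reference frame
            cand = ref[y:y + size, x:x + size].astype(float).ravel()
            cand -= cand.mean()  # mean removal, as in the summary
            norm = np.linalg.norm(cand)
            if norm > 1e-9:  # skip flat candidates (assumption)
                atoms.append(cand / norm)  # unit norm is an assumption
                mvs.append((dy, dx))
    # Columns are atoms; raises ValueError if no candidate survived.
    return np.stack(atoms, axis=1), mvs


def orthonormal_omp(block, D, max_err, max_atoms):
    """Greedy pursuit in which each chosen atom is orthonormalized
    (Gram-Schmidt) against the previously chosen, orthonormalized atoms.
    Coefficients are taken with respect to the orthonormalized atoms."""
    mean = float(block.mean())
    residual = block.astype(float).ravel() - mean
    basis, chosen, coeffs = [], [], []
    while len(chosen) < max_atoms and residual @ residual > max_err:
        corr = np.abs(D.T @ residual)  # selection rule is an assumption
        if chosen:
            corr[chosen] = 0.0  # never pick the same atom twice
        k = int(np.argmax(corr))
        q = D[:, k].copy()
        for b in basis:  # remove components along earlier basis vectors
            q -= (q @ b) * b
        norm = np.linalg.norm(q)
        if norm < 1e-9:  # atom depends linearly on the chosen set
            break
        q /= norm
        c = float(q @ residual)
        basis.append(q)
        chosen.append(k)
        coeffs.append(c)
        residual = residual - c * q
    Q = np.stack(basis, axis=1) if basis else np.zeros((residual.size, 0))
    return mean, chosen, np.array(coeffs), Q
```

With these hypothetical helpers, a block would be approximated as `mean + Q @ coeffs`. A decoder that knows the chosen atom indices can rebuild `Q` by repeating the same Gram-Schmidt steps, so only the mean, the indices, and the coefficients need to be transmitted.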