Field programmable gate array implementation of variable-bins high efficiency video coding CABAC decoder with path delay optimisation

Bibliographic details
Published in: IET Image Processing, Vol. 13, No. 6, pp. 954-963
Main authors: Menasri, Wahiba; Skoudarli, Abdellah; Belhadj, Aichouche; Azzaz, Mohamed Salah
Format: Journal Article
Language: English
Published: The Institution of Engineering and Technology, 1 May 2019
ISSN: 1751-9659, 1751-9667
Description
Summary: Context-based adaptive binary arithmetic coding (CABAC) is the single entropy coding mode in the latest video coding standard, High Efficiency Video Coding (HEVC). For high-resolution applications, a throughput of one bin/cycle is not sufficient, and it is very challenging to implement a pipelined and/or parallel CABAC decoding architecture by simply adding more stages. Indeed, the tight data dependencies make CABAC difficult to parallelise and cause it to be a throughput bottleneck for video decoding. Consequently, in order to improve the CABAC decoder throughput, parallel and pipeline architectures are used in the authors' design. In this work, an algorithm-architecture adequation is proposed to implement a CABAC decoder on a field programmable gate array. Mainly, a new classification of 32 syntax elements is given to speed up the authors' solution. Furthermore, the context selection and modelling of regular syntax elements are studied, designed and implemented. Finally, a novel memory rearrangement technique is proposed to reduce the critical path delay required to process each binary symbol. As a result, the implementation can process 2.2 bins/cycle when operated at 123.49 MHz and achieves an improved throughput of 271.678 Mbins/s. The hardware architecture is coded in a hardware description language and synthesised using Xilinx ISE tools targeting the Virtex-4 platform.
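The throughput figure in the summary follows from multiplying the average number of bins decoded per clock cycle by the clock frequency. The short Python sketch below only reproduces that arithmetic; the function name `throughput_mbins_per_s` is a hypothetical helper chosen for illustration and is not taken from the paper.

```python
def throughput_mbins_per_s(bins_per_cycle: float, clock_mhz: float) -> float:
    """Average throughput in Mbins/s = (bins per cycle) x (clock frequency in MHz)."""
    return bins_per_cycle * clock_mhz


# Figures quoted in the summary: 2.2 bins/cycle at 123.49 MHz.
reported = throughput_mbins_per_s(2.2, 123.49)

# Hypothetical baseline for comparison: one bin per cycle at the same clock.
single_bin = throughput_mbins_per_s(1.0, 123.49)

print(f"{reported:.3f} Mbins/s at 2.2 bins/cycle")
print(f"{single_bin:.3f} Mbins/s at 1 bin/cycle")
```

At the same clock frequency, the gain over a one-bin-per-cycle decoder equals the bins/cycle ratio, which is why the design effort targets both the number of bins per cycle and the critical path delay that bounds the clock.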
DOI: 10.1049/iet-ipr.2018.6336