A Variable-Clock-Cycle-Path VLSI Design of Binary Arithmetic Decoder for H.265/HEVC

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 2, pp. 556-560
Main Authors: Zhou, Jinjia; Zhou, Dajiang; Zhang, Shuping; Kimura, Shinji; Goto, Satoshi
Format: Journal Article
Language: English, Japanese
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.02.2018
ISSN: 1051-8215, 1558-2205
Description
Summary: The next-generation 8K ultra-high-definition video format involves an extremely high bit rate, which imposes a high throughput requirement on the entropy decoder component of a video decoder. Context adaptive binary arithmetic coding (CABAC) is the entropy coding tool in the latest video coding standards, including H.265/High Efficiency Video Coding (HEVC) and H.264/Advanced Video Coding (AVC). Due to critical data dependencies at the algorithm level, a CABAC decoder is difficult to accelerate by simply leveraging parallelism and pipelining. This letter presents a new very-large-scale integration (VLSI) design of the arithmetic decoder, which is the most critical bottleneck in CABAC decoding. Our design features a variable-clock-cycle-path architecture that exploits the differences in critical path delay and in probability of occurrence between various types of binary symbols (bins). The proposed design also incorporates a novel data-forwarding technique (rLPS forwarding) and a fast path-selection technique (coarse bin type decision), and is enhanced with the capability of processing additional bypass bins. As a result, its maximum throughput reaches 1010 Mbins/s in 90-nm CMOS, decoding 0.96 bins per clock cycle at a maximum clock rate of 1053 MHz, which outperforms previous works by 19.1%.
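The throughput figure in the summary follows from the two other reported numbers: average bins decoded per clock cycle multiplied by the clock rate. The short Python sketch below only checks that arithmetic. The function names are hypothetical, and the "implied baseline" rests on the assumption that the 19.1% gain is measured against the best prior throughput, which the summary does not state explicitly.

```python
def throughput_mbins_per_s(bins_per_cycle: float, clock_mhz: float) -> float:
    """Throughput in Mbins/s = average bins per clock cycle x clock rate in MHz."""
    return bins_per_cycle * clock_mhz


def implied_baseline(throughput: float, gain_percent: float) -> float:
    """Baseline throughput implied by a relative gain.

    Assumption (not stated in the summary): the gain is defined as
    (new - old) / old, measured against the best prior design.
    """
    return throughput / (1.0 + gain_percent / 100.0)


# Figures quoted in the summary: 0.96 bin/cycle at 1053 MHz.
reported = throughput_mbins_per_s(0.96, 1053.0)

# The summary rounds the product down to 1010 Mbins/s; use that rounded value here.
baseline = implied_baseline(1010.0, 19.1)

print(f"computed throughput: {reported:.2f} Mbins/s")
print(f"implied prior best (assumption): {baseline:.1f} Mbins/s")
```

The product is 1010.88 Mbins/s, consistent with the 1010 Mbins/s stated in the summary.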
DOI: 10.1109/tcsvt.2016.2614124