BVI-DVC: A Training Database for Deep Video Compression


Bibliographic Details
Published in: IEEE Transactions on Multimedia, Vol. 24, pp. 3847-3858
Main Authors: Ma, Di; Zhang, Fan; Bull, David R.
Format: Journal Article
Language: English
Published: Piscataway, NJ: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2022
ISSN: 1520-9210, 1941-0077
Description
Summary: Deep learning methods are increasingly being applied in the optimisation of video compression algorithms and can achieve significantly enhanced coding gains compared to conventional approaches. Such approaches often employ Convolutional Neural Networks (CNNs), which are trained on databases with relatively limited content coverage. In this paper, a new extensive and representative video database, BVI-DVC, is presented for training CNN-based video compression systems, with specific emphasis on machine learning tools that enhance conventional coding architectures, including spatial resolution and bit depth up-sampling, post-processing and in-loop filtering. BVI-DVC contains 800 sequences at various spatial resolutions from 270p to 2160p and has been evaluated on ten existing network architectures for four different coding tools. Experimental results show that this database produces significant improvements in coding gains over five existing (commonly used) image/video training databases under the same training and evaluation configurations. Across all tested coding modules and CNN architectures, the additional coding gains obtained by using the proposed database are up to 10.3% when assessed by PSNR and 8.1% when assessed by VMAF.
DOI: 10.1109/TMM.2021.3108943
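
The coding gains in the summary are reported against the PSNR and VMAF quality metrics. As an illustrative aside (not drawn from the paper itself), a minimal PSNR computation in Python with NumPy might look like the sketch below; the psnr helper and the synthetic frames are hypothetical stand-ins for decoded output and a reference sequence:

import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    # Peak signal-to-noise ratio in dB between two frames of equal shape.
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy usage with synthetic 8-bit luma planes (hypothetical; a real evaluation
# would compare a codec's decoded frames against the original source video).
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(270, 480), dtype=np.uint8)             # 270p-sized plane
noise = rng.integers(-3, 4, size=ref.shape)                             # mild distortion
dist = np.clip(ref.astype(np.int16) + noise, 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(ref, dist):.2f} dB")

For 10-bit content, such as the higher-resolution BVI-DVC sequences, the peak parameter would be 1023.0 rather than 255.0.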