High Parameter Frequency Resolution Encoding Scheme for Spatial Audio Objects Using Stacked Sparse Autoencoder

Bibliographic Details
Published in: Neural Processing Letters, Vol. 54, no. 2, pp. 817-833
Main Authors: Wu, Yulin; Hu, Ruimin; Wang, Xiaochen; Hu, Chenhao; Ke, Shanfa
Format: Journal Article
Language: English
Published: New York, Springer US, 01.04.2022
Springer Nature B.V.
ISSN: 1370-4621, 1573-773X
Description
Summary: Object-based audio systems have become common in recent years because they offer flexibility across many auditory scenarios, such as virtual reality games, interactive theater, and spatial audio communication. To save bitrate, multiple audio objects are compressed into a mono downmix signal plus side information parameters. However, the frequency resolution of the side information parameters is so low that it causes aliasing distortion. To overcome this issue, this paper proposes a new encoding scheme based on high parameter frequency resolution (224 sub-bands per frame). The high-resolution side information parameters are compressed and reconstructed by a stacked sparse autoencoder (SSAE) neural network and then used to recover the audio objects. The proposed method is compared against existing spatial audio object coding (SAOC) methods at the same overall bitrate, using both objective and subjective evaluations. The evaluation shows that the approach can deliver high-quality spatial audio objects.
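The downmix-plus-side-information idea in the summary can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`band_edges`, `encode_frame`, `decode_frame`), the uniform sub-band layout, the use of per-band energy ratios as the side information, and the Wiener-like gain used for recovery. The SSAE stage that compresses the parameters is not shown.

```python
import numpy as np


def band_edges(n_bins, n_bands):
    """Split n_bins frequency bins into n_bands contiguous, near-uniform sub-bands.

    Hypothetical layout; the paper's actual sub-band partition is not given here.
    """
    return np.linspace(0, n_bins, n_bands + 1).astype(int)


def encode_frame(obj_spectra, n_bands):
    """Encode one frame of object spectra, shape (n_objects, n_bins).

    Returns the mono downmix spectrum, the per-band energy ratios
    (the illustrative side information), and the band edges.
    """
    n_bins = obj_spectra.shape[1]
    edges = band_edges(n_bins, n_bands)
    downmix = obj_spectra.sum(axis=0)
    power = np.abs(obj_spectra) ** 2
    # Sum the power of each object inside each sub-band.
    band_energy = np.add.reduceat(power, edges[:-1], axis=1)
    total = band_energy.sum(axis=0, keepdims=True)
    # Share of each sub-band's energy that belongs to each object.
    ratios = band_energy / np.maximum(total, 1e-12)
    return downmix, ratios, edges


def decode_frame(downmix, ratios, edges):
    """Recover object spectra by applying per-band ratios as gains (assumed rule)."""
    widths = np.diff(edges)
    # Expand one value per sub-band to one value per frequency bin.
    gains = np.repeat(ratios, widths, axis=1)
    return gains * downmix[np.newaxis, :]
```

With 224 sub-bands, each gain covers only a few frequency bins, so the recovered objects can follow the spectral detail of the originals more closely than with a coarse partition. The cost is a larger parameter set per frame, which is the part the summary says is compressed by the SSAE.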
DOI: 10.1007/s11063-021-10659-8