High Parameter Frequency Resolution Encoding Scheme for Spatial Audio Objects Using Stacked Sparse Autoencoder

Bibliographic Details
Published in: Neural Processing Letters, Vol. 54, No. 2, pp. 817-833
Main Authors: Wu, Yulin; Hu, Ruimin; Wang, Xiaochen; Hu, Chenhao; Ke, Shanfa
Format: Journal Article
Language: English
Published: New York: Springer US, 1 April 2022
Springer Nature B.V
ISSN:1370-4621, 1573-773X
Description
Summary: Object-based audio systems have become common in recent years because they provide flexibility across many auditory scenarios, such as virtual reality games, interactive theater, and spatial audio communication. To save bitrate, multiple audio objects are compressed into a mono downmix signal plus side information parameters. However, the frequency resolution of the side information parameters is so low that it causes aliasing distortion. To overcome this issue, this paper proposes a new encoding scheme based on high parameter frequency resolution (224 sub-bands per frame). The high-resolution side information parameters are compressed and reconstructed by a stacked sparse autoencoder (SSAE) neural network and then used to recover the audio objects. The proposed method is compared against existing spatial audio object coding (SAOC) methods at the same overall bitrate, using both objective and subjective evaluation. The evaluation shows that the proposed approach recovers spatial audio objects at high quality.
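The role of parameter frequency resolution in the summary above can be illustrated with a toy NumPy sketch. It is not the paper's method and contains no SSAE: it assumes a simple scheme in which the side information is each object's share of the power in every sub-band, and the decoder applies that share as a gain to the mono downmix. The function names, the uniform band layout, and the coarse setting of 28 bands are all hypothetical; only the figure of 224 sub-bands comes from the summary.

```python
import numpy as np

def band_edges(n_bins, n_bands):
    # Uniform sub-band layout (an assumption; real codecs use perceptual bands).
    return np.round(np.linspace(0, n_bins, n_bands + 1)).astype(int)

def encode(objects, n_bands):
    """objects: (n_objects, n_bins) spectra of one frame."""
    downmix = objects.sum(axis=0)                      # mono downmix
    edges = band_edges(objects.shape[1], n_bands)
    power = np.abs(objects) ** 2
    # Power of every object in every sub-band: shape (n_objects, n_bands).
    band_power = np.stack(
        [power[:, a:b].sum(axis=1) for a, b in zip(edges[:-1], edges[1:])],
        axis=1,
    )
    total = np.maximum(band_power.sum(axis=0, keepdims=True), 1e-12)
    return downmix, band_power / total                 # side info: power shares

def decode(downmix, shares):
    n_objects, n_bands = shares.shape
    edges = band_edges(downmix.shape[0], n_bands)
    out = np.zeros((n_objects, downmix.shape[0]), dtype=downmix.dtype)
    for k, (a, b) in enumerate(zip(edges[:-1], edges[1:])):
        # Each object gets its share of the downmix in sub-band k.
        out[:, a:b] = shares[:, [k]] * downmix[a:b]
    return out

def reconstruction_error(objects, n_bands):
    downmix, shares = encode(objects, n_bands)
    return float(np.mean(np.abs(decode(downmix, shares) - objects) ** 2))

# Two synthetic objects that occupy interleaved frequency bins.
rng = np.random.default_rng(0)
n_bins = 224
objects = np.zeros((2, n_bins))
objects[0, 0::2] = rng.normal(size=n_bins // 2)
objects[1, 1::2] = rng.normal(size=n_bins // 2)

err_coarse = reconstruction_error(objects, n_bands=28)   # hypothetical coarse setting
err_fine = reconstruction_error(objects, n_bands=224)    # one parameter per bin
print(err_coarse, err_fine)
```

In this toy setting the coarse parameters cannot tell the two objects apart inside a sub-band, so each reconstructed object contains leakage from the other, while the fine setting separates them. The cost is eight times as many parameters per object, which is the kind of overhead the summary says the SSAE is used to compress.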
DOI:10.1007/s11063-021-10659-8