Dual-Branch Codec With Orthogonality Constraint and Knowledge Distillation for Noisy Environment

Bibliographic details
Published in: IEEE Signal Processing Letters, Vol. 32, pp. 3017-3021
Main authors: Han, Yi; Chen, Hang; Liu, Lijuan; Du, Jun
Format: Journal Article
Language: English
Published: New York, IEEE, 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1070-9908, 1558-2361
Description
Summary: Audio codecs, by discretizing continuous audio signals into finite token sets, achieve high-quality reconstruction at low bitrates in clean environments. However, real-world speech often deviates from ideal conditions, particularly in noisy environments with low signal-to-noise ratios (SNRs), which limits the ability of existing codecs to restore clean audio from noisy inputs. To address this challenge, this paper introduces a Dual-Branch Codec (DB-Codec). Leveraging the hierarchical decomposition capability of residual vector quantization (RVQ), the authors separate noise and speech into codebooks at different layers through dual-branch reconstruction and orthogonality constraints between noise and speech features. DB-Codec integrates enhancement and synthesis into a unified model, enabling flexible control over noise suppression or signal recovery at compression rates equivalent to those of conventional codecs. Experiments demonstrate that DB-Codec achieves an average improvement of 0.83 in PESQ and 9.16 in STOI compared to traditional codecs under low SNR conditions.
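For readers unfamiliar with the two ingredients named in the summary, the following is a minimal sketch of generic residual vector quantization together with one possible orthogonality penalty (squared cosine similarity). It is not the authors' implementation. The random codebooks, the assignment of the first two layers to "speech" and the last layer to "noise", and all function names are assumptions made purely for illustration; the record does not state how DB-Codec assigns layers or defines its constraint.

```python
import random

def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def dist(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def rvq_encode(vec, codebooks):
    """Residual VQ: each layer quantizes what the earlier layers left over."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices

def rvq_decode(indices, codebooks, layers=None):
    """Sum the selected entries from the chosen layers (all layers by default)."""
    dim = len(codebooks[0][0])
    layers = range(len(indices)) if layers is None else layers
    out = [0.0] * dim
    for layer in layers:
        entry = codebooks[layer][indices[layer]]
        out = [o + e for o, e in zip(out, entry)]
    return out

def orthogonality_penalty(a, b, eps=1e-8):
    """Squared cosine similarity: 0 when a and b are orthogonal (assumed form)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return (dot / (norm_a * norm_b + eps)) ** 2

# Toy setup: random codebooks stand in for learned ones (assumption).
rng = random.Random(0)
dim, num_layers, codebook_size = 4, 3, 8
codebooks = [[[rng.gauss(0, 1.0 / (layer + 1)) for _ in range(dim)]
              for _ in range(codebook_size)]
             for layer in range(num_layers)]
frame = [rng.gauss(0, 1) for _ in range(dim)]

indices = rvq_encode(frame, codebooks)
# Hypothetical split: layers 0-1 stand in for "speech", layer 2 for "noise".
speech_part = rvq_decode(indices, codebooks, layers=[0, 1])
noise_part = rvq_decode(indices, codebooks, layers=[2])
full = rvq_decode(indices, codebooks)
penalty = orthogonality_penalty(speech_part, noise_part)
```

Under this toy split, decoding only the "speech" layers corresponds to suppressing noise, while decoding all layers recovers the full quantized signal. In a trained model, a penalty of this kind would be added to the loss so that the two feature groups are pushed toward orthogonality.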
DOI: 10.1109/LSP.2025.3591726