Dual-Branch Codec With Orthogonality Constraint and Knowledge Distillation for Noisy Environment

Bibliographic details
Published in: IEEE Signal Processing Letters, vol. 32, pp. 3017-3021
Main authors: Han, Yi; Chen, Hang; Liu, Lijuan; Du, Jun
Format: Journal Article
Language: English
Published: New York: IEEE, 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN:1070-9908, 1558-2361
Online access: Full text
Description
Abstract: Audio codecs, by discretizing continuous audio signals into finite token sets, achieve high-quality reconstruction at low bitrates in clean environments. However, real-world speech often deviates from ideal conditions, particularly in noisy environments with low signal-to-noise ratios (SNRs), which limits the performance of existing codecs in restoring clean audio from noisy inputs. To address this challenge, this paper introduces a Dual-Branch Codec (DB-Codec). Leveraging the hierarchical decomposition capability of residual vector quantization (RVQ), we separate noise and speech into codebooks at different layers through dual-branch reconstruction and orthogonality constraints between noise and speech features. DB-Codec integrates enhancement and synthesis into a unified model, enabling flexible control over noise suppression or signal recovery at compression rates equivalent to those of conventional codecs. Experiments demonstrate that DB-Codec achieves an average improvement of 0.83 in PESQ and 9.16 in STOI compared to traditional codecs under low SNR conditions.
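The abstract names two ingredients, residual vector quantization and an orthogonality constraint between noise and speech features, without giving their formulation. The following is a minimal NumPy sketch of both ideas, not the authors' implementation. Everything in it is an assumption made for illustration: the random codebooks, the function names `rvq_encode` and `orthogonality_penalty`, the squared-cosine form of the penalty, and the assignment of layers to the two branches.

```python
import numpy as np


def rvq_encode(x, codebooks):
    """Residual vector quantization: each layer quantizes what the previous layers left over.

    x: (frames, dim) array; codebooks: list of (codebook_size, dim) arrays.
    Returns the per-layer code indices and the per-layer quantized vectors.
    """
    residual = x.copy()
    indices, layers = [], []
    for cb in codebooks:
        # Squared Euclidean distance from every frame to every codeword.
        dists = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(axis=-1)
        idx = dists.argmin(axis=1)
        quantized = cb[idx]
        indices.append(idx)
        layers.append(quantized)
        residual = residual - quantized
    return indices, layers


def orthogonality_penalty(a, b, eps=1e-8):
    """Mean squared cosine similarity between paired frames (assumed loss form).

    The value is 0 when every pair of frames is orthogonal and close to 1
    when every pair is collinear.
    """
    num = (a * b).sum(axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + eps
    return float(np.mean((num / den) ** 2))


rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 8))                          # stand-in encoder features
codebooks = [rng.normal(size=(16, 8)) for _ in range(4)]   # untrained, random codebooks
indices, layers = rvq_encode(frames, codebooks)

# Hypothetical split of the RVQ layers into two branches; which layers carry
# speech and which carry noise is not stated in the abstract.
branch_a = layers[0] + layers[1]
branch_b = layers[2] + layers[3]
penalty = orthogonality_penalty(branch_a, branch_b)
```

In a trained model, a penalty of this kind would be added to the reconstruction losses so that the features assigned to the two branches are pushed toward orthogonality. With the random codebooks used here, the value only illustrates the computation.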
DOI:10.1109/LSP.2025.3591726