Separating time-frequency sources from time-domain convolutive mixtures using non-negative matrix factorization

Bibliographic details
Published in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics: proceedings, pp. 264-268
Main authors: Leglaive, Simon; Badeau, Roland; Richard, Gael
Format: Conference paper
Language: English
Published: IEEE, 1 October 2017
ISSN: 1947-1629
Description
Abstract: This paper addresses the problem of under-determined audio source separation in multichannel reverberant mixtures. We target a semiblind scenario, assuming that the mixing filters are known. Source separation is performed from the time-domain mixture signals in order to accurately model the convolutive mixing process. The source signals are, however, modeled as latent variables in a time-frequency domain. In a previous paper, we proposed using the modified discrete cosine transform. The present paper generalizes the method to the odd-frequency short-time Fourier transform. In this domain, the source coefficients are modeled as centered complex Gaussian random variables whose variances are structured by means of a non-negative matrix factorization model. The inference procedure relies on a variational expectation-maximization algorithm. In the experiments, we discuss the choice of the source representation and show that the proposed approach outperforms two methods from the literature.
DOI: 10.1109/WASPAA.2017.8170036
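
As a rough illustration of the two modeling ingredients named in the abstract, the following sketch draws time-frequency coefficients from a centered complex Gaussian whose variances follow a non-negative matrix factorization, and forms a time-domain convolutive mixture with known filters. It is not the authors' implementation: the function names, array shapes, and random test data are assumptions made for this example, and both the inverse odd-frequency short-time Fourier transform linking the two steps and the variational expectation-maximization inference are omitted.

```python
import numpy as np


def sample_nmf_gaussian_source(W, H, rng):
    """Draw coefficients s[f, n] ~ N_c(0, V[f, n]) with V = W @ H.

    W (F x K) and H (K x N) are non-negative, so every variance in V
    is non-negative. Names and shapes are assumptions for this sketch.
    """
    V = W @ H
    # Circular complex Gaussian: real and imaginary parts are
    # independent, each with variance V / 2.
    noise = rng.normal(size=V.shape) + 1j * rng.normal(size=V.shape)
    return np.sqrt(V / 2.0) * noise, V


def convolutive_mix(sources, filters):
    """Time-domain convolutive mixture x_i = sum_j a_ij * s_j.

    sources: (J, T) source signals; filters: (I, J, L) mixing filters.
    Returns an array of shape (I, T + L - 1).
    """
    n_chan, n_src, filt_len = filters.shape
    n_samples = sources.shape[1]
    x = np.zeros((n_chan, n_samples + filt_len - 1))
    for i in range(n_chan):
        for j in range(n_src):
            x[i] += np.convolve(filters[i, j], sources[j])
    return x


rng = np.random.default_rng(0)

# Hypothetical sizes: 6 frequency bins, 8 frames, NMF rank 2.
W = rng.random((6, 2))
H = rng.random((2, 8))
coeffs, variances = sample_nmf_gaussian_source(W, H, rng)

# Under-determined case: 3 sources observed on 2 channels.
# Placeholder time-domain sources stand in for the inverse transform.
time_sources = rng.normal(size=(3, 100))
mixing_filters = rng.normal(size=(2, 3, 16))
mixture = convolutive_mix(time_sources, mixing_filters)
```

The mixture has fewer channels than sources, which is the under-determined setting the abstract refers to; the semiblind assumption corresponds to treating `mixing_filters` as known.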