Separating time-frequency sources from time-domain convolutive mixtures using non-negative matrix factorization

Bibliographic details
Published in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics : proceedings, pp. 264-268
Main authors: Leglaive, Simon; Badeau, Roland; Richard, Gael
Format: Conference paper
Language: English
Published: IEEE, 01.10.2017
ISSN: 1947-1629
Description
Summary: This paper addresses the problem of under-determined audio source separation in multichannel reverberant mixtures. We target a semiblind scenario assuming that the mixing filters are known. Source separation is performed from the time-domain mixture signals in order to accurately model the convolutive mixing process. The source signals are, however, modeled as latent variables in a time-frequency domain. In a previous paper, we proposed to use the modified discrete cosine transform. The present paper generalizes the method to the use of the odd-frequency short-time Fourier transform. In this domain, the source coefficients are modeled as centered complex Gaussian random variables whose variances are structured by means of a non-negative matrix factorization model. The inference procedure relies on a variational expectation-maximization algorithm. In the experiments, we discuss the choice of the source representation and show that the proposed approach outperforms two methods from the literature.
DOI: 10.1109/WASPAA.2017.8170036
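
Illustration: the summary names two modeling ingredients that a few lines of NumPy can make concrete. These are centered complex Gaussian source coefficients whose variances follow a non-negative matrix factorization, and a time-domain convolutive mixture with known filters. The sketch below is not the authors' implementation. All sizes, the names `W`, `H`, `V`, `S` and the helper `convolutive_mix` are assumptions made for this example, and the odd-frequency short-time Fourier transform and the variational expectation-maximization inference are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: frequency bins, time frames, NMF components.
n_freq, n_frames, n_comp = 6, 8, 2

# Non-negative factors; their product gives one variance per coefficient.
W = rng.random((n_freq, n_comp))    # spectral templates
H = rng.random((n_comp, n_frames))  # temporal activations
V = W @ H                           # variances, shape (n_freq, n_frames)

# Centered circular complex Gaussian draw with variance V[f, n]:
# the real and imaginary parts each carry half of the variance.
noise = rng.standard_normal(V.shape) + 1j * rng.standard_normal(V.shape)
S = np.sqrt(V / 2) * noise


def convolutive_mix(sources, filters):
    """Mix time-domain sources into channels using known filters.

    sources: array of shape (n_src, n_samples)
    filters: array of shape (n_ch, n_src, n_taps)
    returns: array of shape (n_ch, n_samples + n_taps - 1)
    """
    n_ch, n_src, n_taps = filters.shape
    n_samples = sources.shape[1]
    mix = np.zeros((n_ch, n_samples + n_taps - 1))
    for i in range(n_ch):
        for j in range(n_src):
            # Each channel is the sum of every source convolved with its filter.
            mix[i] += np.convolve(filters[i, j], sources[j])
    return mix


# Under-determined toy case: 3 sources observed on 2 channels.
sources = rng.standard_normal((3, 100))
filters = rng.standard_normal((2, 3, 16))
x = convolutive_mix(sources, filters)
```

In this toy setup the sources are random stand-ins. In the setting described in the summary, the time-domain sources would instead be obtained from time-frequency coefficients such as `S` through the inverse transform.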