Semi-supervised Multichannel Speech Enhancement with Variational Autoencoders and Non-negative Matrix Factorization

Detailed bibliography
Published in: Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 101–105
Main authors: Leglaive, Simon; Girin, Laurent; Horaud, Radu
Format: Conference paper
Language: English
Published: IEEE, 1 May 2019
ISSN: 2379-190X
Description
Summary: In this paper we address speaker-independent multichannel speech enhancement in unknown noisy environments. Our work is based on a well-established multichannel local Gaussian modeling framework. We propose to use a neural network for modeling the speech spectro-temporal content. The parameters of this supervised model are learned using the framework of variational autoencoders. The noisy recording environment is supposed to be unknown, so the noise spectro-temporal modeling remains unsupervised and is based on non-negative matrix factorization (NMF). We develop a Monte Carlo expectation-maximization algorithm and we experimentally show that the proposed approach outperforms its NMF-based counterpart, where speech is modeled using supervised NMF.
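
For readers unfamiliar with the modeling framework in the summary, the Python sketch below illustrates, in a deliberately simplified single-channel form, how a VAE-decoded speech variance and an NMF noise variance can be combined in a local Gaussian model and used for Wiener-like filtering. All names, shapes, and the randomly initialized "decoder" are hypothetical placeholders; the paper's actual method is multichannel and estimates the parameters with a Monte Carlo expectation-maximization algorithm, which is not reproduced here.

    import numpy as np

    # Illustrative sketch only (not the authors' code).
    rng = np.random.default_rng(0)
    F, T, K, L = 257, 100, 10, 16       # frequency bins, frames, NMF rank, latent dimension

    # Placeholder noisy STFT; in practice this would come from an STFT of the recording.
    X = rng.normal(size=(F, T)) + 1j * rng.normal(size=(F, T))

    def speech_variance(z, W1, W2):
        """Stand-in for a trained VAE decoder: latent codes z -> speech variance per time-frequency bin."""
        h = np.tanh(W1 @ z)             # hidden layer, shape (64, T)
        return np.exp(W2 @ h)           # positive variances, shape (F, T)

    # Randomly initialized weights stand in for a pretrained speech model.
    W1 = rng.normal(scale=0.1, size=(64, L))
    W2 = rng.normal(scale=0.1, size=(F, 64))
    z = rng.normal(size=(L, T))         # latent codes (inferred in the paper, fixed here for brevity)

    v_s = speech_variance(z, W1, W2)    # supervised speech variance model (VAE)
    W_nmf = np.abs(rng.normal(size=(F, K)))
    H_nmf = np.abs(rng.normal(size=(K, T)))
    v_n = W_nmf @ H_nmf                 # unsupervised noise variance model (NMF)

    # Local Gaussian model: each noisy bin X[f, t] has variance v_s + v_n.
    # A Wiener-like gain then gives a speech estimate.
    gain = v_s / (v_s + v_n)
    S_hat = gain * X

In the paper itself the speech and noise models enter a multichannel mixture model, and the latent codes, NMF factors, and spatial parameters are estimated jointly rather than fixed as above.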
DOI: 10.1109/ICASSP.2019.8683704