Blind Speech Extraction Based on Rank-Constrained Spatial Covariance Matrix Estimation With Multivariate Generalized Gaussian Distribution

In this article, we propose a new blind speech extraction (BSE) method that robustly extracts a directional speech from background diffuse noise by combining independent low-rank matrix analysis (ILRMA) and efficient rank-constrained spatial covariance matrix (SCM) estimation. To achieve more accura...

Full description

Saved in:
Bibliographic Details
Published in:IEEE/ACM transactions on audio, speech, and language processing Vol. 28; pp. 1948 - 1963
Main Authors: Kubo, Yuki, Takamune, Norihiro, Kitamura, Daichi, Saruwatari, Hiroshi
Format: Journal Article
Language:English
Published: Piscataway IEEE 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN:2329-9290, 2329-9304
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In this article, we propose a new blind speech extraction (BSE) method that robustly extracts a directional speech from background diffuse noise by combining independent low-rank matrix analysis (ILRMA) and efficient rank-constrained spatial covariance matrix (SCM) estimation. To achieve more accurate BSE than ILRMA, which assumes each source to be a point source (rank-1 spatial model), the proposed method restores the lost spatial basis for the full-rank SCM of diffuse noise. We adopt the multivariate complex generalized Gaussian distribution (GGD) as the statistical generative model to express various types of observed signal. To estimate the model parameters for an arbitrary shape parameter of the multivariate GGD, we derive a new inequality for rank-constrained SCMs. Also, we propose new acceleration methods to accomplish much faster extraction than conventional blind source separation methods. In BSE experiments using simulated and real recorded data, we confirm that the proposed method achieves more accurate and faster speech extraction than conventional methods.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:2329-9290
2329-9304
DOI:10.1109/TASLP.2020.3003165