Speech Enhancement with Variational Autoencoders and Alpha-stable Distributions

This paper focuses on single-channel semi-supervised speech enhancement. We learn a speaker-independent deep generative speech model using the framework of variational autoencoders. The noise model remains unsupervised because we do not assume prior knowledge of the noisy recording environment. In t...

Bibliographic details
Published in: Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998), pp. 541-545
Main authors: Leglaive, Simon; Simsekli, Umut; Liutkus, Antoine; Girin, Laurent; Horaud, Radu
Format: Conference paper
Language: English
Published: IEEE, 1 May 2019
ISSN:2379-190X
Abstract This paper focuses on single-channel semi-supervised speech enhancement. We learn a speaker-independent deep generative speech model using the framework of variational autoencoders. The noise model remains unsupervised because we do not assume prior knowledge of the noisy recording environment. In this context, our contribution is to propose a noise model based on alpha-stable distributions, instead of the more conventional Gaussian non-negative matrix factorization approach found in previous studies. We develop a Monte Carlo expectation-maximization algorithm for estimating the model parameters at test time. Experimental results show the superiority of the proposed approach both in terms of perceptual quality and intelligibility of the enhanced speech signal.
Author Liutkus, Antoine
Horaud, Radu
Leglaive, Simon
Simsekli, Umut
Girin, Laurent
Author_xml – sequence: 1
  givenname: Simon
  surname: Leglaive
  fullname: Leglaive, Simon
  organization: Inria Grenoble Rhône-Alpes, France
– sequence: 2
  givenname: Umut
  surname: Simsekli
  fullname: Simsekli, Umut
  organization: LTCI, Télécom ParisTech, Université Paris-Saclay, France
– sequence: 3
  givenname: Antoine
  surname: Liutkus
  fullname: Liutkus, Antoine
  organization: Inria and LIRMM, France
– sequence: 4
  givenname: Laurent
  surname: Girin
  fullname: Girin, Laurent
  organization: Inria Grenoble Rhône-Alpes, France
– sequence: 5
  givenname: Radu
  surname: Horaud
  fullname: Horaud, Radu
  organization: Inria Grenoble Rhône-Alpes, France
ContentType Conference Proceeding
DOI 10.1109/ICASSP.2019.8682546
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
Discipline Engineering
EISBN 9781479981311
1479981311
EISSN 2379-190X
EndPage 545
ExternalDocumentID 8682546
Genre orig-research
ISICitedReferencesCount 27
IngestDate Wed Aug 27 05:56:09 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Language English
OpenAccessLink https://inria.hal.science/hal-02005106
PageCount 5
ParticipantIDs ieee_primary_8682546
PublicationCentury 2000
PublicationDate 2019-May
PublicationDateYYYYMMDD 2019-05-01
PublicationDate_xml – month: 05
  year: 2019
  text: 2019-May
PublicationDecade 2010
PublicationTitle Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998)
PublicationTitleAbbrev ICASSP
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 541
SubjectTerms alpha-stable distribution
Context modeling
Deep learning
Expectation-maximization algorithms
Monte Carlo expectation-maximization
Monte Carlo methods
Noise measurement
Speech enhancement
Time-frequency analysis
variational autoencoders
Title Speech Enhancement with Variational Autoencoders and Alpha-stable Distributions
URI https://ieeexplore.ieee.org/document/8682546