Speech Enhancement with Variational Autoencoders and Alpha-stable Distributions
This paper focuses on single-channel semi-supervised speech enhancement. We learn a speaker-independent deep generative speech model using the framework of variational autoencoders. The noise model remains unsupervised because we do not assume prior knowledge of the noisy recording environment. In t...
| Published in: | Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998), pp. 541-545 |
|---|---|
| Main authors: | Leglaive, Simon; Simsekli, Umut; Liutkus, Antoine; Girin, Laurent; Horaud, Radu |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 01.05.2019 |
| ISSN: | 2379-190X |
| Online access: | Get full text |
| Abstract | This paper focuses on single-channel semi-supervised speech enhancement. We learn a speaker-independent deep generative speech model using the framework of variational autoencoders. The noise model remains unsupervised because we do not assume prior knowledge of the noisy recording environment. In this context, our contribution is to propose a noise model based on alpha-stable distributions, instead of the more conventional Gaussian non-negative matrix factorization approach found in previous studies. We develop a Monte Carlo expectation-maximization algorithm for estimating the model parameters at test time. Experimental results show the superiority of the proposed approach both in terms of perceptual quality and intelligibility of the enhanced speech signal. |
|---|---|
| Author | Liutkus, Antoine; Horaud, Radu; Leglaive, Simon; Simsekli, Umut; Girin, Laurent |
| Author_xml | 1. Leglaive, Simon (Inria Grenoble Rhône-Alpes, France); 2. Simsekli, Umut (LTCI, Télécom ParisTech, Université Paris-Saclay, France); 3. Liutkus, Antoine (Inria and LIRMM, France); 4. Girin, Laurent (Inria Grenoble Rhône-Alpes, France); 5. Horaud, Radu (Inria Grenoble Rhône-Alpes, France) |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ICASSP.2019.8682546 |
| Discipline | Engineering |
| EISBN | 9781479981311 1479981311 |
| EISSN | 2379-190X |
| EndPage | 545 |
| ExternalDocumentID | 8682546 |
| Genre | orig-research |
| ISICitedReferencesCount | 27 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| OpenAccessLink | https://inria.hal.science/hal-02005106 |
| PageCount | 5 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-May |
| PublicationDateYYYYMMDD | 2019-05-01 |
| PublicationDate_xml | month: 05; year: 2019; text: 2019-May |
| PublicationDecade | 2010 |
| PublicationTitle | Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) |
| PublicationTitleAbbrev | ICASSP |
| PublicationYear | 2019 |
| Publisher | IEEE |
| Publisher_xml | name: IEEE |
| StartPage | 541 |
| SubjectTerms | alpha-stable distribution; Context modeling; Deep learning; Expectation-maximization algorithms; Monte Carlo expectation-maximization; Monte Carlo methods; Noise measurement; Speech enhancement; Time-frequency analysis; variational autoencoders |
| Title | Speech Enhancement with Variational Autoencoders and Alpha-stable Distributions |
| URI | https://ieeexplore.ieee.org/document/8682546 |
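
The abstract contrasts an alpha-stable noise model with the more conventional Gaussian one. As a rough illustration of what that distribution family looks like, and not the authors' implementation, the sketch below draws samples from a symmetric alpha-stable law using the Chambers-Mallows-Stuck method. The function name `sample_symmetric_alpha_stable` and its parameters are hypothetical, and the paper's actual noise model, which operates on time-frequency coefficients, is not reproduced here.

```python
import math
import random


def sample_symmetric_alpha_stable(alpha, scale=1.0, rng=random):
    """Draw one sample from a symmetric alpha-stable law.

    Uses the Chambers-Mallows-Stuck construction. alpha=2 gives a
    Gaussian with variance 2 * scale**2; alpha=1 gives a Cauchy law.
    Illustrative helper only; the name and signature are assumptions.
    """
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    # U is uniform on [-pi/2, pi/2); W is exponential with unit mean.
    u = (rng.random() - 0.5) * math.pi
    w = rng.expovariate(1.0)
    if alpha == 1.0:
        # Special case: the general formula reduces to tan(U).
        return scale * math.tan(u)
    x = (
        math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
        * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha)
    )
    return scale * x


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (2.0, 1.5, 1.0):
        draws = [sample_symmetric_alpha_stable(alpha, rng=rng) for _ in range(10000)]
        # Smaller alpha means heavier tails, so extreme values grow.
        print(alpha, max(abs(d) for d in draws))
```

Lower values of `alpha` produce heavier tails, so occasional very large samples become more likely than under a Gaussian model.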