Audio-Visual Speech Enhancement Using Conditional Variational Auto-Encoders
| Published in: | IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 1788-1800 |
|---|---|
| Main Authors: | Sadeghi, Mostafa; Leglaive, Simon; Alameda-Pineda, Xavier; Girin, Laurent; Horaud, Radu |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: IEEE, 01.01.2020 |
| Keywords: | Speech enhancement; Training; Visualization; Lips; Noise measurement; Computer architecture |
| ISSN: | 2329-9290, 2329-9304 |
| Online Access: | Full text |
| Abstract | Variational auto-encoders (VAEs) are deep generative latent variable models that can be used for learning the distribution of complex data. VAEs have been successfully used to learn a probabilistic prior over speech signals, which is then used to perform speech enhancement. One advantage of this generative approach is that it does not require pairs of clean and noisy speech signals at training. In this article, we propose audio-visual variants of VAEs for single-channel and speaker-independent speech enhancement. We develop a conditional VAE (CVAE) where the audio speech generative process is conditioned on visual information of the lip region. At test time, the audio-visual speech generative model is combined with a noise model based on nonnegative matrix factorization, and speech enhancement relies on a Monte Carlo expectation-maximization algorithm. Experiments are conducted with the recently published NTCD-TIMIT dataset as well as the GRID corpus. The results confirm that the proposed audio-visual CVAE effectively fuses audio and visual information, and it improves the speech enhancement performance compared with the audio-only VAE model, especially when the speech signal is highly corrupted by noise. We also show that the proposed unsupervised audio-visual speech enhancement approach outperforms a state-of-the-art supervised deep learning method. |
|---|---|
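The core idea in the abstract, a conditional VAE in which the speech generative process is conditioned on a visual embedding of the lip region, can be illustrated with a toy numpy sketch. This is a hypothetical illustration only: the dimensions, the random untrained "weights", and the single-frame setting are made up for clarity and do not reproduce the paper's architecture. It shows the two conditioning points (encoder and decoder both see the visual vector v) and the reparameterization trick used to sample the latent variable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): STFT bins, latent size, visual-embedding size.
F, Z, V = 513, 32, 64

# Random matrices stand in for the trained encoder/decoder networks.
W_enc = rng.standard_normal((2 * Z, F + V)) * 0.01
W_dec = rng.standard_normal((F, Z + V)) * 0.01

def encode(s_pow, v):
    """q(z | s, v): Gaussian posterior whose parameters depend on the
    speech power spectrum s_pow AND the visual embedding v."""
    h = W_enc @ np.concatenate([np.log(s_pow + 1e-8), v])
    mu, logvar = h[:Z], h[Z:]
    return mu, logvar

def decode(z, v):
    """p(s | z, v): zero-mean complex Gaussian on the STFT coefficients;
    its variance (a power spectrum) is conditioned on both z and v."""
    return np.exp(W_dec @ np.concatenate([z, v]))  # exp keeps it positive

# One reparameterized sample, as used when training with gradient descent.
s_pow = rng.random(F) + 1e-3           # fake clean-speech power frame
v = rng.standard_normal(V)             # fake lip-region embedding
mu, logvar = encode(s_pow, v)
eps = rng.standard_normal(Z)
z = mu + np.exp(0.5 * logvar) * eps    # z = mu + sigma * eps
var_s = decode(z, v)                   # modeled speech variance, shape (F,)
```

At test time this learned speech prior is combined with a separately estimated noise model; the sketch only covers the generative/inference pass.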
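The noise model mentioned in the abstract is a nonnegative matrix factorization (NMF) of the noise power spectrogram, typically fitted under the Itakura-Saito divergence in this literature. Below is a minimal sketch with the standard multiplicative updates, on fake data with made-up dimensions; it is an assumption-laden toy, not the paper's implementation.

```python
import numpy as np

def is_div(V, Lam):
    """Itakura-Saito divergence between power spectrogram V and model Lam."""
    R = V / Lam
    return float(np.sum(R - np.log(R) - 1.0))

def nmf_is(V, K=8, n_iter=50, seed=1):
    """IS-NMF, V ~= W @ H with W, H >= 0, via multiplicative updates."""
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, K)) + 0.1   # strictly positive init keeps updates safe
    H = rng.random((K, N)) + 0.1
    hist = [is_div(V, W @ H)]
    for _ in range(n_iter):
        Lam = W @ H
        H *= (W.T @ (V * Lam**-2)) / (W.T @ Lam**-1)
        Lam = W @ H
        W *= ((V * Lam**-2) @ H.T) / (Lam**-1 @ H.T)
        hist.append(is_div(V, W @ H))
    return W, H, hist

# Fake noise power spectrogram: 64 frequency bins, 100 frames.
rng = np.random.default_rng(1)
V = rng.random((64, 100)) + 1e-3
W, H, hist = nmf_is(V, K=8)
```

In the enhancement algorithm described by the abstract, the NMF parameters are re-estimated on the noisy observation inside a Monte Carlo EM loop, with the speech contribution modeled by the CVAE prior; the sketch above fits the factorization in isolation.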
| Author | Horaud, Radu; Leglaive, Simon; Alameda-Pineda, Xavier; Sadeghi, Mostafa; Girin, Laurent |
| Author_xml | 1. Mostafa Sadeghi (ORCID: 0000-0002-0272-8017; mostafa.sadeghi@inria.fr; Inria Grenoble Rhône-Alpes, Montbonnot Saint-Martin, France) 2. Simon Leglaive (ORCID: 0000-0002-8219-1298; simon.leglaive@centralesupelec.fr; Inria Grenoble Rhône-Alpes, Montbonnot Saint-Martin, France) 3. Xavier Alameda-Pineda (ORCID: 0000-0002-5354-1084; xavier.alameda-pineda@inria.fr; Inria Grenoble Rhône-Alpes, Montbonnot Saint-Martin, France) 4. Laurent Girin (ORCID: 0000-0002-9214-8760; laurent.girin@grenoble-inp.fr; Inria Grenoble Rhône-Alpes, Montbonnot Saint-Martin, France) 5. Radu Horaud (ORCID: 0000-0001-5232-024X; radu.horaud@inria.fr; Inria Grenoble Rhône-Alpes, Montbonnot Saint-Martin, France) |
| BackLink | https://inria.hal.science/hal-02364900 (View record in HAL) |
| CODEN | ITASD8 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 Distributed under a Creative Commons Attribution 4.0 International License |
| DOI | 10.1109/TASLP.2020.3000593 |
| Discipline | Engineering; Computer Science |
| EISSN | 2329-9304 |
| EndPage | 1800 |
| ExternalDocumentID | oai:HAL:hal-02364900v3 10_1109_TASLP_2020_3000593 9110765 |
| Genre | orig-research |
| GrantInformation | Multidisciplinary Institute in Artificial Intelligence |
| ISICitedReferencesCount | 45 |
| ISSN | 2329-9290 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Training; Speech enhancement; Visualization; Lips; Noise measurement; Computer architecture |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0 |
| ORCID | 0000-0002-0272-8017 0000-0001-5232-024X 0000-0002-9214-8760 0000-0002-8219-1298 0000-0002-5354-1084 |
| OpenAccessLink | https://inria.hal.science/hal-02364900 |
| PQID | 2417418335 |
| PQPubID | 85426 |
| PageCount | 13 |
| PublicationDate | 2020-01-01 |
| PublicationPlace | Piscataway |
| PublicationTitle | IEEE/ACM transactions on audio, speech, and language processing |
| PublicationTitleAbbrev | TASLP |
| PublicationYear | 2020 |
| Publisher | IEEE (The Institute of Electrical and Electronics Engineers, Inc.) |
| StartPage | 1788 |
| SubjectTerms | Algorithms; Audio data; Audio equipment; Audio-visual speech enhancement; Coders; Computer architecture; Computer Science; Computer simulation; Computer Vision and Pattern Recognition; deep generative models; Lips; Machine Learning; Monte Carlo expectation-maximization; Noise measurement; nonnegative matrix factorization; Signal and Image Processing; Sound; Speech; Speech enhancement; Speech processing; Training; variational auto-encoders; Visual signals; Visualization |
| Title | Audio-Visual Speech Enhancement Using Conditional Variational Auto-Encoders |
| URI | https://ieeexplore.ieee.org/document/9110765 ; https://www.proquest.com/docview/2417418335 ; https://inria.hal.science/hal-02364900 |
| Volume | 28 |