A Multiscale Autoencoder (MSAE) Framework for End-to-End Neural Network Speech Enhancement
Neural network approaches to single-channel speech enhancement have received much recent attention. In particular, mask-based architectures have achieved significant performance improvements over conventional methods. This paper proposes a multiscale autoencoder (MSAE) for mask-based end-to-end neur...
| Published in: | IEEE/ACM transactions on audio, speech, and language processing, Vol. 32, pp. 2418-2431 |
|---|---|
| Main authors: | Borgstrom, Bengt J.; Brandstein, Michael S. |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2024 |
| Subjects: | |
| ISSN: | 2329-9290, 2329-9304 |
| Online access: | Get full text |
| Abstract | Neural network approaches to single-channel speech enhancement have received much recent attention. In particular, mask-based architectures have achieved significant performance improvements over conventional methods. This paper proposes a multiscale autoencoder (MSAE) for mask-based end-to-end neural network speech enhancement. The MSAE performs spectral decomposition of an input waveform within separate band-limited branches, each operating with a different rate and scale, to extract a sequence of multiscale embeddings. The proposed framework features intuitive parameterization of the autoencoder, including a flexible spectral band design based on the Constant-Q transform. Additionally, the MSAE is constructed entirely of differentiable operators, allowing it to be implemented within an end-to-end neural network, and be discriminatively trained. The MSAE draws motivation both from recent multiscale network topologies and from traditional multiresolution transforms in speech processing. Experimental results show the MSAE to provide clear performance benefits relative to conventional single-branch autoencoders. Additionally, the proposed framework is shown to outperform a variety of state-of-the-art enhancement systems, both in terms of objective speech quality metrics and automatic speech recognition accuracy. |
|---|---|
| Author | Borgstrom, Bengt J.; Brandstein, Michael S. |
| Author_xml | – sequence: 1 givenname: Bengt J. orcidid: 0000-0001-8529-5378 surname: Borgstrom fullname: Borgstrom, Bengt J. email: jonas.borgstrom@ll.mit.edu organization: MIT Lincoln Laboratory, Artificial Intelligence Technology and Systems Group, Lexington, MA, USA – sequence: 2 givenname: Michael S. orcidid: 0009-0008-7883-3658 surname: Brandstein fullname: Brandstein, Michael S. email: msb@ll.mit.edu organization: MIT Lincoln Laboratory, Artificial Intelligence Technology and Systems Group, Lexington, MA, USA |
| CODEN | ITASFA |
| CitedBy_id | crossref_primary_10_1109_ACCESS_2024_3444596 crossref_primary_10_3390_app14114488 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
| Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
| DBID | 97E RIA RIE AAYXX CITATION 7SC 7T9 8FD JQ2 L7M L~C L~D |
| DOI | 10.1109/TASLP.2024.3389632 |
| DatabaseName | IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Linguistics and Language Behavior Abstracts (LLBA) Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic ProQuest Computer Science Collection Computer and Information Systems Abstracts Linguistics and Language Behavior Abstracts (LLBA) Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISSN | 2329-9304 |
| EndPage | 2431 |
| ExternalDocumentID | 10_1109_TASLP_2024_3389632 10504992 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: U.S. Department of Defense; Department of Defense grantid: FA8702-15-D-0001 funderid: 10.13039/100000005 |
| GroupedDBID | 0R~ 4.4 6IK 97E AAJGR AAKMM AALFJ AARMG AASAJ AAWTH AAWTV ABAZT ABQJQ ABVLG ACIWK ACM ADBCU AEBYY AEFXT AEJOY AENSD AFWIH AFWXC AGQYO AGSQL AHBIQ AIKLT AKJIK AKQYR AKRVB ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CCLIF EBS EJD GUFHI HGAVV IFIPE IPLJI JAVBF LHSKQ M43 OCL PQQKQ RIA RIE RNS ROL AAYXX CITATION 7SC 7T9 8FD JQ2 L7M L~C L~D |
| ID | FETCH-LOGICAL-c296t-2c759b3b299db2f20970bf150d5c1db832764d85b908e1ca6dd1f2c1faa94cf43 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 4 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001209533900004&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 2329-9290 |
| IngestDate | Mon Jun 30 04:03:19 EDT 2025 Sat Nov 29 02:44:51 EST 2025 Tue Nov 18 21:45:26 EST 2025 Wed Aug 27 02:06:31 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c296t-2c759b3b299db2f20970bf150d5c1db832764d85b908e1ca6dd1f2c1faa94cf43 |
| Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ORCID | 0000-0001-8529-5378 0009-0008-7883-3658 |
| PQID | 3046911046 |
| PQPubID | 85426 |
| PageCount | 14 |
| ParticipantIDs | crossref_primary_10_1109_TASLP_2024_3389632 crossref_citationtrail_10_1109_TASLP_2024_3389632 ieee_primary_10504992 proquest_journals_3046911046 |
| PublicationCentury | 2000 |
| PublicationDate | 20240000 2024-00-00 20240101 |
| PublicationDateYYYYMMDD | 2024-01-01 |
| PublicationDate_xml | – year: 2024 text: 20240000 |
| PublicationDecade | 2020 |
| PublicationPlace | Piscataway |
| PublicationPlace_xml | – name: Piscataway |
| PublicationTitle | IEEE/ACM transactions on audio, speech, and language processing |
| PublicationTitleAbbrev | TASLP |
| PublicationYear | 2024 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| SSID | ssj0001079974 |
| Score | 2.298118 |
| SourceID | proquest crossref ieee |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 2418 |
| SubjectTerms | Automatic speech recognition; Decoding; Ear; end-to-end neural networks; multiscale representations; multiresolution transforms; Network topologies; Neural networks; Parameterization; Signal resolution; Speech; Speech enhancement; Speech processing; Speech recognition; Time-frequency analysis; Transforms; Voice recognition; Waveforms |
| Title | A Multiscale Autoencoder (MSAE) Framework for End-to-End Neural Network Speech Enhancement |
| URI | https://ieeexplore.ieee.org/document/10504992 https://www.proquest.com/docview/3046911046 |
| Volume | 32 |
| WOSCitedRecordID | wos001209533900004&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVIEE databaseName: IEEE Electronic Library (IEL) customDbUrl: eissn: 2329-9304 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0001079974 issn: 2329-9290 databaseCode: RIE dateStart: 20140101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE |
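
The abstract describes a spectral band design based on the Constant-Q transform but does not give the paper's actual parameterization. As a rough illustration only, the sketch below computes textbook constant-Q bands: center frequencies spaced geometrically, each with a bandwidth proportional to its center. The function name `constant_q_bands` and its parameters `f_min`, `f_max`, and `bins_per_octave` are hypothetical and are not taken from the paper; this is the generic constant-Q definition, not the MSAE band layout.

```python
import math


def constant_q_bands(f_min, f_max, bins_per_octave):
    """Generic constant-Q band layout (illustrative, not the paper's design).

    Returns (q, bands), where q is the constant ratio center / bandwidth and
    bands is a list of (center_hz, bandwidth_hz) tuples covering f_min..f_max.
    """
    if not (0 < f_min < f_max) or bins_per_octave < 1:
        raise ValueError("need 0 < f_min < f_max and bins_per_octave >= 1")

    # Adjacent centers differ by a factor of 2 ** (1 / bins_per_octave),
    # so the spacing-based quality factor is the same for every band.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)

    # Number of geometrically spaced centers that fit in [f_min, f_max].
    # The small tolerance guards against floating-point rounding in log2.
    octaves = math.log2(f_max / f_min)
    n_bands = int(math.floor(bins_per_octave * octaves + 1e-9)) + 1

    bands = []
    for k in range(n_bands):
        center = f_min * 2.0 ** (k / bins_per_octave)
        bands.append((center, center / q))
    return q, bands


if __name__ == "__main__":
    q, bands = constant_q_bands(125.0, 8000.0, 1)
    for center, bandwidth in bands:
        print(f"center = {center:7.1f} Hz, bandwidth = {bandwidth:7.1f} Hz")
```

With one band per octave, each band is as wide as its center frequency, so low-frequency bands are narrow and high-frequency bands are wide. That is the sense in which branches of a multiscale model could plausibly operate at different rates and scales, though how the MSAE maps bands to branches is specified only in the full text.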