Deep Learning-Based Noise Reduction Approach to Improve Speech Intelligibility for Cochlear Implant Recipients
| Published in: | Ear and hearing Vol. 39; no. 4; p. 795 |
|---|---|
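| Main Authors: | Lai, Ying-Hui; Tsao, Yu; Lu, Xugang; Chen, Fei; Su, Yu-Ting; Chen, Kuang-Chao; Chen, Yu-Hsuan; Chen, Li-Ching; Po-Hung Li, Lieber; Lee, Chin-Hui |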
| Format: | Journal Article |
| Language: | English |
| Published: | United States, 01.07.2018 |
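| Subjects: | Adult; Child; Cochlear Implantation; Cochlear Implants; Deafness - rehabilitation; Deep Learning; Female; Humans; Male; Middle Aged; Noise; Signal-To-Noise Ratio; Speech Perception; Young Adult |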
| ISSN: | 1538-4667 |
| Abstract | OBJECTIVE: We investigate the clinical effectiveness of a novel deep learning-based noise reduction (NR) approach for Mandarin-speaking cochlear implant (CI) recipients under noisy conditions with challenging noise types at low signal-to-noise ratio (SNR) levels.
DESIGN: The deep learning-based NR approach used in this study consists of two modules, a noise classifier (NC) and a deep denoising autoencoder (DDAE), and is therefore termed NC + DDAE. In a series of experiments, we conduct qualitative and quantitative analyses of the NC module and of the overall NC + DDAE approach. We also evaluate the speech recognition performance of Mandarin-speaking CI recipients with the NC + DDAE approach and with classical single-microphone NR approaches under different noisy conditions. The test set contains Mandarin sentences corrupted by two types of maskers, two-talker babble noise and construction jackhammer noise, at SNR levels of 0 and 5 dB. Two conventional NR techniques and the proposed deep learning-based approach are used to process the noisy utterances. We compare the NR approaches qualitatively using amplitude envelope and spectrogram plots of the processed utterances. Quantitative evaluation comprises (1) the normalized covariance measure, an objective measure used to assess the intelligibility of the utterances processed by each NR approach; and (2) speech recognition tests with nine Mandarin-speaking CI recipients, who use their own clinical speech processors during testing.
RESULTS: The objective evaluation and the listening tests indicate that, under challenging listening conditions, the proposed NC + DDAE approach yields higher intelligibility scores than the two classical NR techniques under both matched and mismatched training-testing conditions.
CONCLUSIONS: Compared with the two well-known conventional NR techniques under challenging listening conditions, the proposed NC + DDAE approach suppresses noise more effectively and introduces less distortion of the key speech envelope information, thereby improving speech recognition more effectively for Mandarin-speaking CI recipients. The results suggest that the proposed deep learning-based NR approach can potentially be integrated into existing CI signal processors to overcome the degradation of speech perception caused by noise. |
|---|---|
| Author | Chen, Fei; Chen, Kuang-Chao; Lee, Chin-Hui; Lu, Xugang; Po-Hung Li, Lieber; Tsao, Yu; Su, Yu-Ting; Lai, Ying-Hui; Chen, Li-Ching; Chen, Yu-Hsuan |
| Author_xml | 1. Lai, Ying-Hui (Department of Biomedical Engineering, National Yang-Ming University, Taipei, Taiwan); 2. Tsao, Yu (Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan); 3. Lu, Xugang (National Institute of Information and Communications Technology, Japan); 4. Chen, Fei (Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China); 5. Su, Yu-Ting (Department of Mechatronic Engineering, National Taiwan Normal University, Taipei, Taiwan); 6. Chen, Kuang-Chao (Department of Otolaryngology, Cheng Hsin General Hospital, Taipei, Taiwan); 7. Chen, Yu-Hsuan (Department of Internal Medicine, Cheng Hsin General Hospital, Taipei, Taiwan); 8. Chen, Li-Ching (Department of Otolaryngology, Cheng Hsin General Hospital, Taipei, Taiwan); 9. Po-Hung Li, Lieber (Faculty of Medicine, School of Medicine, National Yang Ming University, Taipei, Taiwan); 10. Lee, Chin-Hui (School of Electrical and Computer Engineering, Georgia Institute of Technology, Georgia, USA) |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/29360687 (View this record in MEDLINE/PubMed) |
| ContentType | Journal Article |
| DOI | 10.1097/AUD.0000000000000537 |
| Discipline | Medicine |
| EISSN | 1538-4667 |
| Genre | Research Support, Non-U.S. Gov't; Journal Article |
| ISICitedReferencesCount | 58 |
| ISSN | 1538-4667 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 4 |
| Language | English |
| PMID | 29360687 |
| PublicationCentury | 2000 |
| PublicationDate | 2018-July/August |
| PublicationDateYYYYMMDD | 2018-07-01 |
| PublicationDate_xml | – month: 07 year: 2018 text: 2018-July/August |
| PublicationDecade | 2010 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States |
| PublicationTitle | Ear and hearing |
| PublicationTitleAlternate | Ear Hear |
| PublicationYear | 2018 |
| SourceID | proquest pubmed |
| SourceType | Aggregation Database Index Database |
| StartPage | 795 |
| SubjectTerms | Adult; Child; Cochlear Implantation; Cochlear Implants; Deafness - rehabilitation; Deep Learning; Female; Humans; Male; Middle Aged; Noise; Signal-To-Noise Ratio; Speech Perception; Young Adult |
| Title | Deep Learning-Based Noise Reduction Approach to Improve Speech Intelligibility for Cochlear Implant Recipients |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/29360687 https://www.proquest.com/docview/1990851522 |
| Volume | 39 |
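
The abstract states that the test sentences were corrupted by two maskers at SNR levels of 0 and 5 dB. As a rough illustration of what mixing at a target SNR involves, the Python sketch below scales a noise signal so that the speech-to-noise power ratio matches a requested value. Everything in it is an assumption made for illustration: the function names `mix_at_snr` and `measure_snr_db` are hypothetical, and the sine-wave "speech" and Gaussian "masker" are synthetic stand-ins for the Mandarin sentences, babble noise, and jackhammer noise used in the study. It is not the authors' stimulus-preparation procedure, which this record does not describe.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it.

    Hypothetical helper for illustration only; not taken from the article.
    Returns the noisy mixture and the scaled noise.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_power = mean_power(noise)
    if noise_power == 0.0:
        raise ValueError("noise must not be silent")
    # SNR_dB = 10 * log10(P_speech / P_noise), so the noise power we need is
    # P_speech / 10^(SNR_dB / 10).
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)
    scaled_noise = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


def measure_snr_db(speech, noise):
    """SNR in dB between a clean signal and the noise that was added to it."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


# Synthetic stand-ins (assumptions): a 220 Hz tone as "speech" and
# Gaussian noise as the "masker", one second at 16 kHz.
rng = random.Random(0)
speech = [math.sin(2.0 * math.pi * 220.0 * t / 16000.0) for t in range(16000)]
masker = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for target in (0.0, 5.0):  # the two SNR levels named in the abstract
    noisy, scaled = mix_at_snr(speech, masker, target)
    print(f"target {target:.0f} dB, measured {measure_snr_db(speech, scaled):.2f} dB")
```

Lowering the target SNR from 5 dB to 0 dB raises the noise power until it equals the speech power, which is why the abstract describes these conditions as challenging.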