Deep Learning-Based Noise Reduction Approach to Improve Speech Intelligibility for Cochlear Implant Recipients

Bibliographic Details
Published in: Ear and Hearing, Vol. 39, No. 4, p. 795
Main Authors: Lai, Ying-Hui; Tsao, Yu; Lu, Xugang; Chen, Fei; Su, Yu-Ting; Chen, Kuang-Chao; Chen, Yu-Hsuan; Chen, Li-Ching; Po-Hung Li, Lieber; Lee, Chin-Hui
Format: Journal Article
Language: English
Published: United States, 01.07.2018
Subjects:
ISSN: 1538-4667
Abstract We investigate the clinical effectiveness of a novel deep learning-based noise reduction (NR) approach under noisy conditions with challenging noise types at low signal-to-noise ratio (SNR) levels for Mandarin-speaking cochlear implant (CI) recipients. The deep learning-based NR approach used in this study consists of two modules, a noise classifier (NC) and a deep denoising autoencoder (DDAE), and is therefore termed NC + DDAE. In a series of comprehensive experiments, we conduct qualitative and quantitative analyses of the NC module and the overall NC + DDAE approach. Moreover, we evaluate the speech recognition performance of the NC + DDAE NR and classical single-microphone NR approaches for Mandarin-speaking CI recipients under different noisy conditions. The testing set contains Mandarin sentences corrupted by two types of maskers, two-talker babble noise and construction jackhammer noise, at 0 and 5 dB SNR levels. Two conventional NR techniques and the proposed deep learning-based approach are used to process the noisy utterances. We qualitatively compare the NR approaches using the amplitude envelope and spectrogram plots of the processed utterances. Quantitative objective measures include (1) the normalized covariance measure, to estimate the intelligibility of the utterances processed by each NR approach, and (2) speech recognition tests conducted with nine Mandarin-speaking CI recipients, who used their own clinical speech processors during testing. The results of the objective evaluation and the listening tests indicate that, under challenging listening conditions, the proposed NC + DDAE NR approach yields higher intelligibility scores than the two classical NR techniques, under both matched and mismatched training-testing conditions. Compared with the two well-known conventional NR techniques under challenging listening conditions, the proposed NC + DDAE NR approach has superior noise suppression capability and introduces less distortion to the key speech envelope information, thereby improving speech recognition more effectively for Mandarin-speaking CI recipients. The results suggest that the proposed deep learning-based NR approach could be integrated into existing CI signal processors to overcome the degradation of speech perception caused by noise.
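The two-module design described in the abstract, a noise classifier that selects a noise-specific deep denoising autoencoder, can be pictured with a short sketch. The PyTorch outline below is illustrative only and is not the authors' implementation: the 257-bin spectral feature size, layer widths, two-class noise set, and the names NoiseClassifier, DDAE, and enhance are all assumptions chosen for clarity.

```python
# Minimal sketch of an NC + DDAE style pipeline; sizes and names are illustrative
# assumptions, not the published system.
import torch
import torch.nn as nn


class NoiseClassifier(nn.Module):
    """Frame-level classifier that guesses which noise type corrupts the input."""
    def __init__(self, n_bins: int = 257, n_noise_types: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_bins, 128),
            nn.ReLU(),
            nn.Linear(128, n_noise_types),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (n_frames, n_bins) spectral features of one utterance
        return self.net(frames)  # unnormalized scores per noise type


class DDAE(nn.Module):
    """Deep denoising autoencoder mapping noisy frames to clean-frame estimates."""
    def __init__(self, n_bins: int = 257, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_bins, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_bins),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.net(frames)


def enhance(frames: torch.Tensor, classifier: NoiseClassifier, ddaes: list) -> torch.Tensor:
    """Pick the noise-specific DDAE suggested by the classifier, then denoise."""
    with torch.no_grad():
        noise_type = classifier(frames).mean(dim=0).argmax().item()
        return ddaes[noise_type](frames)


# Illustrative usage with random stand-ins for the spectral frames of one utterance.
frames = torch.randn(100, 257)
classifier = NoiseClassifier()
ddaes = [DDAE(), DDAE()]  # e.g., one model per trained noise type
clean_estimate = enhance(frames, classifier, ddaes)
```

In such a pipeline, the classifier's utterance-level decision selects which noise-dependent DDAE cleans the frames; the enhanced features would then be resynthesized into a waveform and presented to the recipient's own clinical CI speech processor, as in the listening tests described above.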
Authors and Affiliations
1. Ying-Hui Lai (Department of Biomedical Engineering, National Yang-Ming University, Taipei, Taiwan)
2. Yu Tsao (Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan)
3. Xugang Lu (National Institute of Information and Communications Technology, Japan)
4. Fei Chen (Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China)
5. Yu-Ting Su (Department of Mechatronic Engineering, National Taiwan Normal University, Taipei, Taiwan)
6. Kuang-Chao Chen (Department of Otolaryngology, Cheng Hsin General Hospital, Taipei, Taiwan)
7. Yu-Hsuan Chen (Department of Internal Medicine, Cheng Hsin General Hospital, Taipei, Taiwan)
8. Li-Ching Chen (Department of Otolaryngology, Cheng Hsin General Hospital, Taipei, Taiwan)
9. Lieber Po-Hung Li (Faculty of Medicine, School of Medicine, National Yang Ming University, Taipei, Taiwan)
10. Chin-Hui Lee (School of Electrical and Computer Engineering, Georgia Institute of Technology, Georgia, USA)
DOI 10.1097/AUD.0000000000000537
Discipline Medicine
EISSN 1538-4667
ExternalDocumentID 29360687
Genre Research Support, Non-U.S. Gov't
Journal Article
ISICitedReferencesCount 58
ISSN 1538-4667
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
PMID 29360687
PublicationDate 2018-July/August
PublicationPlace United States
PublicationTitle Ear and hearing
PublicationTitleAlternate Ear Hear
PublicationYear 2018
StartPage 795
SubjectTerms Adult
Child
Cochlear Implantation
Cochlear Implants
Deafness - rehabilitation
Deep Learning
Female
Humans
Male
Middle Aged
Noise
Signal-To-Noise Ratio
Speech Perception
Young Adult
Title Deep Learning-Based Noise Reduction Approach to Improve Speech Intelligibility for Cochlear Implant Recipients
URI https://www.ncbi.nlm.nih.gov/pubmed/29360687
https://www.proquest.com/docview/1990851522
Volume 39