Bone-conducted speech enhancement using deep denoising autoencoder

Bone-conduction microphones (BCMs) capture speech signals based on the vibrations of the speaker's skull and exhibit better noise-resistance capabilities than normal air-conduction microphones (ACMs) when transmitting speech signals. Because BCMs only capture the low-frequency portion of speech...

Detailed description

Bibliographic details
Published in: Speech communication, vol. 104, pp. 106-112
Main authors: Liu, Hung-Ping; Tsao, Yu; Fuh, Chiou-Shann
Format: Journal Article
Language: English
Published: Amsterdam: Elsevier B.V, 1 November 2018
Elsevier Science Ltd
Subjects: Bone-conduction microphone; Deep denoising autoencoder
ISSN: 0167-6393, 1872-7182
Online access: Full text
Abstract Bone-conduction microphones (BCMs) capture speech signals based on the vibrations of the speaker's skull and exhibit better noise-resistance capabilities than normal air-conduction microphones (ACMs) when transmitting speech signals. Because BCMs only capture the low-frequency portion of speech signals, their frequency response is quite different from that of ACMs. When replacing an ACM with a BCM, we may obtain satisfactory results with respect to noise suppression, but the speech quality and intelligibility may be degraded due to the nature of the solid vibration. The mismatched characteristics of BCM and ACM can also impact the automatic speech recognition (ASR) performance, and it is infeasible to recreate a new ASR system using the voice data from BCMs. In this study, we propose a novel deep-denoising autoencoder (DDAE) approach to bridge BCM and ACM in order to improve speech quality and intelligibility, and the current ASR could be employed directly without recreating a new system. Experimental results first demonstrated that the DDAE approach can effectively improve speech quality and intelligibility based on standardized evaluation metrics. Moreover, our proposed system can significantly improve the ASR performance by a notable 48.28% relative character error rate (CER) reduction (from 14.50% to 7.50%) under quiet conditions. In an actual noisy environment (sound pressure from 61.7 dBA to 73.9 dBA), our proposed system with a BCM outperforms an ACM, yielding an 84.46% reduction in the relative CER (proposed system: 9.13% and ACM: 58.75%).
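The relative CER reductions quoted in the abstract follow from the absolute CER figures it reports. As a minimal check, assuming the usual definition of relative reduction, (baseline - improved) / baseline, the sketch below reproduces both percentages. The function name `relative_cer_reduction` is illustrative and does not come from the paper.

```python
def relative_cer_reduction(baseline_cer: float, improved_cer: float) -> float:
    """Return the relative CER reduction in percent.

    Assumed definition (not stated explicitly in the abstract):
    100 * (baseline - improved) / baseline.
    """
    if baseline_cer <= 0:
        raise ValueError("baseline CER must be positive")
    return 100.0 * (baseline_cer - improved_cer) / baseline_cer


# Quiet condition: BCM speech before and after DDAE processing (CER in %).
quiet = relative_cer_reduction(14.50, 7.50)

# Noisy condition: ACM baseline versus the proposed BCM-based system (CER in %).
noisy = relative_cer_reduction(58.75, 9.13)

print(f"quiet: {quiet:.2f}%")  # quiet: 48.28%
print(f"noisy: {noisy:.2f}%")  # noisy: 84.46%
```

Both results match the percentages given in the abstract, which suggests the reported reductions are relative to the higher CER in each comparison.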
Author Fuh, Chiou-Shann
Liu, Hung-Ping
Tsao, Yu
Author_xml – sequence: 1
  givenname: Hung-Ping
  surname: Liu
  fullname: Liu, Hung-Ping
  email: howardliu1223@gmail.com
  organization: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
– sequence: 2
  givenname: Yu
  surname: Tsao
  fullname: Tsao, Yu
  email: yu.tsao@citi.sinica.edu.tw
  organization: Research Center for Information Technology Innovation, Academia Sinica, Taiwan
– sequence: 3
  givenname: Chiou-Shann
  surname: Fuh
  fullname: Fuh, Chiou-Shann
  organization: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
ContentType Journal Article
Copyright 2018 Elsevier B.V.
Copyright Elsevier Science Ltd. Nov 2018
DOI 10.1016/j.specom.2018.06.002
DatabaseName CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Linguistics and Language Behavior Abstracts (LLBA)
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Discipline Languages & Literatures
Social Welfare & Social Work
Psychology
ExternalDocumentID 10_1016_j_specom_2018_06_002
S016763931730345X
ISICitedReferencesCount 46
IsPeerReviewed true
IsScholarly true
Keywords Bone-conduction microphone
Deep denoising autoencoder
Language English
PageCount 7
References Haykin (bib0014) 1995
Daniel, Tan (bib0003) 2017
Chen, Watanabe, Erdogan, Hershey (bib0002) 2015
Glorot, Bordes, Bengio (bib0012) 2011
Odelowo, Anderson (bib0031) 2017
Liu, Zhang, Acero, Droppo, Huang (bib0024) 2004
Lu, Tsao, Matsuda, Hori (bib0026) 2014
Taal, Hendriks, Heusdens, Jensen (bib0037) 2011; 19
van Hoesel (bib0040) 2005; 26
Weninger, Erdogan, Watanabe (bib0046) 2015
Xia, Bao (bib0047) 2014; 60
Sun, Du, Dai, Lee (bib0036) 2017
Huang, Li, Siniscalchi, Chen, Wu, Lee (bib0016) 2015
Kolbœk, Tan, Jensen (bib0018) 2016; 2016
Lai (bib0021) 2017; 64
.
Wang, Chen (bib0042) 2017
Santiago, Antonio, Joan (bib0032) 2017
Huang, M.W., 2005. Development of Taiwan Mandarin hearing in noise test. Master thesis, Department of speech language pathology and audiology, National Taipei University of Nursing and Health Sciences.
Zheng, Liu, Zhang, Sinclair, Droppo, Deng, Acero, Huang (bib0051) 2003
Xu, Du, Dai, Lee (bib0049) 2015; 23
Erdogan, Hershey, Watanabe, Le Roux (bib0006) 2015
Kuo, Yu, Yan (bib0019) 2015
Fu, Hu, Tsao, Lu (bib0009) 2017
Mimura, Sakai, Kawahara (bib0030) 2017
Wang, Narayanan, Wang (bib0045) 2014; 22
ANSI, 1997. American National Standard: Methods for Calculation of the Speech Intelligibility Index: Acoustical Society of America.
Flanagan (bib0007) 2013
Thang, Kimura, Unoki, Akagi (bib0039) 2006
Google 2017. Cloud Speech API
Lai (bib0020) 2015; 10
Shimamura, Tomikura (bib0033) 2005
Donahue, Li, Prabhavalkar (bib0005) 2017
Martens (bib0028) 2010
Fu, Wang, Tsao, Lu, Kawai (bib0010) 2018; 26
Hussain, Siniscalchi, Lee, Wang, Tsao, Liao (bib0017) 2017; 5
Zhang, Liu, Sinclair, Acero, Deng, Droppo, Huang, Zheng (bib0050) 2004
Loizou (bib0025) 2007
Li, Deng, Gong, Haeb-Umbach (bib0023) 2014; 22
Meng, Li, Gong, Juang (bib0029) 2018
Shivakumar, Georgiou (bib0035) 2016
Dekens, Verhelst, Capman, Beaugendre (bib0004) 2010
Lu, Tsao, Matsuda, Hori (bib0027) 2013
Shimamura, Mamiya, Tamiya (bib0034) 2006
Wand, Schmidhuber (bib0041) 2017
Fu, Tsao, Lu (bib0008) 2016
Chai, Du, Wang (bib0100) 2017
Graciarena, Franco, Sonmez, Bratt (bib0013) 2003; 10
Tajiri, Kameoka, Toda (bib0038) 2017
Wang, Wang (bib0044) 2012
Lai (bib0022) 2018
Wang, Rao, Sun, Xie, Chng, Li (bib0043) 2018
Xu, Du, Dai, Lee (bib0048) 2014; 21
Tajiri (10.1016/j.specom.2018.06.002_bib0038) 2017
Zhang (10.1016/j.specom.2018.06.002_bib0050) 2004
Fu (10.1016/j.specom.2018.06.002_bib0008) 2016
Kolbœk (10.1016/j.specom.2018.06.002_bib0018) 2016; 2016
Wang (10.1016/j.specom.2018.06.002_bib0042) 2017
Weninger (10.1016/j.specom.2018.06.002_bib0046) 2015
Fu (10.1016/j.specom.2018.06.002_bib0010) 2018; 26
Shivakumar (10.1016/j.specom.2018.06.002_bib0035) 2016
10.1016/j.specom.2018.06.002_bib0011
Lai (10.1016/j.specom.2018.06.002_bib0022) 2018
Li (10.1016/j.specom.2018.06.002_bib0023) 2014; 22
10.1016/j.specom.2018.06.002_bib0015
Wand (10.1016/j.specom.2018.06.002_bib0041) 2017
Zheng (10.1016/j.specom.2018.06.002_bib0051) 2003
Thang (10.1016/j.specom.2018.06.002_bib0039) 2006
Erdogan (10.1016/j.specom.2018.06.002_bib0006) 2015
Shimamura (10.1016/j.specom.2018.06.002_bib0033) 2005
Wang (10.1016/j.specom.2018.06.002_bib0043) 2018
Liu (10.1016/j.specom.2018.06.002_bib0024) 2004
Xu (10.1016/j.specom.2018.06.002_bib0049) 2015; 23
Sun (10.1016/j.specom.2018.06.002_bib0036) 2017
Loizou (10.1016/j.specom.2018.06.002_bib0025) 2007
Flanagan (10.1016/j.specom.2018.06.002_bib0007) 2013
Glorot (10.1016/j.specom.2018.06.002_bib0012) 2011
Chen (10.1016/j.specom.2018.06.002_bib0002) 2015
Hussain (10.1016/j.specom.2018.06.002_bib0017) 2017; 5
Lai (10.1016/j.specom.2018.06.002_bib0020) 2015; 10
Santiago (10.1016/j.specom.2018.06.002_bib0032) 2017
Taal (10.1016/j.specom.2018.06.002_bib0037) 2011; 19
Dekens (10.1016/j.specom.2018.06.002_bib0004) 2010
Lu (10.1016/j.specom.2018.06.002_bib0027) 2013
Graciarena (10.1016/j.specom.2018.06.002_bib0013) 2003; 10
Haykin (10.1016/j.specom.2018.06.002_bib0014) 1995
Donahue (10.1016/j.specom.2018.06.002_sbref0004) 2017
Xu (10.1016/j.specom.2018.06.002_bib0048) 2014; 21
Odelowo (10.1016/j.specom.2018.06.002_bib0031) 2017
Lu (10.1016/j.specom.2018.06.002_bib0026) 2014
Chai (10.1016/j.specom.2018.06.002_bib0100) 2017
Lai (10.1016/j.specom.2018.06.002_bib0021) 2017; 64
van Hoesel (10.1016/j.specom.2018.06.002_bib0040) 2005; 26
Daniel (10.1016/j.specom.2018.06.002_bib0003) 2017
Mimura (10.1016/j.specom.2018.06.002_bib0030) 2017
Wang (10.1016/j.specom.2018.06.002_bib0045) 2014; 22
Fu (10.1016/j.specom.2018.06.002_bib0009) 2017
Kuo (10.1016/j.specom.2018.06.002_bib0019) 2015
Meng (10.1016/j.specom.2018.06.002_bib0029) 2018
Xia (10.1016/j.specom.2018.06.002_bib0047) 2014; 60
Huang (10.1016/j.specom.2018.06.002_bib0016) 2015
10.1016/j.specom.2018.06.002_bib0001
Martens (10.1016/j.specom.2018.06.002_bib0028) 2010
Shimamura (10.1016/j.specom.2018.06.002_bib0034) 2006
Wang (10.1016/j.specom.2018.06.002_bib0044) 2012
References_xml – year: 2013
  ident: bib0007
  article-title: Speech Analysis Synthesis and Perception
– volume: 10
  year: 2015
  ident: bib0020
  article-title: Effects of adaptation rate and noise suppression on the intelligibility of compressed-envelope based speech
  publication-title: PloS One
– start-page: 3743
  year: 2016
  end-page: 3747
  ident: bib0035
  article-title: Perception optimized deep denoising autoencoders for speech enhancement
  publication-title: Proceedings of the Interspeech
– volume: 22
  start-page: 1849
  year: 2014
  end-page: 1858
  ident: bib0045
  article-title: On training targets for supervised speech separation
  publication-title: IEEE/ACM Trans. Audio Speech Lang. Process
– start-page: 628
  year: 2006
  end-page: 632
  ident: bib0034
  article-title: Improving bone-conducted speech quality via neural network
  publication-title: Proceedings of the ISSPIT
– start-page: 224
  year: 2012
  end-page: 232
  ident: bib0044
  article-title: Cocktail party processing via structured prediction
  publication-title: Proceedings of the NIPS
– volume: 26
  start-page: 1570
  year: 2018
  end-page: 1584
  ident: bib0010
  article-title: End-to-end waveform utterance enhancement for direct evaluation metrics optimization by fully convolutional neural networks
  publication-title: IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
– start-page: 137
  year: 2015
  end-page: 140
  ident: bib0019
  article-title: The bone conduction microphone parameter measurement architecture and its speech recognition performance analysis
  publication-title: Proceedings of the JIMET
– volume: 5
  start-page: 25542
  year: 2017
  end-page: 25554
  ident: bib0017
  article-title: Experimental study on extreme learning machine applications for speech enhancement
  publication-title: IEEE Access
– year: 2017
  ident: bib0042
  article-title: Supervised Speech Separation Based on Deep Learning: An Overview
– start-page: 407
  year: 2006
  end-page: 417
  ident: bib0039
  article-title: A study on restoration of bone-conducted speech with MTF-based and LP-based models
  publication-title: J. Signal Process.
– start-page: 885
  year: 2014
  end-page: 889
  ident: bib0026
  article-title: Ensemble modeling of denoising autoencoder for speech spectrum restoration
  publication-title: Proceedings of the Interspeech
– year: 2007
  ident: bib0025
  article-title: Speech Enhancement: Theory and Practice
– year: 2017
  ident: bib0030
  article-title: Cross-domain speech recognition using nonparallel corpora with cycle-consistent adversarial networks
  publication-title: Proceedings of the ASRU
– start-page: 1978
  year: 2010
  end-page: 1982
  ident: bib0004
  article-title: Improved speech recognition in noisy environments by using a throat microphone for accurate voicing detection
  publication-title: Proceedings of the EUSIPCO
– start-page: 363
  year: 2004
  end-page: 366
  ident: bib0024
  article-title: Direct filtering for air- and bone-conductive microphones
  publication-title: Proceedings of the MMSP
– volume: 22
  start-page: 745
  year: 2014
  end-page: 777
  ident: bib0023
  article-title: An overview of noise-robust automatic speech recognition
  publication-title: IEEE/ACM Trans. Audio Speech Lang. Process.
– year: 2015
  ident: bib0016
  article-title: Rapid adaptation for deep neural networks through multi-task learning
  publication-title: Proceedings of the Interspeech 2015
– reference: Google 2017. Cloud Speech API,
– volume: 19
  start-page: 2125
  year: 2011
  end-page: 2136
  ident: bib0037
  article-title: An algorithm for intelligibility prediction of time–frequency weighted noisy speech
  publication-title: IEEE Trans. Audio Speech Lang. Process.
– start-page: 4960
  year: 2017
  end-page: 4964
  ident: bib0038
  article-title: A noise suppression method for body-conducted soft speech based on non-negative tensor factorization of air- and body-conducted signals
  publication-title: Proceeding of the ICASSP
– year: 2018
  ident: bib0022
  article-title: Deep learning-based noise reduction approach to improve speech intelligibility for cochlear implant recipients
  publication-title: Ear Hear
– start-page: 136
  year: 2017
  end-page: 140
  ident: bib0036
  article-title: Multiple-target deep learning for LSTM-RNN based speech enhancement
  publication-title: Proceedings of the HSCMA
– start-page: 708
  year: 2015
  end-page: 712
  ident: bib0006
  article-title: Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks
  publication-title: Proceedings of the ICASSP
– start-page: 249
  year: 2003
  end-page: 254
  ident: bib0051
  article-title: Air- and bone-conductive inte-grated microphones for robust speech detection and enhancement
  publication-title: Proceedings of the ASRU
– start-page: 3642
  year: 2017
  end-page: 3646
  ident: bib0032
  article-title: SEGAN: Speech enhancement generative adversarial network
  publication-title: Interspeech
– year: 2017
  ident: bib0100
  article-title: Gaussian density guided deep neural network for single-channel speech enhancement
  publication-title: IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), 2017
– volume: 60
  start-page: 13
  year: 2014
  end-page: 29
  ident: bib0047
  article-title: Wiener filtering based speech enhancement with weighted denoising auto-encoder and noise classification
  publication-title: Speech Commun.
– volume: 26
  start-page: 381
  year: 2005
  end-page: 388
  ident: bib0040
  article-title: Amplitude-mapping effects on speech intelligibility with unilateral and bilateral cochlear implants
  publication-title: Ear Hear.
– year: 2017
  ident: bib0009
  article-title: Complex spectrogram enhancement by convolutional neural network with multi-metrics learning
  publication-title: Proceedings of the MLSP
– volume: 64
  start-page: 1568
  year: 2017
  end-page: 1578
  ident: bib0021
  article-title: A deep denoising autoencoder approach to improving the intelligibility of vocoded speech in cochlear implant simulation
  publication-title: IEEE Trans. Biomed. Eng.
– year: 2018
  ident: bib0029
  article-title: Adversarial teacher-student learning for unsupervised domain adaptation
  publication-title: Proceedings of the ICASSP
– start-page: 735
  year: 2010
  end-page: 742
  ident: bib0028
  article-title: Deep learning via Hessian-free optimization
  publication-title: Proceedings of the ICML
– start-page: 781
  year: 2004
  end-page: 784
  ident: bib0050
  article-title: Multi-sensory microphones for robust speech detection, enhancement, and recognition
  publication-title: Proceedings of the ICASSP
– year: 1995
  ident: bib0014
  article-title: Advances in Spectrum Analysis and Array Processing, 3
– volume: 2016
  start-page: 305
  year: 2016
  end-page: 311
  ident: bib0018
  article-title: Speech enhancement using long short-term memory based recurrent neural networks for noise robust speaker verification
  publication-title: Proceedings of the SLT
– reference: ANSI, 1997. American National Standard: Methods for Calculation of the Speech Intelligibility Index: Acoustical Society of America.
– start-page: 3274
  year: 2015
  end-page: 3278
  ident: bib0002
  article-title: Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks
  publication-title: Proceedings of the Interspeech
– start-page: 3768
  year: 2016
  end-page: 3772
  ident: bib0008
  article-title: SNR-aware convolutional neural network modeling for speech enhancement
  publication-title: Proceedings of the Interspeech
– volume: 21
  start-page: 65
  year: 2014
  end-page: 68
  ident: bib0048
  article-title: An experimental study on speech enhancement based on deep neural networks
  publication-title: IEEE Signal Process. Lett.
– start-page: 436
  year: 2013
  end-page: 440
  ident: bib0027
  article-title: Speech enhancement based on deep denoising autoencoder
  publication-title: Proceedings of the Interspeech
– start-page: 200
  year: 2017
  end-page: 204
  ident: bib0031
  article-title: Speech enhancement using extreme learning machines
  publication-title: Proceedings of the WASPAA
– reference: .
– start-page: 2008
  year: 2017
  end-page: 2012
  ident: bib0003
  article-title: Conditional generative adversarial networks for speech enhancement and noise-robust speaker verification
  publication-title: Proceedings of the Interspeech
– volume: 23
  start-page: 7
  year: 2015
  end-page: 19
  ident: bib0049
  article-title: A regression approach to speech enhancement based on deep neural networks,
  publication-title: IEEE/ACM Trans. Audio Speech Lang. Process
– start-page: 315
  year: 2011
  end-page: 323
  ident: bib0012
  article-title: Deep sparse rectifier neural networks
  publication-title: Proceedings of the AISTATS
– volume: 10
  start-page: 72
  year: 2003
  end-page: 74
  ident: bib0013
  article-title: Combining standard and throat microphones for robust speech recognition
  publication-title: IEEE Signal Process. Lett.
– year: 2018
  ident: bib0043
  article-title: Unsupervised domain adaptation via domain adversarial training for speaker recognition
  publication-title: Proceedings of the ICASSP.
– start-page: 1
  year: 2005
  end-page: 4
  ident: bib0033
  article-title: Quality improvement of bone-conducted speech
  publication-title: Proceedings of the ECCTD
– start-page: 91
  year: 2015
  end-page: 99
  ident: bib0046
  article-title: , Speech enhancement with LSTM recurrent neural networks and its application to noise-robust ASR
  publication-title: Proceedings of the LVA/ICA
– year: 2017
  ident: bib0005
  article-title: Exploring speech enhancement with generative adversarial networks for robust speech recognition
– year: 2017
  ident: bib0041
  article-title: Improving Speaker-Independent Lipreading with Domain-Adversarial Training
– reference: Huang, M.W., 2005. Development of Taiwan Mandarin hearing in noise test. Master thesis, Department of speech language pathology and audiology, National Taipei University of Nursing and Health Sciences.
– year: 2018
  ident: 10.1016/j.specom.2018.06.002_bib0022
  article-title: Deep learning-based noise reduction approach to improve speech intelligibility for cochlear implant recipients
  publication-title: Ear Hear
  doi: 10.1097/AUD.0000000000000537
– volume: 26
  start-page: 1570
  issue: 9
  year: 2018
  ident: 10.1016/j.specom.2018.06.002_bib0010
  article-title: End-to-end waveform utterance enhancement for direct evaluation metrics optimization by fully convolutional neural networks
  publication-title: IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
  doi: 10.1109/TASLP.2018.2821903
– volume: 22
  start-page: 1849
  issue: 12
  year: 2014
  ident: 10.1016/j.specom.2018.06.002_bib0045
  article-title: On training targets for supervised speech separation
  publication-title: IEEE/ACM Trans. Audio Speech Lang. Process
  doi: 10.1109/TASLP.2014.2352935
– volume: 22
  start-page: 745
  issue: 4
  year: 2014
  ident: 10.1016/j.specom.2018.06.002_bib0023
  article-title: An overview of noise-robust automatic speech recognition
  publication-title: IEEE/ACM Trans. Audio Speech Lang. Process.
  doi: 10.1109/TASLP.2014.2304637
– volume: 21
  start-page: 65
  issue: 1
  year: 2014
  ident: 10.1016/j.specom.2018.06.002_bib0048
  article-title: An experimental study on speech enhancement based on deep neural networks
  publication-title: IEEE Signal Process. Lett.
  doi: 10.1109/LSP.2013.2291240
– start-page: 200
  year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0031
  article-title: Speech enhancement using extreme learning machines
– start-page: 436
  year: 2013
  ident: 10.1016/j.specom.2018.06.002_bib0027
  article-title: Speech enhancement based on deep denoising autoencoder
– volume: 10
  year: 2015
  ident: 10.1016/j.specom.2018.06.002_bib0020
  article-title: Effects of adaptation rate and noise suppression on the intelligibility of compressed-envelope based speech
  publication-title: PloS One
  doi: 10.1371/journal.pone.0133519
– year: 2017
  ident: 10.1016/j.specom.2018.06.002_sbref0004
  article-title: Exploring speech enhancement with generative adversarial networks for robust speech recognition
– start-page: 628
  year: 2006
  ident: 10.1016/j.specom.2018.06.002_bib0034
  article-title: Improving bone-conducted speech quality via neural network
– start-page: 407
  year: 2006
  ident: 10.1016/j.specom.2018.06.002_bib0039
  article-title: A study on restoration of bone-conducted speech with MTF-based and LP-based models
  publication-title: J. Signal Process.
– year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0042
– start-page: 708
  year: 2015
  ident: 10.1016/j.specom.2018.06.002_bib0006
  article-title: Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks
– year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0041
– year: 1995
  ident: 10.1016/j.specom.2018.06.002_bib0014
– start-page: 315
  year: 2011
  ident: 10.1016/j.specom.2018.06.002_bib0012
  article-title: Deep sparse rectifier neural networks
– start-page: 1978
  year: 2010
  ident: 10.1016/j.specom.2018.06.002_bib0004
  article-title: Improved speech recognition in noisy environments by using a throat microphone for accurate voicing detection
– volume: 60
  start-page: 13
  year: 2014
  ident: 10.1016/j.specom.2018.06.002_bib0047
  article-title: Wiener filtering based speech enhancement with weighted denoising auto-encoder and noise classification
  publication-title: Speech Commun.
  doi: 10.1016/j.specom.2014.02.001
– start-page: 137
  year: 2015
  ident: 10.1016/j.specom.2018.06.002_bib0019
  article-title: The bone conduction microphone parameter measurement architecture and its speech recognition performance analysis
– year: 2013
  ident: 10.1016/j.specom.2018.06.002_bib0007
– ident: 10.1016/j.specom.2018.06.002_bib0011
– year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0100
  article-title: Gaussian density guided deep neural network for single-channel speech enhancement
– ident: 10.1016/j.specom.2018.06.002_bib0015
– year: 2018
  ident: 10.1016/j.specom.2018.06.002_bib0029
  article-title: Adversarial teacher-student learning for unsupervised domain adaptation
– ident: 10.1016/j.specom.2018.06.002_bib0001
– volume: 23
  start-page: 7
  year: 2015
  ident: 10.1016/j.specom.2018.06.002_bib0049
  article-title: A regression approach to speech enhancement based on deep neural networks,
  publication-title: IEEE/ACM Trans. Audio Speech Lang. Process
  doi: 10.1109/TASLP.2014.2364452
– year: 2007
  ident: 10.1016/j.specom.2018.06.002_bib0025
– year: 2018
  ident: 10.1016/j.specom.2018.06.002_bib0043
  article-title: Unsupervised domain adaptation via domain adversarial training for speaker recognition
– volume: 64
  start-page: 1568
  issue: 7
  year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0021
  article-title: A deep denoising autoencoder approach to improving the intelligibility of vocoded speech in cochlear implant simulation
  publication-title: IEEE Trans. Biomed. Eng.
  doi: 10.1109/TBME.2016.2613960
– start-page: 3768
  year: 2016
  ident: 10.1016/j.specom.2018.06.002_bib0008
  article-title: SNR-aware convolutional neural network modeling for speech enhancement
  doi: 10.21437/Interspeech.2016-211
– start-page: 3274
  year: 2015
  ident: 10.1016/j.specom.2018.06.002_bib0002
  article-title: Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks
– start-page: 136
  year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0036
  article-title: Multiple-target deep learning for LSTM-RNN based speech enhancement
– start-page: 363
  year: 2004
  ident: 10.1016/j.specom.2018.06.002_bib0024
  article-title: Direct filtering for air- and bone-conductive microphones
– year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0030
  article-title: Cross-domain speech recognition using nonparallel corpora with cycle-consistent adversarial networks
– volume: 2016
  start-page: 305
  year: 2016
  ident: 10.1016/j.specom.2018.06.002_bib0018
  article-title: Speech enhancement using long short-term memory based recurrent neural networks for noise robust speaker verification
– year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0009
  article-title: Complex spectrogram enhancement by convolutional neural network with multi-metrics learning
– start-page: 91
  year: 2015
  ident: 10.1016/j.specom.2018.06.002_bib0046
  article-title: Speech enhancement with LSTM recurrent neural networks and its application to noise-robust ASR
– start-page: 885
  year: 2014
  ident: 10.1016/j.specom.2018.06.002_bib0026
  article-title: Ensemble modeling of denoising autoencoder for speech spectrum restoration
– start-page: 249
  year: 2003
  ident: 10.1016/j.specom.2018.06.002_bib0051
  article-title: Air- and bone-conductive integrated microphones for robust speech detection and enhancement
– start-page: 2008
  year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0003
  article-title: Conditional generative adversarial networks for speech enhancement and noise-robust speaker verification
– volume: 5
  start-page: 25542
  year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0017
  article-title: Experimental study on extreme learning machine applications for speech enhancement
  publication-title: IEEE Access
  doi: 10.1109/ACCESS.2017.2766675
– volume: 26
  start-page: 381
  year: 2005
  ident: 10.1016/j.specom.2018.06.002_bib0040
  article-title: Amplitude-mapping effects on speech intelligibility with unilateral and bilateral cochlear implants
  publication-title: Ear Hear.
  doi: 10.1097/00003446-200508000-00002
– start-page: 4960
  year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0038
  article-title: A noise suppression method for body-conducted soft speech based on non-negative tensor factorization of air- and body-conducted signals
– volume: 10
  start-page: 72
  issue: 3
  year: 2003
  ident: 10.1016/j.specom.2018.06.002_bib0013
  article-title: Combining standard and throat microphones for robust speech recognition
  publication-title: IEEE Signal Process. Lett.
  doi: 10.1109/LSP.2003.808549
– start-page: 781
  year: 2004
  ident: 10.1016/j.specom.2018.06.002_bib0050
  article-title: Multi-sensory microphones for robust speech detection, enhancement, and recognition
– start-page: 3743
  year: 2016
  ident: 10.1016/j.specom.2018.06.002_bib0035
  article-title: Perception optimized deep denoising autoencoders for speech enhancement
  doi: 10.21437/Interspeech.2016-1284
– start-page: 735
  year: 2010
  ident: 10.1016/j.specom.2018.06.002_bib0028
  article-title: Deep learning via Hessian-free optimization
– volume: 19
  start-page: 2125
  issue: 7
  year: 2011
  ident: 10.1016/j.specom.2018.06.002_bib0037
  article-title: An algorithm for intelligibility prediction of time–frequency weighted noisy speech
  publication-title: IEEE Trans. Audio Speech Lang. Process.
  doi: 10.1109/TASL.2011.2114881
– start-page: 1
  year: 2005
  ident: 10.1016/j.specom.2018.06.002_bib0033
  article-title: Quality improvement of bone-conducted speech
– start-page: 224
  year: 2012
  ident: 10.1016/j.specom.2018.06.002_bib0044
  article-title: Cocktail party processing via structured prediction
– start-page: 3642
  year: 2017
  ident: 10.1016/j.specom.2018.06.002_bib0032
  article-title: SEGAN: Speech enhancement generative adversarial network
– year: 2015
  ident: 10.1016/j.specom.2018.06.002_bib0016
  article-title: Rapid adaptation for deep neural networks through multi-task learning
StartPage 106
SubjectTerms Automatic speech recognition
Bone-conduction microphone
Bones
Deep denoising autoencoder
Denoising
Frequencies
Frequency response
Intelligibility
Microphones
Noise
Noise reduction
Resistance
Skull
Sound pressure
Speech
Speech enhancement
Speech processing
Speech recognition
Speeches
Vibrations
Voice recognition
Title Bone-conducted speech enhancement using deep denoising autoencoder
URI https://dx.doi.org/10.1016/j.specom.2018.06.002
https://www.proquest.com/docview/2154227666
Volume 104