Bone-conducted speech enhancement using deep denoising autoencoder
Bone-conduction microphones (BCMs) capture speech signals based on the vibrations of the speaker's skull and exhibit better noise-resistance capabilities than normal air-conduction microphones (ACMs) when transmitting speech signals. Because BCMs only capture the low-frequency portion of speech...
| Published in: | Speech Communication, Vol. 104, pp. 106-112 |
|---|---|
| Main authors: | Liu, Hung-Ping; Tsao, Yu; Fuh, Chiou-Shann |
| Format: | Journal Article |
| Language: | English |
| Published: | Amsterdam: Elsevier B.V.; Elsevier Science Ltd, 1 November 2018 |
| Subjects: | Bone-conduction microphone; Deep denoising autoencoder |
| ISSN: | 0167-6393, 1872-7182 |
| Online access: | Full text |
| Abstract | Bone-conduction microphones (BCMs) capture speech signals based on the vibrations of the speaker's skull and exhibit better noise-resistance capabilities than normal air-conduction microphones (ACMs) when transmitting speech signals. Because BCMs only capture the low-frequency portion of speech signals, their frequency response is quite different from that of ACMs. When replacing an ACM with a BCM, we may obtain satisfactory results with respect to noise suppression, but the speech quality and intelligibility may be degraded due to the nature of the solid vibration. The mismatched characteristics of BCM and ACM can also impact the automatic speech recognition (ASR) performance, and it is infeasible to recreate a new ASR system using the voice data from BCMs. In this study, we propose a novel deep-denoising autoencoder (DDAE) approach to bridge BCM and ACM in order to improve speech quality and intelligibility, and the current ASR could be employed directly without recreating a new system. Experimental results first demonstrated that the DDAE approach can effectively improve speech quality and intelligibility based on standardized evaluation metrics. Moreover, our proposed system can significantly improve the ASR performance by a notable 48.28% relative character error rate (CER) reduction (from 14.50% to 7.50%) under quiet conditions. In an actual noisy environment (sound pressure from 61.7 dBA to 73.9 dBA), our proposed system with a BCM outperforms an ACM, yielding an 84.46% reduction in the relative CER (proposed system: 9.13% and ACM: 58.75%). |
|---|---|
| Author | Fuh, Chiou-Shann; Liu, Hung-Ping; Tsao, Yu |
| Author_xml | – sequence: 1 givenname: Hung-Ping surname: Liu fullname: Liu, Hung-Ping email: howardliu1223@gmail.com organization: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan – sequence: 2 givenname: Yu surname: Tsao fullname: Tsao, Yu email: yu.tsao@citi.sinica.edu.tw organization: Research Center for Information Technology Innovation, Academia Sinica, Taiwan – sequence: 3 givenname: Chiou-Shann surname: Fuh fullname: Fuh, Chiou-Shann organization: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan |
| CitedBy_id | crossref_primary_10_1016_j_apacoust_2025_110924 crossref_primary_10_1121_10_0010316 crossref_primary_10_1016_j_apacoust_2022_109058 crossref_primary_10_3390_s21051878 crossref_primary_10_1016_j_apacoust_2023_109576 crossref_primary_10_1109_TIFS_2023_3346647 crossref_primary_10_3390_s23010035 crossref_primary_10_1016_j_specom_2025_103223 crossref_primary_10_1016_j_sigpro_2024_109615 crossref_primary_10_1186_s13636_025_00418_1 crossref_primary_10_3390_a16030153 crossref_primary_10_2139_ssrn_5037701 crossref_primary_10_1155_2022_4473952 crossref_primary_10_1145_3699757 crossref_primary_10_1109_LSP_2020_3000968 crossref_primary_10_1109_TASLP_2020_2976193 crossref_primary_10_1007_s00034_024_02733_y crossref_primary_10_1109_TNSRE_2020_3042655 crossref_primary_10_1109_JSEN_2025_3585068 crossref_primary_10_3390_s20185050 crossref_primary_10_1109_ACCESS_2024_3414435 crossref_primary_10_3233_JIFS_201014 crossref_primary_10_58399_TGMU6907 crossref_primary_10_1016_j_asoc_2022_109618 crossref_primary_10_1109_LSP_2023_3347149 crossref_primary_10_1109_TASLP_2023_3313433 crossref_primary_10_1007_s12652_021_03222_9 crossref_primary_10_1109_ACCESS_2025_3564137 crossref_primary_10_1121_10_0028339 crossref_primary_10_1007_s12652_020_02598_4 crossref_primary_10_1109_TASLP_2023_3337988 crossref_primary_10_1007_s11277_021_08313_6 crossref_primary_10_1109_ACCESS_2020_3021061 crossref_primary_10_1016_j_apacoust_2024_110293 crossref_primary_10_1541_ieejeiss_145_693 crossref_primary_10_1109_ACCESS_2020_2970143 |
| Cites_doi | 10.1097/AUD.0000000000000537 10.1109/TASLP.2018.2821903 10.1109/TASLP.2014.2352935 10.1109/TASLP.2014.2304637 10.1109/LSP.2013.2291240 10.1371/journal.pone.0133519 10.1016/j.specom.2014.02.001 10.1109/TASLP.2014.2364452 10.1109/TBME.2016.2613960 10.21437/Interspeech.2016-211 10.1109/ACCESS.2017.2766675 10.1097/00003446-200508000-00002 10.1109/LSP.2003.808549 10.21437/Interspeech.2016-1284 10.1109/TASL.2011.2114881 |
| ContentType | Journal Article |
| Copyright | 2018 Elsevier B.V. Copyright Elsevier Science Ltd. Nov 2018 |
| Copyright_xml | – notice: 2018 Elsevier B.V. – notice: Copyright Elsevier Science Ltd. Nov 2018 |
| DBID | AAYXX CITATION 7SC 7SP 7T9 8FD JQ2 L7M L~C L~D |
| DOI | 10.1016/j.specom.2018.06.002 |
| DatabaseName | CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Linguistics and Language Behavior Abstracts (LLBA) Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Linguistics and Language Behavior Abstracts (LLBA) Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Languages & Literatures; Social Welfare & Social Work; Psychology |
| EISSN | 1872-7182 |
| EndPage | 112 |
| ExternalDocumentID | 10_1016_j_specom_2018_06_002 S016763931730345X |
| ID | FETCH-LOGICAL-c400t-2b0ecf6d5ae877a817a83a64a1657406630045c217b73bf5eca909bdeecaaf4f3 |
| ISICitedReferencesCount | 46 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000455419500011&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0167-6393 |
| IngestDate | Sun Nov 09 12:51:53 EST 2025 Tue Nov 18 22:23:30 EST 2025 Sat Nov 29 06:17:49 EST 2025 Fri Feb 23 02:28:30 EST 2024 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Bone-conduction microphone; Deep denoising autoencoder |
| Language | English |
| LinkModel | OpenURL |
| MergedId | FETCHMERGED-LOGICAL-c400t-2b0ecf6d5ae877a817a83a64a1657406630045c217b73bf5eca909bdeecaaf4f3 |
| PQID | 2154227666 |
| PQPubID | 2038294 |
| PageCount | 7 |
| ParticipantIDs | proquest_journals_2154227666 crossref_primary_10_1016_j_specom_2018_06_002 crossref_citationtrail_10_1016_j_specom_2018_06_002 elsevier_sciencedirect_doi_10_1016_j_specom_2018_06_002 |
| PublicationCentury | 2000 |
| PublicationDate | November 2018 2018-11-00 20181101 |
| PublicationDateYYYYMMDD | 2018-11-01 |
| PublicationDate_xml | – month: 11 year: 2018 text: November 2018 |
| PublicationDecade | 2010 |
| PublicationPlace | Amsterdam |
| PublicationPlace_xml | – name: Amsterdam |
| PublicationTitle | Speech communication |
| PublicationYear | 2018 |
| Publisher | Elsevier B.V Elsevier Science Ltd |
| Publisher_xml | – name: Elsevier B.V – name: Elsevier Science Ltd |
| SourceID | proquest; crossref; elsevier |
| SourceType | Aggregation Database; Enrichment Source; Index Database; Publisher |
| StartPage | 106 |
| SubjectTerms | Automatic speech recognition; Bone-conduction microphone; Bones; Deep denoising autoencoder; Denoising; Frequencies; Frequency response; Intelligibility; Microphones; Noise; Noise reduction; Resistance; Skull; Sound pressure; Speech; Speech enhancement; Speech processing; Speech recognition; Speeches; Vibrations; Voice recognition |
| Title | Bone-conducted speech enhancement using deep denoising autoencoder |
| URI | https://dx.doi.org/10.1016/j.specom.2018.06.002 https://www.proquest.com/docview/2154227666 |
| Volume | 104 |
| Citation | Liu, Hung-Ping; Tsao, Yu; Fuh, Chiou-Shann. Bone-conducted speech enhancement using deep denoising autoencoder. Speech communication, vol. 104, pp. 106-112, 2018-11-01. Elsevier B.V. ISSN 0167-6393, eISSN 1872-7182. doi: 10.1016/j.specom.2018.06.002 |