Optimisation of phonetic aware speech recognition through multi-objective evolutionary algorithms
| Published in: | Expert systems with applications Vol. 153; p. 113402 |
|---|---|
| Main Authors: | Bird, Jordan J.; Wanner, Elizabeth; Ekárt, Anikó; Faria, Diego R. |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Elsevier Ltd, 01.09.2020; Elsevier BV |
| Subjects: | |
| ISSN: | 0957-4174, 1873-6793 |
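The abstract in this record describes scalarisation: accuracy and resource usage are combined into a single fitness value, with weightings from 0.1 to 0.9. The sketch below is a minimal illustration of weighted-sum scalarisation, not the authors' implementation. The fitness formula, the normalisation of training time by the slowest candidate, and all names (`Candidate`, `scalarised_fitness`, `best_candidate`) are assumptions made for illustration. The three accuracy and training-time pairs are taken from the abstract, reading the 69-second figure as the multi-objective result, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float       # classification accuracy as a fraction in [0, 1]
    train_seconds: float  # training time, used here as the resource cost


def scalarised_fitness(c: Candidate, w_accuracy: float, max_seconds: float) -> float:
    """Weighted-sum scalarisation (assumed form): reward accuracy, penalise cost."""
    if not 0.0 <= w_accuracy <= 1.0:
        raise ValueError("w_accuracy must lie in [0, 1]")
    cost = c.train_seconds / max_seconds  # normalise cost to (0, 1]
    return w_accuracy * c.accuracy - (1.0 - w_accuracy) * cost


def best_candidate(candidates: list[Candidate], w_accuracy: float) -> Candidate:
    """Return the candidate with the highest scalarised fitness."""
    max_seconds = max(c.train_seconds for c in candidates)
    return max(candidates, key=lambda c: scalarised_fitness(c, w_accuracy, max_seconds))


# Figures quoted in the abstract; the labels are hypothetical.
CANDIDATES = [
    Candidate("single-objective DNN", 0.9077, 248.0),
    Candidate("HMM, 150 hidden units", 0.8623, 160.0),
    Candidate("multi-objective solution", 0.8673, 69.0),
]

if __name__ == "__main__":
    for w in (0.1, 0.5, 0.9, 1.0):
        print(w, best_candidate(CANDIDATES, w).name)
```

Under these assumptions, the slower but more accurate network is selected only when the weight on accuracy is very close to 1. A different normalisation or cost measure would move that threshold.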
| Abstract | Recent advances in the availability of computational resources allow for more sophisticated approaches to speech recognition than ever before. This study considers Artificial Neural Network and Hidden Markov Model methods of classification for Human Speech Recognition through Diphthong Vowel sounds in the English Phonetic Alphabet, rather than the classical approach of classifying whole words and phrases, with a specific focus on both single and multi-objective evolutionary optimisation of bioinspired classification methods. Audio clips are recorded by subjects from the United Kingdom and Mexico, and the recordings are transformed into a static dataset of statistics by way of their Mel-Frequency Cepstral Coefficients (MFCC) with a sliding window length of 200 ms, as well as into a reshaped MFCC time-series format for forecast-based models. A deep neural network with evolutionarily optimised topology achieves 90.77% phoneme classification accuracy, compared with 86.23% for the best HMM (150 hidden units), when only accuracy is considered in a single-objective optimisation approach. The resulting networks are far more complex than the HMM, taking around 248 seconds to train on powerful hardware versus 160 seconds for the HMM. For this reason, a multi-objective approach is explored. In the multi-objective scalarisation approaches presented, in which real-time resource usage also contributes to solution fitness, solutions are produced that train far more quickly than the forecast approach (69 seconds) while retaining classification ability (86.73%). Weightings from 0.1 to 0.9 towards either maximising accuracy or reducing resource usage are suggested depending on the resources available, since many future IoT devices and autonomous robots may have only limited, premium-priced access to cloud resources in comparison to the GPU used in this experiment. |
|---|---|
| ArticleNumber | 113402 |
| Author | Ekárt, Anikó; Bird, Jordan J.; Wanner, Elizabeth; Faria, Diego R. |
| Author_xml | – sequence: 1 givenname: Jordan J. orcidid: 0000-0002-9858-1231 surname: Bird fullname: Bird, Jordan J. email: birdj1@aston.ac.uk organization: Aston Robotics, Vision and Intelligent Systems (ARVIS), United Kingdom – sequence: 2 givenname: Elizabeth surname: Wanner fullname: Wanner, Elizabeth email: efwanner@decom.cefetmg.br organization: Computer Science Department, School of Engineering and Applied Science, Aston University, United Kingdom – sequence: 3 givenname: Anikó surname: Ekárt fullname: Ekárt, Anikó email: a.ekart@aston.ac.uk organization: Computer Science Department, School of Engineering and Applied Science, Aston University, United Kingdom – sequence: 4 givenname: Diego R. orcidid: 0000-0002-2771-1713 surname: Faria fullname: Faria, Diego R. email: d.faria@aston.ac.uk organization: Aston Robotics, Vision and Intelligent Systems (ARVIS), United Kingdom |
| CitedBy_id | crossref_primary_10_1080_15397734_2020_1781655 crossref_primary_10_1007_s11042_022_13594_5 crossref_primary_10_1016_j_dsp_2022_103450 crossref_primary_10_1155_2022_7281892 crossref_primary_10_1109_ACCESS_2020_3034762 crossref_primary_10_1016_j_ejor_2022_08_032 crossref_primary_10_1049_tje2_12082 crossref_primary_10_1007_s11042_023_16438_y crossref_primary_10_3390_bdcc8120195 crossref_primary_10_1016_j_sasc_2025_200304 |
| Cites_doi | 10.1073/pnas.81.10.3088 10.1021/ci00027a006 10.3390/computers8040076 10.1080/01621459.1937.10503522 10.1109/TASL.2007.913036 10.1016/j.csda.2008.03.024 10.1109/5.18626 10.1214/aoms/1177729694 10.1515/jisys-2018-0372 10.1126/science.270.5234.303 10.1109/TASLP.2014.2367814 10.1109/4235.585893 10.1121/1.1915893 10.1214/aoms/1177699147 10.1016/j.ipm.2008.09.003 10.1016/j.specom.2011.11.004 10.1016/j.eswa.2019.112840 10.1162/106365602320169811 10.1016/j.jpdc.2017.09.006 10.1007/s10710-018-9339-y 10.1136/jamia.2000.0070462 10.1038/nature14539 |
| ContentType | Journal Article |
| Copyright | 2020 Elsevier Ltd Copyright Elsevier BV Sep 1, 2020 |
| Copyright_xml | – notice: 2020 Elsevier Ltd – notice: Copyright Elsevier BV Sep 1, 2020 |
| DBID | AAYXX CITATION 7SC 7T9 8FD JQ2 L7M L~C L~D |
| DOI | 10.1016/j.eswa.2020.113402 |
| DatabaseName | CrossRef Computer and Information Systems Abstracts Linguistics and Language Behavior Abstracts (LLBA) Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic ProQuest Computer Science Collection Computer and Information Systems Abstracts Linguistics and Language Behavior Abstracts (LLBA) Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science; Statistics |
| EISSN | 1873-6793 |
| ExternalDocumentID | 10_1016_j_eswa_2020_113402 S0957417420302268 |
| GroupedDBID | --K --M .DC .~1 0R~ 13V 1B1 1RT 1~. 1~5 4.4 457 4G. 5GY 5VS 7-5 71M 8P~ 9JN 9JO AAAKF AABNK AACTN AAEDT AAEDW AAIAV AAIKJ AAKOC AALRI AAOAW AAQFI AARIN AAXUO AAYFN ABBOA ABFNM ABMAC ABMVD ABUCO ABYKQ ACDAQ ACGFS ACHRH ACNTT ACRLP ACZNC ADBBV ADEZE ADTZH AEBSH AECPX AEKER AENEX AFKWA AFTJW AGHFR AGJBL AGUBO AGUMN AGYEJ AHHHB AHJVU AHZHX AIALX AIEXJ AIKHN AITUG AJOXV ALEQD ALMA_UNASSIGNED_HOLDINGS AMFUW AMRAJ AOUOD APLSM AXJTR BJAXD BKOJK BLXMC BNSAS CS3 DU5 EBS EFJIC EFLBG EO8 EO9 EP2 EP3 F5P FDB FIRID FNPLU FYGXN G-Q GBLVA GBOLZ HAMUX IHE J1W JJJVA KOM LG9 LY1 LY7 M41 MO0 N9A O-L O9- OAUVE OZT P-8 P-9 P2P PC. PQQKQ Q38 ROL RPZ SDF SDG SDP SDS SES SPC SPCBC SSB SSD SSL SST SSV SSZ T5K TN5 ~G- 29G 9DU AAAKG AAQXK AATTM AAXKI AAYWO AAYXX ABJNI ABKBG ABUFD ABWVN ABXDB ACLOT ACNNM ACRPL ACVFH ADCNI ADJOM ADMUD ADNMO AEIPS AEUPX AFJKZ AFPUW AGQPQ AIGII AIIUN AKBMS AKRWK AKYEP ANKPU APXCP ASPBG AVWKF AZFZN CITATION EFKBS EJD FEDTE FGOYB G-2 HLZ HVGLF HZ~ R2- SBC SET SEW WUQ XPP ZMT ~HD 7SC 7T9 8FD JQ2 L7M L~C L~D |
| ID | FETCH-LOGICAL-c328t-f5d98d477241ea4c182464a9cbb1606070071e3fcaa3c2651f9df2f1e2a651273 |
| ISICitedReferencesCount | 13 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000533513600005&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0957-4174 |
| IngestDate | Sat Nov 08 21:02:31 EST 2025 Tue Nov 18 22:04:52 EST 2025 Sat Nov 29 07:11:28 EST 2025 Fri Feb 23 02:47:01 EST 2024 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Multi-objective evolutionary computation; Phoneme classification; Applied hyperheuristics; Speech recognition |
| Language | English |
| LinkModel | OpenURL |
| MergedId | FETCHMERGED-LOGICAL-c328t-f5d98d477241ea4c182464a9cbb1606070071e3fcaa3c2651f9df2f1e2a651273 |
| Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ORCID | 0000-0002-2771-1713 0000-0002-9858-1231 |
| PQID | 2444673566 |
| PQPubID | 2045477 |
| ParticipantIDs | proquest_journals_2444673566 crossref_primary_10_1016_j_eswa_2020_113402 crossref_citationtrail_10_1016_j_eswa_2020_113402 elsevier_sciencedirect_doi_10_1016_j_eswa_2020_113402 |
| PublicationCentury | 2000 |
| PublicationDate | 2020-09-01 2020-09-00 20200901 |
| PublicationDateYYYYMMDD | 2020-09-01 |
| PublicationDate_xml | – month: 09 year: 2020 text: 2020-09-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | Expert systems with applications |
| PublicationYear | 2020 |
| Publisher | Elsevier Ltd Elsevier BV |
| Publisher_xml | – name: Elsevier Ltd – name: Elsevier BV |
perceptrons and the theory of brain mechanisms – ident: 10.1016/j.eswa.2020.113402_bib0005 – volume: 139 start-page: 112840 year: 2020 ident: 10.1016/j.eswa.2020.113402_bib0062 article-title: Adaptive windows multiple deep residual networks for speech recognition publication-title: Expert Systems with Applications doi: 10.1016/j.eswa.2019.112840 – start-page: 65 year: 2009 ident: 10.1016/j.eswa.2020.113402_bib0050 article-title: e-empowerment of young adults with special needs behind the computer screen i am not disable – year: 2013 ident: 10.1016/j.eswa.2020.113402_bib0001 article-title: Phoneme classification in high-dimensional linear feature domains publication-title: Computing Research Repository – start-page: 186 year: 2007 ident: 10.1016/j.eswa.2020.113402_bib0059 article-title: An animated pedagogical agent as a learning management system manipulating intelligent learning objects – volume: 10 start-page: 99 issue: 2 year: 2002 ident: 10.1016/j.eswa.2020.113402_bib0051 article-title: Evolving neural networks through augmenting topologies publication-title: Evolutionary Computation doi: 10.1162/106365602320169811 – volume: 117 start-page: 180 year: 2018 ident: 10.1016/j.eswa.2020.113402_bib0033 article-title: Evodeep: A new evolutionary approach for automatic deep neural networks parametrisation publication-title: Journal of Parallel and Distributed Computing doi: 10.1016/j.jpdc.2017.09.006 – start-page: 134 year: 2005 ident: 10.1016/j.eswa.2020.113402_bib0057 article-title: iaperas - intelligent athlete’s personal assistant – ident: 10.1016/j.eswa.2020.113402_bib0002 doi: 10.1007/s10710-018-9339-y – year: 2001 ident: 10.1016/j.eswa.2020.113402_bib0035 article-title: Cross-validation for detecting and preventing overfitting – ident: 10.1016/j.eswa.2020.113402_bib0036 – start-page: 53 issue: 4 year: 2002 ident: 10.1016/j.eswa.2020.113402_bib0038 article-title: Animated pedagogical agent in the intelligent virtual teaching environment publication-title: 
Interactive Educational Multimedia – volume: 96 issue: 5 year: 1996 ident: 10.1016/j.eswa.2020.113402_bib0044 article-title: Prediction of foreign-accented speech intelligibility from segmental contrast measures publication-title: Journal of the Acoustical Society of America – volume: 7 start-page: 462 issue: 5 year: 2000 ident: 10.1016/j.eswa.2020.113402_bib0011 article-title: Comparative evaluation of three continuous speech recognition software packages in the generation of medical reports publication-title: Journal of the American Medical Informatics Association doi: 10.1136/jamia.2000.0070462 – start-page: 393 year: 1990 ident: 10.1016/j.eswa.2020.113402_bib0058 article-title: Phoneme recognition using time-delay neural networks – volume: 521 start-page: 436 issue: 7553 year: 2015 ident: 10.1016/j.eswa.2020.113402_bib0006 article-title: Deep learning publication-title: Nature doi: 10.1038/nature14539 – year: 2007 ident: 10.1016/j.eswa.2020.113402_bib0054 article-title: Large-scale random forest language models for speech recognition – ident: 10.1016/j.eswa.2020.113402_bib0029 – start-page: 362 year: 2019 ident: 10.1016/j.eswa.2020.113402_bib0008 article-title: Phoneme aware speech recognition through evolutionary optimisation |
| SourceID | proquest; crossref; elsevier |
| SourceType | Aggregation Database; Enrichment Source; Index Database; Publisher |
| StartPage | 113402 |
| SubjectTerms | Accuracy; Acknowledgment; Acoustics; Algorithms; Applied hyperheuristics; Artificial neural networks; Classification; Diphthongs; Evolutionary algorithms; Markov analysis; Markov chains; Mathematical models; Multi-objective evolutionary computation; Multiple objective analysis; Neural networks; Objectives; Phoneme classification; Phonemes; Phonetics; Resources; Robotics; Robots; Speech; Speech recognition; Statistics; Topology optimization; Voice recognition; Vowels; Words |
| Title | Optimisation of phonetic aware speech recognition through multi-objective evolutionary algorithms |
| URI | https://dx.doi.org/10.1016/j.eswa.2020.113402 https://www.proquest.com/docview/2444673566 |
| Volume | 153 |