Speech emotion recognition by using complex MFCC and deep sequential model
| Published in: | Multimedia tools and applications, Vol. 82, no. 8, pp. 11897 - 11922 |
|---|---|
| Main Author: | Patnaik, Suprava (ORCID: 0000-0002-7068-5960) |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Springer US; Springer Nature B.V, 01.03.2023 |
| Subjects: | Speech emotion; 1-D CNN; Emotion circumplex; MFCC |
| ISSN: | 1380-7501 (print); 1573-7721 (online) |
| DOI: | 10.1007/s11042-022-13725-y |
| Copyright: | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 |
| Online Access: | https://doi.org/10.1007/s11042-022-13725-y |
| Abstract | Speech Emotion Recognition (SER) is one of the front-line research areas. For a machine, inferring emotion from speech is difficult because emotions are subjective and annotation is challenging. Nevertheless, researchers believe SER is feasible because speech is quasi-stationary and emotions are declarative finite states. This paper concerns emotion classification using Complex Mel Frequency Cepstral Coefficients (c-MFCC) as the representative feature and a deep sequential model as the classifier. The experimental setup is speaker-independent and accommodates marginal variations in the underlying phonemes. Testing was carried out on the RAVDESS and TESS databases. Conceptually, the proposed model is attentive to prosodic cues. The main contributions of this work are twofold: first, introducing the concept of c-MFCC and investigating it as a robust cue of emotion, thereby leading to a significant improvement in accuracy; second, establishing a correlation between MFCC-based accuracy and Russell's emotional circumplex pattern. According to Russell's 2D circumplex model, emotional signals are combinations of several psychological dimensions, even though they are perceived as discrete categories. Results are obtained from a deep sequential LSTM model. The proposed c-MFCC are found to be more robust to signal framing and more informative in terms of spectral roll-off, and are therefore put forward as the input to the classifier. For the RAVDESS database, the best accuracy achieved is 78.8% for fourteen classes, improving to 91.6% for eight gender-integrated classes and 98.5% for six affect-separated classes. Although the RAVDESS dataset contains two lexically analogous sentences, the reported results are for the complete dataset, without any phonetic separation of the samples; thus the proposed method appears to be semi-commutative with respect to phonemes. Results are presented and discussed in the form of confusion matrices. |
|---|---|
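The record does not include the paper's implementation details, so the sketch below is only a plausible reading of the abstract: it treats "complex MFCC" as mel-cepstra computed from both the STFT magnitude and the unwrapped phase spectrum, and feeds the resulting frame sequence to a small deep sequential LSTM classifier. The function name `c_mfcc`, the filterbank and cepstrum sizes, the phase-branch construction, and the layer widths are all illustrative assumptions, not the published method.

```python
# Minimal sketch of a "complex MFCC" front end and a deep sequential
# LSTM classifier, assembled only from the abstract's description.
# ASSUMPTION: c-MFCC here = mel-cepstra of the STFT magnitude
# concatenated with mel-cepstra of the unwrapped phase; the paper's
# exact formulation is not given in this record.
import numpy as np
import librosa
from scipy.fftpack import dct
import tensorflow as tf

def c_mfcc(y, sr, n_fft=1024, hop=256, n_mels=40, n_coeff=13):
    """Return an array of shape (frames, 2 * n_coeff)."""
    stft = librosa.stft(y, n_fft=n_fft, hop_length=hop)
    mel_fb = librosa.filters.mel(sr=sr, n_fft=n_fft, n_mels=n_mels)
    # Magnitude branch: log mel power spectrum + DCT (standard MFCC).
    log_mel = np.log(mel_fb @ (np.abs(stft) ** 2) + 1e-10)
    mag_cep = dct(log_mel, type=2, axis=0, norm="ortho")[:n_coeff]
    # Phase branch: unwrap along frequency, pool through the same mel
    # filterbank, then DCT to obtain a cepstral-style representation.
    phase = np.unwrap(np.angle(stft), axis=0)
    ph_cep = dct(mel_fb @ phase, type=2, axis=0, norm="ortho")[:n_coeff]
    return np.vstack([mag_cep, ph_cep]).T  # time-major for the LSTM

# Deep sequential classifier; eight outputs matches the abstract's
# gender-integrated RAVDESS configuration.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, 26)),   # variable-length input
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(8, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

On this reading, a clip would be loaded with `librosa.load`, passed through `c_mfcc`, and batched (with padding or masking) before `model.fit`; other cues mentioned in the abstract, such as spectral roll-off, are not modelled here.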