Speech emotion recognition by using complex MFCC and deep sequential model

Bibliographic Details
Published in: Multimedia Tools and Applications, Vol. 82, No. 8, pp. 11897–11922 (26 pages)
Main Author: Patnaik, Suprava (School of Electronics, Kalinga Institute of Industrial Technology; ORCID: 0000-0002-7068-5960)
Format: Journal Article
Language: English
Published: New York: Springer US (Springer Nature B.V.), 01.03.2023
Subjects: Speech emotion; MFCC; Emotion circumplex; 1-D CNN
ISSN: 1380-7501 (print); 1573-7721 (electronic)
DOI: 10.1007/s11042-022-13725-y
Copyright: The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022
Abstract: Speech Emotion Recognition (SER) is one of the front-line research areas. For a machine, inferring emotion is difficult because emotions are subjective and annotation is challenging. Nevertheless, researchers believe SER is feasible because speech is quasi-stationary and emotions are expressed as a finite set of declarative states. This paper addresses emotion classification using Complex Mel Frequency Cepstral Coefficients (c-MFCC) as the representative feature and a deep sequential model as the classifier. The experimental setup is speaker-independent and accommodates marginal variation in the underlying phonemes. Testing has been carried out on the RAVDESS and TESS databases. Conceptually, the proposed model is oriented toward the observance of prosody. The main contributions of this work are twofold: first, introducing the concept of c-MFCC and investigating it as a robust cue of emotion, thereby achieving a significant improvement in accuracy; second, establishing a correlation between MFCC-based accuracy and Russell's emotional circumplex pattern. According to Russell's 2D circumplex model, emotional signals are combinations of several psychological dimensions, even though they are perceived as discrete categories. The reported results are obtained from a deep sequential LSTM model. The proposed c-MFCC are found to be more robust to signal framing and more informative in terms of spectral roll-off, and are therefore put forward as the input to the classifier. For the RAVDESS database the best accuracy achieved is 78.8% for fourteen classes, which improves to 91.6% for the gender-integrated eight classes and 98.5% for the six affect-separated classes. Although the RAVDESS dataset contains two analogous sentences, the reported results are for the complete dataset, without any phonetic separation of the samples; the proposed method thus appears to be semi-commutative with respect to phonemes. Results are presented and discussed in the form of confusion matrices.
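The abstract treats c-MFCC as the central feature but does not define it; one plausible reading is a standard MFCC pipeline driven by the complex (magnitude-and-phase) STFT rather than the magnitude spectrum alone. The sketch below illustrates that assumption only: the function name `complex_mfcc`, the frame parameters, and the complex-logarithm step are illustrative guesses, not the paper's verified definition.

```python
import numpy as np
import librosa
from scipy.fftpack import dct

def complex_mfcc(path, n_mfcc=13, n_mels=40, n_fft=512, hop=256):
    """Hypothetical c-MFCC: an MFCC-style pipeline driven by the complex
    STFT (magnitude and phase) instead of the magnitude alone."""
    y, sr = librosa.load(path, sr=None)
    # The complex STFT keeps the phase information that ordinary MFCC discards.
    S = librosa.stft(y, n_fft=n_fft, hop_length=hop)            # complex, (freq, frames)
    mel_fb = librosa.filters.mel(sr=sr, n_fft=n_fft, n_mels=n_mels)
    # Apply the Mel filterbank to the complex spectrum, then take the
    # principal complex logarithm before the DCT (complex-cepstrum-style step).
    mel_spec = mel_fb @ S                                       # complex Mel spectrum
    log_mel = np.log(np.abs(mel_spec) + 1e-10) + 1j * np.angle(mel_spec)
    ceps = dct(log_mel.real, axis=0, norm='ortho') \
         + 1j * dct(log_mel.imag, axis=0, norm='ortho')
    # Stack the real and imaginary cepstra as the per-frame feature vector.
    return np.vstack([ceps.real[:n_mfcc], ceps.imag[:n_mfcc]]).T  # (frames, 2*n_mfcc)
```

Because the phase channel is retained, such a feature would change less abruptly when the frame boundaries shift, which is consistent with the abstract's claim that c-MFCC are more robust to signal framing.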
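The classifier is described only as a deep sequential LSTM model; the layer widths, dropout rate, and optimizer below are illustrative placeholders rather than the paper's reported architecture. A minimal Keras sketch for the gender-integrated eight-class RAVDESS setting might look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_ser_lstm(n_frames, n_features, n_classes=8):
    """Deep sequential (stacked) LSTM over per-frame c-MFCC vectors.
    All hyperparameters here are illustrative assumptions."""
    model = models.Sequential([
        layers.Input(shape=(n_frames, n_features)),
        layers.LSTM(128, return_sequences=True),   # frame-level temporal modeling
        layers.LSTM(64),                           # utterance-level summary state
        layers.Dropout(0.3),
        layers.Dense(32, activation='relu'),
        layers.Dense(n_classes, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Example: utterances padded/truncated to 200 frames of 26-D c-MFCC features.
model = build_ser_lstm(n_frames=200, n_features=26, n_classes=8)
model.summary()
```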
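Russell's circumplex places discrete emotion labels on a two-dimensional valence-arousal plane, so classes that sit close together on the plane are the ones a classifier is most likely to confuse, which is how the abstract relates recognition accuracy to the circumplex pattern. The coordinates below are approximate, illustrative placements for the eight RAVDESS emotion categories, not values taken from the paper:

```python
# Approximate (valence, arousal) placements on Russell's circumplex,
# each axis scaled to [-1, 1]; values are illustrative, not from the paper.
CIRCUMPLEX = {
    'neutral':   ( 0.0,  0.0),
    'calm':      ( 0.4, -0.6),
    'happy':     ( 0.8,  0.5),
    'sad':       (-0.7, -0.4),
    'angry':     (-0.6,  0.8),
    'fearful':   (-0.7,  0.6),
    'disgust':   (-0.6,  0.2),
    'surprised': ( 0.3,  0.9),
}

def circumplex_distance(a, b):
    """Euclidean distance between two emotions on the circumplex plane;
    a small distance flags a class pair likely to appear confused in
    the confusion matrix."""
    (v1, a1), (v2, a2) = CIRCUMPLEX[a], CIRCUMPLEX[b]
    return ((v1 - v2) ** 2 + (a1 - a2) ** 2) ** 0.5

print(circumplex_distance('angry', 'fearful'))  # nearby: often confused
print(circumplex_distance('happy', 'sad'))      # far apart: rarely confused
```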