Multimodal Emotion Detection via Attention-Based Fusion of Extracted Facial and Speech Features

Detailed Bibliography
Published in: Sensors (Basel, Switzerland), Volume 23, Issue 12, p. 5475
Main Authors: Mamieva, Dilnoza; Abdusalomov, Akmalbek Bobomirzaevich; Kutlimuratov, Alpamis; Muminov, Bahodir; Whangbo, Taeg Keun
Format: Journal Article
Language: English
Published: Basel, Switzerland: MDPI AG, 9 June 2023
ISSN: 1424-8220
Online Access: Get full text
Abstract Emotion detection methods that use several modalities simultaneously have been found to be more accurate and robust than those relying on a single modality. This is because emotions are conveyed through many channels, each offering a different and complementary window into the speaker's thoughts and feelings; fusing and analyzing data from several modalities therefore yields a more complete picture of a person's emotional state. This research proposes a new attention-based approach to multimodal emotion recognition. The technique integrates facial and speech features extracted by independent encoders and uses attention to select the most informative aspects of each. It improves accuracy by processing speech and facial features at multiple scales and focusing on the most useful parts of the input. Both low- and high-level facial features are used to obtain a more comprehensive representation of facial expressions. The modalities are combined by a fusion network into a single multimodal feature vector, which is then fed to a classification layer for emotion recognition. The system is evaluated on two datasets, IEMOCAP and CMU-MOSEI, and outperforms existing models, achieving a weighted accuracy (WA) of 74.6% and an F1 score of 66.1% on IEMOCAP, and a WA of 80.7% and an F1 score of 73.7% on CMU-MOSEI.
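The abstract states that both low- and high-level facial features are combined into a more comprehensive facial representation. The sketch below is a minimal PyTorch illustration of that idea, not the authors' implementation; the layer widths, the 128-dimensional output, and the use of global average pooling are assumptions made for the example.

```python
# Hypothetical sketch: combining low- and high-level facial features from a
# small CNN backbone, as the abstract describes. Layer sizes and the 128-d
# projection are illustrative choices, not the paper's actual configuration.
import torch
import torch.nn as nn

class FacialEncoder(nn.Module):
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        # Early convolutions capture low-level cues (edges, local texture).
        self.low = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Deeper convolutions capture high-level, more semantic structure.
        self.high = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Project the concatenated low- and high-level descriptors to feat_dim.
        self.proj = nn.Linear(32 + 128, feat_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        low = self.low(x)                       # (B, 32, H/2, W/2)
        high = self.high(low)                   # (B, 128, H/4, W/4)
        low_vec = self.pool(low).flatten(1)     # (B, 32)  low-level summary
        high_vec = self.pool(high).flatten(1)   # (B, 128) high-level summary
        return self.proj(torch.cat([low_vec, high_vec], dim=1))  # (B, feat_dim)

# Example: a batch of four 64x64 RGB face crops -> 128-d facial embeddings.
faces = torch.randn(4, 3, 64, 64)
print(FacialEncoder()(faces).shape)  # torch.Size([4, 128])
```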
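The core of the described pipeline is an attention mechanism that weighs the facial and speech embeddings produced by the independent encoders, a fusion network that merges them into a single multimodal feature vector, and a classification layer. The following is a minimal sketch of one such attention-weighted fusion in PyTorch; the feature dimensions, the tanh-scored attention, and the four emotion classes are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of attention-based fusion of pre-extracted facial and
# speech embeddings, followed by a fusion network and a classification layer.
# Dimensions, the attention form, and the four-class output are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusionClassifier(nn.Module):
    def __init__(self, face_dim=128, speech_dim=96, hidden=128, num_classes=4):
        super().__init__()
        # Map both modalities into a shared space so attention can compare them.
        self.face_proj = nn.Linear(face_dim, hidden)
        self.speech_proj = nn.Linear(speech_dim, hidden)
        # Scores one weight per modality; softmax favors the more informative one.
        self.attn = nn.Linear(hidden, 1)
        # Fusion network over the attention-weighted multimodal feature vector.
        self.fusion = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, face_feat, speech_feat):
        # (B, 2, hidden): one row per modality.
        modalities = torch.stack(
            [self.face_proj(face_feat), self.speech_proj(speech_feat)], dim=1)
        weights = F.softmax(self.attn(torch.tanh(modalities)), dim=1)  # (B, 2, 1)
        fused = (weights * modalities).sum(dim=1)   # attention-weighted sum
        return self.classifier(self.fusion(fused))  # emotion logits

# Example: embeddings from the independent facial and speech encoders.
face = torch.randn(4, 128)
speech = torch.randn(4, 96)
logits = AttentionFusionClassifier()(face, speech)
print(logits.shape)  # torch.Size([4, 4])
```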
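The reported results are given as weighted accuracy (WA) and F1 score on IEMOCAP and CMU-MOSEI. As a reference point, the snippet below computes these metrics with scikit-learn, assuming WA is the overall sample-level accuracy and F1 is the class-weighted average; the paper's exact averaging conventions may differ.

```python
# Hypothetical sketch of the evaluation metrics quoted in the abstract.
# WA is taken here as overall accuracy across all samples and F1 as the
# class-weighted average; both are assumptions about the paper's conventions.
from sklearn.metrics import accuracy_score, f1_score

def evaluate(y_true, y_pred):
    wa = accuracy_score(y_true, y_pred)                # weighted accuracy (WA)
    f1 = f1_score(y_true, y_pred, average="weighted")  # class-weighted F1
    return wa, f1

# Example with four emotion classes (0=angry, 1=happy, 2=sad, 3=neutral).
y_true = [0, 1, 2, 3, 1, 2, 0, 3]
y_pred = [0, 1, 2, 1, 1, 2, 0, 3]
wa, f1 = evaluate(y_true, y_pred)
print(f"WA = {wa:.1%}, F1 = {f1:.1%}")
```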
Audience Academic
Author Mamieva, Dilnoza
Abdusalomov, Akmalbek Bobomirzaevich (ORCID 0000-0001-5923-8695)
Kutlimuratov, Alpamis
Muminov, Bahodir
Whangbo, Taeg Keun
AuthorAffiliation 1 Department of Computer Engineering, Gachon University, Seongnam-si 13120, Republic of Korea; mamiyeva.dilnoza@gmail.com (D.M.)
2 Department of AI. Software, Gachon University, Seongnam-si 13120, Republic of Korea
3 Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
ContentType Journal Article
Copyright COPYRIGHT 2023 MDPI AG
2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
2023 by the authors.
DOI 10.3390/s23125475
Discipline Engineering
EISSN 1424-8220
ExternalDocumentID oai_doaj_org_article_792ca7395dd74d6eaceb1f686b05e548
PMC10304130
A758482887
37420642
10_3390_s23125475
Genre Journal Article
GrantInformation GRRC program of Gyeonggi province, grant GRRC-Gachon2021(B03)
ISICitedReferencesCount 39
ISSN 1424-8220
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 12
Keywords speech feature
CNN
facial feature
attention mechanism
multimodal emotion recognition
Language English
License Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
ORCID 0000-0001-5923-8695
OpenAccessLink https://doaj.org/article/792ca7395dd74d6eaceb1f686b05e548
PMID 37420642
PQID 2829878107
PQPubID 2032333
PublicationDate 2023-06-09
PublicationPlace Basel, Switzerland
PublicationTitle Sensors (Basel, Switzerland)
PublicationTitleAlternate Sensors (Basel)
PublicationYear 2023
Publisher MDPI AG
MDPI
StartPage 5475
SubjectTerms Accuracy
attention mechanism
CNN
Datasets
Deep learning
Emotions
Facial Expression
facial feature
Humans
Identification systems
multimodal emotion recognition
Neural networks
Recognition, Psychology
Signal processing
Speech
speech feature
Title Multimodal Emotion Detection via Attention-Based Fusion of Extracted Facial and Speech Features
URI https://www.ncbi.nlm.nih.gov/pubmed/37420642
https://www.proquest.com/docview/2829878107
https://www.proquest.com/docview/2835278625
https://pubmed.ncbi.nlm.nih.gov/PMC10304130
https://doaj.org/article/792ca7395dd74d6eaceb1f686b05e548
Volume 23