Cross-modal emotion hashing network: Efficient binary coding for large-scale multimodal emotion retrieval


Published in: Knowledge-based systems, Volume 331, article 114771
Main authors: Li, Chenhao; Huang, Wenti; Yang, Zhan; Long, Jun
Format: Journal Article
Language: English
Publication details: Elsevier B.V., 03.12.2025
ISSN: 0950-7051
Online access: Get full text
Abstract
•Reframe multimodal emotion recognition as affect-oriented binary hashing with the Cross-Modal Emotion Hashing Network (CEHN).
•Employ a three-hop gated cross-modal attention hierarchy to fuse audio, visual, and textual cues efficiently.
•Use polarity-aware contrastive distillation to produce balanced 32–128-bit codes whose Hamming distances preserve emotion similarity.
•On MELD and CMU-MOSI, a 64-bit CEHN variant matches dense-vector baselines while shrinking memory up to 256× and cutting CPU latency to <3 ms.
•Enable billion-scale, privacy-preserving retrieval on commodity hardware.
Multimodal emotion recognition (MER) remains computationally heavy because speech, vision, and language features are high-dimensional. We reframe MER as affect-oriented retrieval and propose a Cross-Modal Emotion Hashing Network (CEHN), which encodes each utterance into compact binary codes that preserve fine-grained emotional semantics. CEHN couples pre-trained acoustic, visual, and textual encoders with gated cross-modal attention, and then applies a polarity-aware distillation loss that aligns continuous affect vectors with a temperature-controlled sign layer. The result is 32–128-bit hash codes whose Hamming distances approximate emotion similarity. On MELD and CMU-MOSI, CEHN matches or surpasses dense state-of-the-art models, improving F1 by up to 1.9 points while shrinking representation size by up to 16× and reducing CPU inference to <2 ms per utterance. Latency-matched quantization baselines trail by as much as 5 mAP points. Ablations confirm the critical roles of gated attention and polarity-aware distillation. By uniting efficient hashing with modality-aware fusion, CEHN enables real-time, privacy-preserving emotion understanding for dialogue systems, recommendation engines, and safety-critical moderation on resource-constrained devices.
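The retrieval described in the abstract reduces to comparing binary codes by Hamming distance (XOR then popcount), which is why a single 64-bit integer can replace a multi-kilobyte float vector. A minimal illustrative sketch with toy codes, not the paper's implementation:

```python
# Hamming-distance retrieval over 64-bit binary codes, as produced by
# hashing models like CEHN. The codes below are hypothetical toy data.

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two codes: popcount of XOR."""
    return bin(a ^ b).count("1")

def nearest(query: int, database: list[int]) -> int:
    """Index of the database code closest to the query in Hamming space."""
    return min(range(len(database)), key=lambda i: hamming(query, database[i]))

db = [0x0F0F0F0F0F0F0F0F,   # differs from query in 32 bits
      0xFFFFFFFFFFFFFFFE,   # differs from query in 1 bit
      0xAAAAAAAAAAAAAAAA]   # differs from query in 32 bits
query = 0xFFFFFFFFFFFFFFFF
print(nearest(query, db))  # → 1
```

Each comparison is one XOR and one popcount per 64-bit word, which is what makes CPU latencies in the low milliseconds and billion-scale search (as in the FAISS work the record cites) feasible.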
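The abstract's "temperature-controlled sign layer" is not specified in this record; one common realization, in the spirit of the HashNet-style continuation the article cites, relaxes sign(x) to tanh(βx) and anneals the temperature β so the relaxation hardens toward binary outputs during training. A sketch under that assumption:

```python
import math

# Temperature-controlled relaxation of sign(x), an assumed reading of the
# abstract's "temperature-controlled sign layer" (not the paper's code).

def soft_sign(x: float, beta: float) -> float:
    """tanh(beta * x): smooth and differentiable for small beta,
    approaching the hard sign(x) in {-1, +1} as beta grows."""
    return math.tanh(beta * x)

x = 0.3
for beta in (1.0, 10.0, 100.0):
    print(f"beta={beta:6.1f}  soft_sign={soft_sign(x, beta):+.4f}")
# As beta increases, the output saturates toward +1 = sign(0.3), so the
# learned continuous affect vector converges to a binary hash bit.
```

Annealing β lets gradients flow early in training while ensuring the final codes are effectively binary, which is the usual workaround for the non-differentiability of sign.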
ArticleNumber 114771
Author Huang, Wenti
Yang, Zhan
Long, Jun
Li, Chenhao
Author_xml – sequence: 1
  givenname: Chenhao
  orcidid: 0009-0005-8903-0955
  surname: Li
  fullname: Li, Chenhao
  email: li0991xju@163.com
  organization: Computer Science and Technology, Xinjiang University, Urumqi, 830000, China
– sequence: 2
  givenname: Wenti
  surname: Huang
  fullname: Huang, Wenti
  organization: School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411100, China
– sequence: 3
  givenname: Zhan
  orcidid: 0000-0002-6336-0228
  surname: Yang
  fullname: Yang, Zhan
  organization: Big Data Institute, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
– sequence: 4
  givenname: Jun
  orcidid: 0000-0003-0163-0007
  surname: Long
  fullname: Long, Jun
  email: junlong@hnu.edu.cn
  organization: Big Data Institute, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
BookMark eNp9kL1OwzAUhT0UiRZ4Awa_QILtOHHCgISq8iNVYoHZcuzr1m1iIzsU9e1JFBYWpjuce47O-VZo4YMHhG4pySmh1d0hP_qQzilnhJU5pVwIukBL0pQkE6Skl2iV0oEQwhitl8itY0gp64NRHYY-DC54vFdp7_wOexi-Qzze4421TjvwA26dV_GMdTDTgw0RdyruIEtadYD7r25wf7MiDNHBSXXX6MKqLsHN771CH0-b9_VLtn17fl0_bjNNBRsyS1nFua6YMm3DTWVJWzPBFajGaLDCUmIVAC_Kmo5SXZKisJUirGmZKI0qrhCfc_W0LIKVn9H1Y2lJiZwQyYOcEckJkZwRjbaH2QZjt5ODKNO0WINxEfQgTXD_B_wA1Gh4UQ
Cites_doi 10.1016/j.patcog.2023.110079
10.1109/TPAMI.2018.2798607
10.1609/aaai.v39i2.32135
10.1007/s11042-024-19371-w
10.1109/TBDATA.2019.2921572
10.1109/JPROC.2017.2761740
10.1016/j.inffus.2017.02.003
10.1145/3610661.3616190
10.1037/h0077714
10.1016/j.knosys.2024.112599
10.1016/j.bspc.2025.108231
10.1145/3632527
10.1109/TPAMI.2010.57
10.1109/TAFFC.2024.3394873
ContentType Journal Article
Copyright 2025 Elsevier B.V.
Copyright_xml – notice: 2025 Elsevier B.V.
DBID AAYXX
CITATION
DOI 10.1016/j.knosys.2025.114771
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
ExternalDocumentID 10_1016_j_knosys_2025_114771
S095070512501809X
GroupedDBID --K
--M
.DC
.~1
0R~
1B1
1~.
1~5
4.4
457
4G.
5VS
7-5
71M
77I
77K
8P~
9JN
AAEDT
AAEDW
AAIKJ
AAKOC
AALRI
AAOAW
AAQFI
AATTM
AAXKI
AAXUO
AAYFN
AAYWO
ABAOU
ABBOA
ABIVO
ABJNI
ABMAC
ACDAQ
ACGFS
ACLOT
ACRLP
ACVFH
ACZNC
ADBBV
ADCNI
ADEZE
ADGUI
ADTZH
AEBSH
AECPX
AEIPS
AEKER
AENEX
AEUPX
AFJKZ
AFPUW
AFTJW
AGHFR
AGUBO
AGYEJ
AHHHB
AHJVU
AHZHX
AIALX
AIEXJ
AIGII
AIIUN
AIKHN
AITUG
AKBMS
AKRWK
AKYEP
ALMA_UNASSIGNED_HOLDINGS
AMRAJ
ANKPU
AOUOD
APXCP
ARUGR
AXJTR
BJAXD
BKOJK
BLXMC
CS3
DU5
EBS
EFJIC
EFKBS
EFLBG
EO8
EO9
EP2
EP3
FDB
FIRID
FNPLU
FYGXN
G-Q
GBLVA
GBOLZ
IHE
J1W
JJJVA
KOM
MHUIS
MO0
N9A
O-L
O9-
OAUVE
OZT
P-8
P-9
P2P
PC.
PQQKQ
Q38
ROL
RPZ
SDF
SDG
SDP
SES
SEW
SPC
SPCBC
SST
SSV
SSW
SSZ
T5K
WH7
XPP
ZMT
~02
~G-
~HD
29L
9DU
AAQXK
AAYXX
ABDPE
ABWVN
ABXDB
ACNNM
ACRPL
ADJOM
ADMUD
ADNMO
AGQPQ
ASPBG
AVWKF
AZFZN
CITATION
EJD
FEDTE
FGOYB
G-2
HLZ
HVGLF
HZ~
LG9
LY7
M41
R2-
SBC
SET
UHS
WUQ
ID FETCH-LOGICAL-c172t-f12644c62adb94d6f0b8274aea9dcef7f10faee435810b885033f6a029b275da3
ISSN 0950-7051
IngestDate Sat Nov 29 06:54:20 EST 2025
Wed Dec 10 14:26:13 EST 2025
IsPeerReviewed true
IsScholarly true
Keywords Multimodal emotion recognition
Binary hashing
Cross-modal attention
Polarity-aware contrastive distillation
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c172t-f12644c62adb94d6f0b8274aea9dcef7f10faee435810b885033f6a029b275da3
ORCID 0009-0005-8903-0955
0000-0002-6336-0228
0000-0003-0163-0007
ParticipantIDs crossref_primary_10_1016_j_knosys_2025_114771
elsevier_sciencedirect_doi_10_1016_j_knosys_2025_114771
PublicationCentury 2000
PublicationDate 2025-12-03
PublicationDateYYYYMMDD 2025-12-03
PublicationDate_xml – month: 12
  year: 2025
  text: 2025-12-03
  day: 03
PublicationDecade 2020
PublicationTitle Knowledge-based systems
PublicationYear 2025
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
References Russell (bib0017) 1980; 39
Zhang, Lin, Jiang, Chen (bib0044) 2024; 83
Wang, Zhang, Zhang, Niu, Zhang, Sankaranarayana, Caldwell, Gedeon (bib0027) 2025
Zhang, Gu (bib0019) 2020
Shen, Huang, Lan, Zheng (bib0022) 2024
Deng, Yan (bib0040) 2019; 88
Ge, He, Ke, Sun (bib0047) 2013
Yang, Wang, Yi, Zhu, Zadeh, Poria, Morency (bib0005) 2020
Liu, Wang, Jiang, An, Gu, Li, Zhang (bib0029) 2024; 305
M. Shamsi, M. Tahon, Training Speech Emotion Classifier Without Categorical Annotations, arXiv
Zhang, Zhou, Feng, Lai, Li, Pan, Yin, Yan (bib0016) 2018
(2015).
Johnson, Douze, Jégou (bib0046) 2021; 7
Cao, Long, Wang, Huang, Yu (bib0015) 2017
Liu, Wang, Shan, Chen (bib0013) 2016
Cao, Chao, Liu (bib0043) 2025; 225
Poria, Cambria, Hussain, Huang (bib0003) 2017; 37
Cao, Long, Wang, Yu (bib0014) 2016
Poria, Hazarika, Majumder, Naik, Cambria, Mihalcea (bib0006) 2019
Song, Su, Huang, Yang (bib0038) 2024; 147
Sze, Chen, Yang, Emer (bib0008) 2017; 105
Kollias, Zafeiriou (bib0018) 2019
Baltrušaitis, Ahuja, Morency (bib0002) 2019; 41
A. Ispas, T. Deschamps-Berger, L. Devillers, A Multi-Task, Multi-Modal Approach for Predicting Categorical and Dimensional Emotions, arXiv
Johnson, Douze, Jégou (bib0035) 2019; 7
(2024).
Zaken, Ravfogel, Goldberg (bib0034) 2022
(2022).
Jégou, Douze, Schmid (bib0012) 2011; 33
Li, Zuo, Mei, Zhong, Meng (bib0036) 2016
Chen, Ren, Ma, Wang (bib0048) 2025
Picard (bib0001) 1997
G. Hinton, O. Vinyals, J. Dean, Distilling the Knowledge in a Neural Network, arXiv
Sun, Li (bib0041) 2024
Ding, Zhang, Li, Wu (bib0030) 2023
Tsai, Bai, Liang, Kolter, Morency, Salakhutdinov (bib0004) 2019
Zhang, Ma (bib0037) 2018
Li, Chen, Wang (bib0049) 2025; 25
Li, Zhang, Liu (bib0042) 2023; 17
Han, Liu, Wei, Zhou, Xu, Long (bib0039) 2024; 20
Jacob, Kligys, Chen, Zhu, Tang, Howard, Adam, Kalenichenko (bib0011) 2018
Song, Gu, Zhao, Dong (bib0025) 2021
Li, Chen, Zhang, Xu (bib0031) 2024
Johnson, Douze, Jégou (bib0007) 2021; 7
Shi, Tao, Jin, Yang, Yuan, Wang (bib0033) 2023; 202
Yan, Guo, Xing, Xu (bib0024) 2024; 15
Radford, Kim, Hallacy, Ramesh, Goh, Agarwal, Sastry, Askell, Mishkin, Clark, Krueger, Sutskever (bib0045) 2021
Yun, Park, Lee, Kim (bib0032) 2024
Liu, Wang, Gu, An, Zhao, Li, Zhang (bib0028) 2025; 110
Zhang, Chen, Chen (bib0023) 2023
Han, Pool, Tran, Dally (bib0009) 2015
Lu, Chen, Liang, Tan, Zeng, Hu (bib0026) 2025; 39
Liu (10.1016/j.knosys.2025.114771_bib0028) 2025; 110
Zhang (10.1016/j.knosys.2025.114771_bib0019) 2020
Cao (10.1016/j.knosys.2025.114771_bib0014) 2016
Wang (10.1016/j.knosys.2025.114771_bib0027) 2025
Li (10.1016/j.knosys.2025.114771_bib0031) 2024
Shen (10.1016/j.knosys.2025.114771_bib0022) 2024
Johnson (10.1016/j.knosys.2025.114771_bib0046) 2021; 7
Shi (10.1016/j.knosys.2025.114771_bib0033) 2023; 202
Radford (10.1016/j.knosys.2025.114771_bib0045) 2021
Jégou (10.1016/j.knosys.2025.114771_bib0012) 2011; 33
Cao (10.1016/j.knosys.2025.114771_bib0043) 2025; 225
Poria (10.1016/j.knosys.2025.114771_bib0003) 2017; 37
Zhang (10.1016/j.knosys.2025.114771_bib0023) 2023
Liu (10.1016/j.knosys.2025.114771_bib0029) 2024; 305
10.1016/j.knosys.2025.114771_bib0010
Liu (10.1016/j.knosys.2025.114771_bib0013) 2016
Ding (10.1016/j.knosys.2025.114771_bib0030) 2023
Yun (10.1016/j.knosys.2025.114771_bib0032) 2024
Han (10.1016/j.knosys.2025.114771_bib0039) 2024; 20
Li (10.1016/j.knosys.2025.114771_bib0042) 2023; 17
Tsai (10.1016/j.knosys.2025.114771_bib0004) 2019
Song (10.1016/j.knosys.2025.114771_bib0038) 2024; 147
Deng (10.1016/j.knosys.2025.114771_bib0040) 2019; 88
Sun (10.1016/j.knosys.2025.114771_bib0041) 2024
Song (10.1016/j.knosys.2025.114771_bib0025) 2021
Yan (10.1016/j.knosys.2025.114771_bib0024) 2024; 15
Ge (10.1016/j.knosys.2025.114771_bib0047) 2013
10.1016/j.knosys.2025.114771_bib0020
Zhang (10.1016/j.knosys.2025.114771_bib0044) 2024; 83
Han (10.1016/j.knosys.2025.114771_bib0009) 2015
10.1016/j.knosys.2025.114771_bib0021
Zaken (10.1016/j.knosys.2025.114771_bib0034) 2022
Yang (10.1016/j.knosys.2025.114771_sbref0005) 2020
Jacob (10.1016/j.knosys.2025.114771_bib0011) 2018
Picard (10.1016/j.knosys.2025.114771_bib0001) 1997
Chen (10.1016/j.knosys.2025.114771_bib0048) 2025
Zhang (10.1016/j.knosys.2025.114771_sbref0015) 2018
Baltrušaitis (10.1016/j.knosys.2025.114771_bib0002) 2019; 41
Kollias (10.1016/j.knosys.2025.114771_bib0018) 2019
Poria (10.1016/j.knosys.2025.114771_bib0006) 2019
Zhang (10.1016/j.knosys.2025.114771_bib0037) 2018
Lu (10.1016/j.knosys.2025.114771_bib0026) 2025; 39
Johnson (10.1016/j.knosys.2025.114771_bib0007) 2021; 7
Sze (10.1016/j.knosys.2025.114771_bib0008) 2017; 105
Russell (10.1016/j.knosys.2025.114771_bib0017) 1980; 39
Li (10.1016/j.knosys.2025.114771_bib0049) 2025; 25
Cao (10.1016/j.knosys.2025.114771_bib0015) 2017
Johnson (10.1016/j.knosys.2025.114771_bib0035) 2019; 7
Li (10.1016/j.knosys.2025.114771_bib0036) 2016
References_xml – volume: 305
  year: 2024
  ident: bib0029
  article-title: MAS-DGAT-Net: a dynamic graph attention network with multibranch feature extraction and staged fusion for EEG emotion recognition
  publication-title: Knowl. Based Syst.
– volume: 225
  year: 2025
  ident: bib0043
  article-title: Region-focused CNN with dynamic adaptive graph attention for EEG-based emotion recognition
  publication-title: Expert Syst. Appl.
– volume: 7
  start-page: 535
  year: 2021
  end-page: 547
  ident: bib0046
  article-title: Billion-scale similarity search with GPUs
  publication-title: IEEE Trans. Big Data
– start-page: 2740
  year: 2016
  end-page: 2749
  ident: bib0036
  article-title: Deep pairwise-supervised hashing for scalable image retrieval
  publication-title: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
– start-page: 347
  year: 2020
  end-page: 351
  ident: bib0019
  article-title: MTANet: facial affect recognition in the wild using multi-task learning convolutional network
  publication-title: Proc. IEEE FG Challenge & Workshop on Affective Behavior Analysis in the Wild
– start-page: 4512
  year: 2024
  end-page: 4523
  ident: bib0032
  article-title: TelME: teacher-leading multimodal fusion network for emotion recognition in conversation
  publication-title: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)
– year: 2025
  ident: bib0048
  article-title: Semi-Supervised online cross-modal hashing
  publication-title: Proc. AAAI
– start-page: 1284
  year: 2024
  end-page: 1290
  ident: bib0031
  article-title: Contrastive transformer masked image hashing for efficient image retrieval
  publication-title: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI)
– volume: 17
  year: 2023
  ident: bib0042
  article-title: Dual attention mechanism graph convolutional neural network for EEG emotion recognition
  publication-title: Front. Neurosci.
– volume: 147
  year: 2024
  ident: bib0038
  article-title: Deep self-enhancement hashing for robust multi-label cross-modal retrieval
  publication-title: Pattern Recognit.
– volume: 41
  start-page: 423
  year: 2019
  end-page: 443
  ident: bib0002
  article-title: Multimodal machine learning: a survey and taxonomy
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– start-page: 2314
  year: 2023
  end-page: 2323
  ident: bib0030
  article-title: Multimodal sentiment analysis via efficient multimodal transformer
  publication-title: Proceedings of the ACM International Conference on Multimedia (ACM MM)
– year: 1997
  ident: bib0001
  article-title: Affective Computing
– volume: 105
  start-page: 2295
  year: 2017
  end-page: 2329
  ident: bib0008
  article-title: Efficient processing of deep neural networks: a tutorial and survey
  publication-title: Proc. IEEE
– start-page: 7395
  year: 2023
  end-page: 7408
  ident: bib0023
  article-title: DualGATs: dual graph attention networks for emotion recognition in conversations
  publication-title: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
– year: 2025
  ident: bib0027
  article-title: Visual and textual prompts in VLLMs for enhancing emotion recognition
  publication-title: IEEE Trans. Circuits Syst. Video Technol.
– volume: 33
  start-page: 117
  year: 2011
  end-page: 128
  ident: bib0012
  article-title: Product quantization for nearest neighbor search
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– reference: A. Ispas, T. Deschamps-Berger, L. Devillers, A Multi-Task, Multi-Modal Approach for Predicting Categorical and Dimensional Emotions, arXiv:
– start-page: 1135
  year: 2015
  end-page: 1143
  ident: bib0009
  article-title: Learning both weights and connections for efficient neural networks
  publication-title: Proc. NeurIPS
– volume: 39
  start-page: 1447
  year: 2025
  end-page: 1455
  ident: bib0026
  article-title: Understanding emotional body expressions via large language models
  publication-title: Proceedings of the AAAI Conference on Artificial Intelligence
– start-page: 8748
  year: 2021
  end-page: 8763
  ident: bib0045
  article-title: Learning transferable visual models from natural language supervision
  publication-title: Proceedings of the 38th International Conference on Machine Learning (ICML)
– volume: 25
  start-page: 4821
  year: 2025
  end-page: 4832
  ident: bib0049
  article-title: DSSTNet: emotion recognition using multi-scale EEG features through graph convolution
  publication-title: IEEE Sens J.
– start-page: 2064
  year: 2016
  end-page: 2072
  ident: bib0013
  article-title: Deep supervised hashing for fast image retrieval
  publication-title: Proc. CVPR
– volume: 39
  start-page: 1161
  year: 1980
  end-page: 1178
  ident: bib0017
  article-title: A circumplex model of affect
  publication-title: J. Pers. Soc. Psychol.
– start-page: 585
  year: 2021
  end-page: 595
  ident: bib0025
  article-title: Supervised prototypical contrastive learning for imbalanced emotion recognition
  publication-title: Proc. ACL
– start-page: 2704
  year: 2018
  end-page: 2713
  ident: bib0011
  article-title: Quantization and training of neural networks for efficient integer-arithmetic-only inference
  publication-title: Proc. CVPR
– reference: (2022).
– volume: 110
  year: 2025
  ident: bib0028
  article-title: Cross-subject emotion recognition by EEG driven spatio-temporal hybrid network based on domain adaptation and dynamic graph attention
  publication-title: Biomed. Signal Process. Control
– year: 2024
  ident: bib0041
  article-title: HCH: hierarchical contrastive hashing for emotion video retrieval
  publication-title: IJCAI
– start-page: 527
  year: 2019
  end-page: 536
  ident: bib0006
  article-title: MELD: a multimodal multi-party dataset for emotion recognition in conversations
  publication-title: Proc. ACL
– reference: G. Hinton, O. Vinyals, J. Dean, Distilling the Knowledge in a Neural Network, arXiv:
– volume: 202
  start-page: 31292
  year: 2023
  end-page: 31311
  ident: bib0033
  article-title: UPop: unified and progressive pruning for compressing vision-language transformers
  publication-title: Proceedings of the 40Th International Conference on Machine Learning
– start-page: 1
  year: 2022
  end-page: 9
  ident: bib0034
  article-title: BitFit: simple parameter-efficient fine-tuning for transformer-based masked language-models
  publication-title: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
– volume: 20
  year: 2024
  ident: bib0039
  article-title: Supervised hierarchical online hashing for cross-modal retrieval
  publication-title: ACM Trans. Multimedia Comput. Commun. Appl.
– reference: (2015).
– start-page: 2946
  year: 2013
  end-page: 2953
  ident: bib0047
  article-title: Optimized product quantization for approximate nearest neighbor search
  publication-title: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
– start-page: 6558
  year: 2019
  end-page: 6569
  ident: bib0004
  article-title: Multimodal transformer for unaligned multimodal language sequences
  publication-title: Proc. ACL
– reference: (2024).
– start-page: 430
  year: 2016
  end-page: 438
  ident: bib0014
  article-title: Correlation hashing network for efficient cross-modal retrieval
  publication-title: Proc. CVPR
– volume: 15
  start-page: 2042
  year: 2024
  end-page: 2054
  ident: bib0024
  article-title: Bridge graph attention based graph convolution network with multi-scale transformer for EEG emotion recognition
  publication-title: IEEE Trans Affect Comput
– start-page: 5609
  year: 2017
  end-page: 5618
  ident: bib0015
  article-title: HashNet: deep learning to hash by continuation
  publication-title: Proc. ICCV
– volume: 88
  start-page: 102
  year: 2019
  end-page: 112
  ident: bib0040
  article-title: HashEmotion: compact binary codes for facial valence-arousal retrieval
  publication-title: Image Vis. Comput.
– year: 2020
  ident: bib0005
  article-title: MTAG: modal-temporal attention graph for unaligned human multimodal language sequences
  publication-title: Proc. EMNLP
– year: 2018
  ident: bib0037
  article-title: SCH-GAN: semi-supervised cross-modal hashing via generative adversarial networks
  publication-title: ACM Multimedia
– volume: 7
  start-page: 535
  year: 2019
  end-page: 547
  ident: bib0035
  article-title: Billion-scale similarity search with GPUs
  publication-title: IEEE Trans. Big Data
– volume: 37
  start-page: 98
  year: 2017
  end-page: 125
  ident: bib0003
  article-title: Multimodal sentiment analysis: a review
  publication-title: Inf. Fusion
– volume: 7
  start-page: 535
  year: 2021
  end-page: 547
  ident: bib0007
  article-title: Billion-scale similarity search with GPUs
  publication-title: IEEE Trans. Big Data
– reference: M. Shamsi, M. Tahon, Training Speech Emotion Classifier Without Categorical Annotations, arXiv:
– start-page: 1227
  year: 2024
  end-page: 1235
  ident: bib0022
  article-title: Contrastive transformer cross-modal hashing for video-text retrieval
  publication-title: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24
– year: 2018
  ident: bib0016
  article-title: HashGAN: attention-aware deep adversarial hashing for cross-modal retrieval
  publication-title: Proc. ECCV
– start-page: 2702
  year: 2019
  end-page: 2708
  ident: bib0018
  article-title: Aff-wild2: extending the aff-wild database for affect recognition
  publication-title: Proc. IEEE International Conference on Computer Vision Workshops (ICCVW)
– volume: 83
  start-page: 90487
  year: 2024
  end-page: 90509
  ident: bib0044
  article-title: Hierarchical modal interaction balance cross-modal hashing for unsupervised image-text retrieval
  publication-title: Multimed. Tools Appl.
– start-page: 1284
  year: 2024
  ident: 10.1016/j.knosys.2025.114771_bib0031
  article-title: Contrastive transformer masked image hashing for efficient image retrieval
  publication-title: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI)
– start-page: 430
  year: 2016
  ident: 10.1016/j.knosys.2025.114771_bib0014
  article-title: Correlation hashing network for efficient cross-modal retrieval
– volume: 147
  year: 2024
  ident: 10.1016/j.knosys.2025.114771_bib0038
  article-title: Deep self-enhancement hashing for robust multi-label cross-modal retrieval
  publication-title: Pattern Recognit.
  doi: 10.1016/j.patcog.2023.110079
– volume: 41
  start-page: 423
  issue: 2
  year: 2019
  ident: 10.1016/j.knosys.2025.114771_bib0002
  article-title: Multimodal machine learning: a survey and taxonomy
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2018.2798607
– year: 2025
  ident: 10.1016/j.knosys.2025.114771_bib0048
  article-title: Semi-Supervised online cross-modal hashing
– start-page: 1135
  year: 2015
  ident: 10.1016/j.knosys.2025.114771_bib0009
  article-title: Learning both weights and connections for efficient neural networks
– volume: 39
  start-page: 1447
  issue: 2
  year: 2025
  ident: 10.1016/j.knosys.2025.114771_bib0026
  article-title: Understanding emotional body expressions via large language models
  publication-title: Proceedings of the AAAI Conference on Artificial Intelligence
  doi: 10.1609/aaai.v39i2.32135
– year: 2018
  ident: 10.1016/j.knosys.2025.114771_bib0037
  article-title: SCH-GAN: semi-supervised cross-modal hashing via generative adversarial networks
– year: 2018
  ident: 10.1016/j.knosys.2025.114771_sbref0015
  article-title: HashGAN: attention-aware deep adversarial hashing for cross-modal retrieval
– volume: 83
  start-page: 90487
  year: 2024
  ident: 10.1016/j.knosys.2025.114771_bib0044
  article-title: Hierarchical modal interaction balance cross-modal hashing for unsupervised image-text retrieval
  publication-title: Multimed. Tools Appl.
  doi: 10.1007/s11042-024-19371-w
– volume: 7
  start-page: 535
  issue: 3
  year: 2021
  ident: 10.1016/j.knosys.2025.114771_bib0046
  article-title: Billion-scale similarity search with GPUs
  publication-title: IEEE Trans. Big Data
  doi: 10.1109/TBDATA.2019.2921572
– start-page: 585
  year: 2021
  ident: 10.1016/j.knosys.2025.114771_bib0025
  article-title: Supervised prototypical contrastive learning for imbalanced emotion recognition
– start-page: 4512
  year: 2024
  ident: 10.1016/j.knosys.2025.114771_bib0032
  article-title: TelME: teacher-leading multimodal fusion network for emotion recognition in conversation
  publication-title: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)
– ident: 10.1016/j.knosys.2025.114771_bib0010
– volume: 7
  start-page: 535
  issue: 3
  year: 2021
  ident: 10.1016/j.knosys.2025.114771_bib0007
  article-title: Billion-scale similarity search with GPUs
  publication-title: IEEE Trans. Big Data
  doi: 10.1109/TBDATA.2019.2921572
– start-page: 527
  year: 2019
  ident: 10.1016/j.knosys.2025.114771_bib0006
  article-title: MELD: a multimodal multi-party dataset for emotion recognition in conversations
– volume: 25
  start-page: 4821
  issue: 6
  year: 2025
  ident: 10.1016/j.knosys.2025.114771_bib0049
  article-title: DSSTNet: emotion recognition using multi-scale EEG features through graph convolution
  publication-title: IEEE Sens J.
– ident: 10.1016/j.knosys.2025.114771_bib0021
– start-page: 2704
  year: 2018
  ident: 10.1016/j.knosys.2025.114771_bib0011
  article-title: Quantization and training of neural networks for efficient integer-arithmetic-only inference
– volume: 105
  start-page: 2295
  issue: 12
  year: 2017
  ident: 10.1016/j.knosys.2025.114771_bib0008
  article-title: Efficient processing of deep neural networks: a tutorial and survey
  publication-title: Proc. IEEE
  doi: 10.1109/JPROC.2017.2761740
– start-page: 5609
  year: 2017
  ident: 10.1016/j.knosys.2025.114771_bib0015
  article-title: HashNet: deep learning to hash by continuation
– volume: 37
  start-page: 98
  year: 2017
  ident: 10.1016/j.knosys.2025.114771_bib0003
  article-title: Multimodal sentiment analysis: a review
  publication-title: Inf. Fusion
  doi: 10.1016/j.inffus.2017.02.003
– start-page: 7395
  year: 2023
  ident: 10.1016/j.knosys.2025.114771_bib0023
  article-title: DualGATs: dual graph attention networks for emotion recognition in conversations
– ident: 10.1016/j.knosys.2025.114771_bib0020
  doi: 10.1145/3610661.3616190
– year: 1997
  ident: 10.1016/j.knosys.2025.114771_bib0001
– volume: 39
  start-page: 1161
  issue: 6
  year: 1980
  ident: 10.1016/j.knosys.2025.114771_bib0017
  article-title: A circumplex model of affect
  publication-title: J. Pers. Soc. Psychol.
  doi: 10.1037/h0077714
– volume: 7
  start-page: 535
  issue: 3
  year: 2019
  ident: 10.1016/j.knosys.2025.114771_bib0035
  article-title: Billion-scale similarity search with GPUs
  publication-title: IEEE Trans. Big Data
  doi: 10.1109/TBDATA.2019.2921572
– volume: 305
  year: 2024
  ident: 10.1016/j.knosys.2025.114771_bib0029
  article-title: MAS-DGAT-Net: a dynamic graph attention network with multibranch feature extraction and staged fusion for EEG emotion recognition
  publication-title: Knowl. Based Syst.
  doi: 10.1016/j.knosys.2024.112599
– volume: 110
  year: 2025
  ident: 10.1016/j.knosys.2025.114771_bib0028
  article-title: Cross-subject emotion recognition by EEG driven spatio-temporal hybrid network based on domain adaptation and dynamic graph attention
  publication-title: Biomed. Signal Process. Control
  doi: 10.1016/j.bspc.2025.108231
– start-page: 347
  year: 2020
  ident: 10.1016/j.knosys.2025.114771_bib0019
  article-title: MTANet: facial affect recognition in the wild using multi-task learning convolutional network
– start-page: 1
  year: 2022
  ident: 10.1016/j.knosys.2025.114771_bib0034
  article-title: BitFit: simple parameter-efficient fine-tuning for transformer-based masked language-models
– volume: 20
  issue: 4
  year: 2024
  ident: 10.1016/j.knosys.2025.114771_bib0039
  article-title: Supervised hierarchical online hashing for cross-modal retrieval
  publication-title: ACM Trans. Multimedia Comput. Commun. Appl.
  doi: 10.1145/3632527
– volume: 88
  start-page: 102
  year: 2019
  ident: 10.1016/j.knosys.2025.114771_bib0040
  article-title: HashEmotion: compact binary codes for facial valence-arousal retrieval
  publication-title: Image Vis. Comput.
– year: 2024
  ident: 10.1016/j.knosys.2025.114771_bib0041
  article-title: HCH: hierarchical contrastive hashing for emotion video retrieval
– start-page: 2946
  year: 2013
  ident: 10.1016/j.knosys.2025.114771_bib0047
  article-title: Optimized product quantization for approximate nearest neighbor search
– volume: 202
  start-page: 31292
  year: 2023
  ident: 10.1016/j.knosys.2025.114771_bib0033
  article-title: UPop: unified and progressive pruning for compressing vision-language transformers
– volume: 33
  start-page: 117
  issue: 1
  year: 2011
  ident: 10.1016/j.knosys.2025.114771_bib0012
  article-title: Product quantization for nearest neighbor search
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2010.57
– start-page: 2064
  year: 2016
  ident: 10.1016/j.knosys.2025.114771_bib0013
  article-title: Deep supervised hashing for fast image retrieval
– start-page: 8748
  year: 2021
  ident: 10.1016/j.knosys.2025.114771_bib0045
  article-title: Learning transferable visual models from natural language supervision
– start-page: 2314
  year: 2023
  ident: 10.1016/j.knosys.2025.114771_bib0030
  article-title: Multimodal sentiment analysis via efficient multimodal transformer
  publication-title: Proceedings of the ACM International Conference on Multimedia (ACM MM)
– start-page: 6558
  year: 2019
  ident: 10.1016/j.knosys.2025.114771_bib0004
  article-title: Multimodal transformer for unaligned multimodal language sequences
– volume: 17
  year: 2023
  ident: 10.1016/j.knosys.2025.114771_bib0042
  article-title: Dual attention mechanism graph convolutional neural network for EEG emotion recognition
  publication-title: Front. Neurosci.
– volume: 15
  start-page: 2042
  issue: 4
  year: 2024
  ident: 10.1016/j.knosys.2025.114771_bib0024
  article-title: Bridge graph attention based graph convolution network with multi-scale transformer for EEG emotion recognition
  publication-title: IEEE Trans Affect Comput
  doi: 10.1109/TAFFC.2024.3394873
– start-page: 2740
  year: 2016
  ident: 10.1016/j.knosys.2025.114771_bib0036
  article-title: Deep pairwise-supervised hashing for scalable image retrieval
– volume: 225
  year: 2025
  ident: 10.1016/j.knosys.2025.114771_bib0043
  article-title: Region-focused CNN with dynamic adaptive graph attention for EEG-based emotion recognition
  publication-title: Expert Syst. Appl.
– start-page: 1227
  year: 2024
  ident: 10.1016/j.knosys.2025.114771_bib0022
  article-title: Contrastive transformer cross-modal hashing for video-text retrieval
– start-page: 2702
  year: 2019
  ident: 10.1016/j.knosys.2025.114771_bib0018
  article-title: Aff-wild2: extending the aff-wild database for affect recognition
– year: 2020
  ident: 10.1016/j.knosys.2025.114771_sbref0005
  article-title: MTAG: modal-temporal attention graph for unaligned human multimodal language sequences
– year: 2025
  ident: 10.1016/j.knosys.2025.114771_bib0027
  article-title: Visual and textual prompts in VLLMs for enhancing emotion recognition
  publication-title: IEEE Trans. Circuits Syst. Video Technol.
SSID ssj0002218
Score 2.4404392
Snippet •Reframe multimodal emotion recognition as affect-oriented binary hashing with the Cross-Modal Emotion Hashing Network (CEHN).•Employ a three-hop gated...
SourceID crossref
elsevier
SourceType Index Database
Publisher
StartPage 114771
SubjectTerms Binary hashing
Cross-modal attention
Multimodal emotion recognition
Polarity-aware contrastive distillation
Title Cross-modal emotion hashing network: Efficient binary coding for large-scale multimodal emotion retrieval
URI https://dx.doi.org/10.1016/j.knosys.2025.114771
Volume 331
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVESC
  databaseName: Elsevier SD Freedom Collection Journals 2021
  issn: 0950-7051
  databaseCode: AIEXJ
  dateStart: 19950201
  customDbUrl:
  isFulltext: true
  dateEnd: 99991231
  titleUrlDefault: https://www.sciencedirect.com
  omitProxy: false
  ssIdentifier: ssj0002218
  providerName: Elsevier
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwtV1LT-MwELbK48CF565gecgHbshVkqZxsjeEingJcWBFb5Ed26I8HFRKxXF_-o7jR1tYIThwiSonmVierzOTyecZhPYpZ6zLJCdpR0iScgZ_qZylJKtiCd6q6HIqmmYT9PIy7_eLq1brr98LM36gWuevr8XTt6oaxkDZZuvsF9QdhMIA_AalwxHUDsdPKf7I-D3yWAtYfGmb9Bzc2pZJB9qSvk0aoNfUjjBMAG635Fa18KzKB0MPJ8-gPmkZh7PShk0XrrGbiItsz31yjhjHKFyJ6BCxXwzsx32pb1k9wZLLVt8YzlIwQG7Q5LLD7Z45_KKn8xRJt-F8dGYSjhGhkSsv62yv365lrSe8m1HbkOWdYbc5hrv2va5h_m3zgPbk8tk62m_8W2AdekLbXWmllEZKaaXMoYUEoAmmfeHwtNc_C948SZoccZi9337ZcATfz-b_4c1UyHK9ipbduwY-tBhZQy2p19GK7-OBnVnfQIMpyGCnZOwggx1kfuMAGGwBgy1gMAAGTwEGTwATZAXA_EB_jnvXRyfEteAgFUS2I6JiEzBXWcIEL1KRqYjnCU2ZZIWopKIqjhSTMjVV9OBUbj6Kq4xFScFhOQXr_ETzutZyE2HBVRZzuFZS8M4p51UnjpXIc5HybpGoLUT8ypVPttJK-ZHGthD1y1u6aNFGgSVg5sM7f33xSdtoaQLoHTQ_Gr7IXbRYjUeD5-GeA8w_Y3uS4Q
linkProvider Elsevier
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cross-modal+emotion+hashing+network%3A+Efficient+binary+coding+for+large-scale+multimodal+emotion+retrieval&rft.jtitle=Knowledge-based+systems&rft.au=Li%2C+Chenhao&rft.au=Huang%2C+Wenti&rft.au=Yang%2C+Zhan&rft.au=Long%2C+Jun&rft.date=2025-12-03&rft.issn=0950-7051&rft.volume=331&rft.spage=114771&rft_id=info:doi/10.1016%2Fj.knosys.2025.114771&rft.externalDBID=n%2Fa&rft.externalDocID=10_1016_j_knosys_2025_114771
thumbnail_l http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=0950-7051&client=summon
thumbnail_m http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=0950-7051&client=summon
thumbnail_s http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=0950-7051&client=summon