Unconstrained vocal pattern recognition algorithm based on attention mechanism

Bibliographic Details
Published in:Digital signal processing Vol. 136; p. 103973
Main Authors: Li, Yaqian, Zhang, Xiaolong, Zhang, Xuyao, Li, Haibin, Zhang, Wenming
Format: Journal Article
Language:English
Published: Elsevier Inc 01.05.2023
ISSN:1051-2004, 1095-4333
Abstract Deep learning-based voiceprint recognition methods rely heavily on adequate datasets, especially datasets that are closer to the natural environment and more complex because they are recorded under unconstrained conditions. Yet the data types in current open-source speech datasets are too homogeneous, and they differ from the speech collected in natural application environments. Because few Chinese datasets are available, this paper proposes and produces an unconstrained Chinese speech dataset with richer data types that are closer to those collected in a natural environment. To address the inadequate extraction of acoustic features from the unconstrained speech dataset, a new two-dimensional convolutional residual network structure based on the attention mechanism is designed and applied to acoustic feature extraction. The residual block structure in the residual network is improved with the SE module and the CBAM module to obtain the SE-Cov2d and CSA-Cov2d models, respectively. Finally, experiments demonstrate that the attention mechanism helps the network focus on more critical feature information and fuse more discriminative features during feature extraction.
• An unconstrained Chinese speech dataset collected in a natural environment is proposed.
• A new two-dimensional convolutional residual network structure is designed and applied to acoustic feature extraction.
• SE-Cov2d obtains the smallest EER values, 2.72% and 6.76%, on the VoxCeleb and CN-Human datasets.
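The abstract states that the residual block is improved with an SE (squeeze-and-excitation) module. As a rough illustration of what such a module does in general, the following is a minimal pure-Python sketch of SE channel attention. It is not the authors' SE-Cov2d implementation: the function name `se_block`, the weight matrices `w1` and `w2`, the reduction ratio `r`, and all tensor sizes are hypothetical, and biases are omitted for brevity.

```python
import math
import random


def se_block(x, w1, w2):
    """Apply squeeze-and-excitation channel attention to a feature map.

    x  : list of C channels, each an H x W list of lists of floats
    w1 : (C // r) x C weight matrix of the reduction layer (hypothetical)
    w2 : C x (C // r) weight matrix of the expansion layer (hypothetical)
    Returns the reweighted feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: expand back to C values, then a sigmoid gate in (0, 1).
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Scale: multiply every element of a channel by that channel's gate.
    y = [[[s[c] * v for v in row] for row in ch] for c, ch in enumerate(x)]
    return y, s


# Illustrative sizes only; they are assumptions, not values from the paper.
random.seed(0)
C, H, W, r = 4, 3, 5, 2
x = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
y, gates = se_block(x, w1, w2)
```

The output `y` has the same shape as the input `x`; only the relative weight of each channel changes. In a residual block, a module of this kind would typically be applied to the convolutional branch before it is added to the shortcut, although the abstract does not specify where the authors place it.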
ArticleNumber 103973
Author Zhang, Xiaolong
Zhang, Wenming
Li, Yaqian
Zhang, Xuyao
Li, Haibin
Author_xml – sequence: 1
  givenname: Yaqian
  surname: Li
  fullname: Li, Yaqian
  email: yaqianli@126.com
  organization: Pattern Recognized, Electrical Engineering, Yanshan University, China
– sequence: 2
  givenname: Xiaolong
  orcidid: 0000-0003-4323-4985
  surname: Zhang
  fullname: Zhang, Xiaolong
  email: miu_zxl@163.com
  organization: Speaker Diarization, Electrical Engineering, Yanshan University, China
– sequence: 3
  givenname: Xuyao
  surname: Zhang
  fullname: Zhang, Xuyao
  email: 1106343910@qq.com
  organization: Speaker Verification, Electrical Engineering, Yanshan University, China
– sequence: 4
  givenname: Haibin
  surname: Li
  fullname: Li, Haibin
  email: hbli@ysu.edu.cn
  organization: Pattern Recognized, Electrical Engineering, Yanshan University, China
– sequence: 5
  givenname: Wenming
  surname: Zhang
  fullname: Zhang, Wenming
  email: zwmwen@ysu.edu.cn
  organization: Camera Calibration, Electrical Engineering, Yanshan University, China
CitedBy_id crossref_primary_10_3390_math11194205
crossref_primary_10_1177_14727978251321638
crossref_primary_10_1007_s00521_024_10548_w
Cites_doi 10.1016/0167-6393(90)90010-7
10.1109/ACCESS.2021.3084299
10.1002/ima.22337
10.1007/s00521-017-2848-4
ContentType Journal Article
Copyright 2023 Elsevier Inc.
Copyright_xml – notice: 2023 Elsevier Inc.
DBID AAYXX
CITATION
DOI 10.1016/j.dsp.2023.103973
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 1095-4333
ExternalDocumentID 10_1016_j_dsp_2023_103973
S1051200423000684
ISICitedReferencesCount 2
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000955179700001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1051-2004
IngestDate Tue Nov 18 22:25:06 EST 2025
Sat Nov 29 07:08:40 EST 2025
Fri Feb 23 02:37:03 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Attention mechanism
Feature fusion
Voiceprint recognition
Unconstrained datasets
Language English
LinkModel OpenURL
ORCID 0000-0003-4323-4985
ParticipantIDs crossref_primary_10_1016_j_dsp_2023_103973
crossref_citationtrail_10_1016_j_dsp_2023_103973
elsevier_sciencedirect_doi_10_1016_j_dsp_2023_103973
PublicationCentury 2000
PublicationDate May 2023
2023-05-00
PublicationDateYYYYMMDD 2023-05-01
PublicationDate_xml – month: 05
  year: 2023
  text: May 2023
PublicationDecade 2020
PublicationTitle Digital signal processing
PublicationYear 2023
Publisher Elsevier Inc
Publisher_xml – name: Elsevier Inc
SourceID crossref
elsevier
SourceType Enrichment Source
Index Database
Publisher
StartPage 103973
SubjectTerms Attention mechanism
Feature fusion
Unconstrained datasets
Voiceprint recognition
Title Unconstrained vocal pattern recognition algorithm based on attention mechanism
URI https://dx.doi.org/10.1016/j.dsp.2023.103973
Volume 136