Spatial-temporal hierarchical decoupled masked autoencoder: A self-supervised learning framework for electrocardiogram

Published in: Expert Systems with Applications, Volume 298, p. 129603
Main authors: Wei, Xiaoyang; Li, Zhiyuan; Tian, Yuanyuan; Wang, Mengxiao; Jin, Yanrui; Ding, Weiping; Liu, Chengliang
Format: Journal Article
Language: English
Publication details: Elsevier Ltd, 01.03.2026
Subjects: Electrocardiogram (ECG); Masked autoencoder; Representation learning; Self-supervised learning
ISSN: 0957-4174
Abstract The difficulty of labeling electrocardiogram (ECG) data has prompted researchers to use self-supervised learning to enhance diagnostic performance. Masked autoencoders (MAE) are a mainstream paradigm in which a model learns a latent representation of the signal by reconstructing masked portions of the ECG. However, existing methods lack a design specific to the spatial-temporal characteristics of ECG: leads represent spatial projections of cardiac activity, timestamps capture temporal patterns, and the two correspond to different axes of information. Existing MAE frameworks tend to unify them prematurely, potentially weakening critical local dependencies. In this paper, we propose a Spatial-Temporal Hierarchical Decoupled Masked Autoencoder (STHD-MAE). This framework decouples the ECG into isolated leads or time steps in the shallow layers to capture local dependencies from different views, then aligns the spatial-temporal representations and re-establishes global dependencies in the deep layers to comprehensively represent pathological information. We also design a medical report fusion module for pre-training, which uses cross-attention to align ECG report text encoded by a medical language model with the signal's latent representation, thereby guiding the encoder to focus on pathological information through implicit cross-modal learning. We validate the effectiveness of STHD-MAE on multiple downstream classification and reconstruction tasks. The results show that STHD-MAE outperforms existing self-supervised learning methods by approximately 2% in F1-score on both coarse-grained and fine-grained classification, and its reconstruction quality also exceeds that of the baseline generative model.
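To make the decoupling idea in the abstract more concrete, the following is a minimal PyTorch sketch of axis-restricted attention over a (lead, time-patch) token grid in the shallow layers, global attention in the deep layers, and a cross-attention hook for report text. It omits the masking and reconstruction stages of the MAE, and all class names, tensor shapes, and the random stand-in inputs are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class DecoupledBlock(nn.Module):
    """Shallow-layer self-attention restricted to one axis of a (lead, time) token grid."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, axis: str) -> torch.Tensor:
        # x: (batch, leads, time_patches, dim)
        b, l, t, d = x.shape
        if axis == "temporal":
            # Each lead is treated as an isolated sequence over time.
            seq = x.reshape(b * l, t, d)
        else:
            # Each time patch is treated as an isolated set of leads.
            seq = x.permute(0, 2, 1, 3).reshape(b * t, l, d)
        out, _ = self.attn(seq, seq, seq)
        seq = self.norm(seq + out)
        if axis == "temporal":
            return seq.reshape(b, l, t, d)
        return seq.reshape(b, t, l, d).permute(0, 2, 1, 3)

class GlobalBlock(nn.Module):
    """Deep-layer self-attention over the full flattened lead x time token set."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, l, t, d = x.shape
        seq = x.reshape(b, l * t, d)
        out, _ = self.attn(seq, seq, seq)
        return self.norm(seq + out).reshape(b, l, t, d)

# Toy usage: a 12-lead ECG already cut into time patches and embedded to 64 channels.
dim = 64
tokens = torch.randn(2, 12, 25, dim)           # (batch, leads, time patches, dim)
shallow_t, shallow_s, deep = DecoupledBlock(dim), DecoupledBlock(dim), GlobalBlock(dim)
h = shallow_t(tokens, "temporal")              # local temporal view within each isolated lead
h = shallow_s(h, "spatial")                    # local spatial view within each isolated time step
h = deep(h)                                    # re-establish global spatial-temporal dependencies

# Report-fusion idea: cross-attention in which hypothetical text embeddings (random
# stand-ins for a medical language model's output) query the ECG token representation.
fuse = nn.MultiheadAttention(dim, 4, batch_first=True)
report = torch.randn(2, 32, dim)
aligned, _ = fuse(report, h.reshape(2, -1, dim), h.reshape(2, -1, dim))
```

The sketch only shows why early axis-restricted attention isolates per-lead and per-time-step dependencies before a later global stage merges them; the actual STHD-MAE additionally masks tokens, reconstructs the masked signal, and trains the fusion module jointly during pre-training.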
ArticleNumber 129603
Author Ding, Weiping
Wang, Mengxiao
Jin, Yanrui
Li, Zhiyuan
Wei, Xiaoyang
Tian, Yuanyuan
Liu, Chengliang
Author_xml – sequence: 1
  givenname: Xiaoyang
  orcidid: 0009-0005-8775-6875
  surname: Wei
  fullname: Wei, Xiaoyang
  email: victor0926@sjtu.edu.cn
  organization: School of Mechanical Engineering, Shanghai Jiao Tong University, 800, Dongchuan Road, Shanghai 200240, China
– sequence: 2
  givenname: Zhiyuan
  orcidid: 0000-0003-1323-6795
  surname: Li
  fullname: Li, Zhiyuan
  email: lzy2030@sjtu.edu.cn
  organization: School of Mechanical Engineering, Shanghai Jiao Tong University, 800, Dongchuan Road, Shanghai 200240, China
– sequence: 3
  givenname: Yuanyuan
  orcidid: 0000-0003-4432-1546
  surname: Tian
  fullname: Tian, Yuanyuan
  email: tian102@sjtu.edu.cn
  organization: School of Mechanical Engineering, Shanghai Jiao Tong University, 800, Dongchuan Road, Shanghai 200240, China
– sequence: 4
  givenname: Mengxiao
  orcidid: 0000-0002-4979-0723
  surname: Wang
  fullname: Wang, Mengxiao
  email: mengxiaowang@sjtu.edu.cn
  organization: School of Mechanical Engineering, Shanghai Jiao Tong University, 800, Dongchuan Road, Shanghai 200240, China
– sequence: 5
  givenname: Yanrui
  orcidid: 0000-0001-9489-5447
  surname: Jin
  fullname: Jin, Yanrui
  email: jinyanrui@sjtu.edu.cn
  organization: School of Mechanical Engineering, Shanghai Jiao Tong University, 800, Dongchuan Road, Shanghai 200240, China
– sequence: 6
  givenname: Weiping
  surname: Ding
  fullname: Ding, Weiping
  email: ding.wp@ntu.edu.cn
  organization: School of Artificial Intelligence and Computer Science, Nantong University, Nantong 226019, China
– sequence: 7
  givenname: Chengliang
  surname: Liu
  fullname: Liu, Chengliang
  email: chlliu@sjtu.edu.cn
  organization: School of Mechanical Engineering, Shanghai Jiao Tong University, 800, Dongchuan Road, Shanghai 200240, China
ContentType Journal Article
Copyright 2025 Elsevier Ltd
DOI 10.1016/j.eswa.2025.129603
Discipline Computer Science
ExternalDocumentID 10_1016_j_eswa_2025_129603
S095741742503218X
ISSN 0957-4174
IsPeerReviewed true
IsScholarly true
Keywords Representation learning
Masked autoencoder
Electrocardiogram (ECG)
Self-supervised learning
Language English
ORCID 0009-0005-8775-6875
0000-0003-1323-6795
0000-0003-4432-1546
0000-0002-4979-0723
0000-0001-9489-5447
PublicationCentury 2000
PublicationDate 2026-03-01
PublicationDecade 2020
PublicationTitle Expert systems with applications
PublicationYear 2026
Publisher Elsevier Ltd
StartPage 129603
SubjectTerms Electrocardiogram (ECG)
Masked autoencoder
Representation learning
Self-supervised learning
Title Spatial-temporal hierarchical decoupled masked autoencoder: A self-supervised learning framework for electrocardiogram
URI https://dx.doi.org/10.1016/j.eswa.2025.129603
Volume 298