Dimensionality Reduction and Denoising of Spatial Transcriptomics Data Using Dual-Channel Masked Graph Autoencoder


Bibliographic details
Published in: bioRxiv
Main authors: Min, Wenwen; Fang, Donghai; Chen, Jinyu; Zhang, Shihua
Format: Paper
Language: English
Published: Cold Spring Harbor Laboratory, 2 June 2024
Edition: 1.1
ISSN: 2692-8205
Abstract: Recent advances in spatial transcriptomics (ST) technology allow researchers to comprehensively measure gene expression at the level of individual cells, or even subcellular compartments, while preserving the spatial context of the tissue. Spatial domain identification is a critical task in ST data analysis. However, effectively capturing distinctive gene expression features and the relationships between genes remains a significant challenge. We develop STMask, a graph self-supervised learning method for the analysis and exploration of ST data. In its gene representation learning channel, STMask combines a masking mechanism with a graph autoencoder, which compels the channel to learn more expressive representations. In its gene relationship learning channel, STMask combines the masking mechanism with graph self-supervised contrastive learning, pulling the embeddings of spatially adjacent spots closer together and pushing the representations of different clusters apart, so that the channel learns more comprehensive relationships. Applied to four ST datasets, STMask outperforms state-of-the-art methods in various tasks, including spatial clustering and trajectory inference. Source code is available at https://github.com/donghaifang/STMask.
Spatial transcriptomics (ST) is an emerging transcriptomic sequencing technology that aims to reveal the spatial distribution of gene expression and cell types within tissues. It yields gene expression profiles at the level of individual cells or spots within the tissue, uncovering the spatial expression patterns of genes. However, accurately identifying spatial domains in ST data remains challenging. In our study, we introduce STMask, a self-supervised learning method built on a dual-channel masked graph autoencoder that combines masking with contrastive learning.
Our work makes two main contributions: (1) We propose a novel graph self-supervised learning method, STMask, tailored to the analysis of ST data, which improves the ability to capture the distinctive features of gene expression and spatial relationships within tissues. (2) Through comprehensive experiments, we show that STMask provides insights into biological processes, particularly in breast cancer: it identifies differentially expressed genes enriched in tumor regions, such as IGHG1, which can serve as effective targets for cancer therapy.
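The two ideas named in the abstract, masked reconstruction on a spatial graph and a contrastive objective over spatial neighbours, can be illustrated with a small NumPy sketch. This is not the STMask implementation (see the authors' repository for that). Every function name, the k-nearest-neighbour graph, the single-layer encoder and decoder, the zero-valued mask, and the use of all non-neighbouring spots as negatives are assumptions chosen only to keep the example short.

```python
import numpy as np

def knn_adjacency(coords, k=6):
    """Symmetric k-nearest-neighbour adjacency built from spot coordinates (no self-loops)."""
    n = coords.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # never pick a spot as its own neighbour
    nbrs = np.argsort(d, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(adj, adj.T)               # symmetrise

def normalise(adj):
    """Add self-loops and row-normalise so each row sums to 1."""
    a = adj + np.eye(adj.shape[0])
    return a / a.sum(axis=1, keepdims=True)

def mask_spots(x, mask_rate, rng):
    """Zero out the expression vectors of a random subset of spots (assumed masking scheme)."""
    n = x.shape[0]
    n_mask = max(1, int(round(mask_rate * n)))
    idx = rng.choice(n, size=n_mask, replace=False)
    x_masked = x.copy()
    x_masked[idx] = 0.0
    return x_masked, idx

def masked_reconstruction_loss(x, x_masked, masked_idx, a_norm, w_enc, w_dec):
    """One graph-convolution encoder and decoder; mean squared error on masked spots only."""
    z = np.tanh(a_norm @ x_masked @ w_enc)      # latent embedding per spot
    x_hat = a_norm @ z @ w_dec                  # reconstructed expression
    err = x_hat[masked_idx] - x[masked_idx]
    return float(np.mean(err ** 2)), z

def neighbour_contrastive_loss(z, adj, tau=0.5):
    """InfoNCE-style loss: spatial neighbours are positives, all other spots are negatives."""
    zn = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)
    sim = np.exp(zn @ zn.T / tau)
    np.fill_diagonal(sim, 0.0)                  # ignore self-similarity
    pos = (sim * adj).sum(axis=1)
    return float(-np.mean(np.log(pos / sim.sum(axis=1))))

# Toy data: 30 spots, 8 genes, 4 latent dimensions (all values are synthetic).
rng = np.random.default_rng(0)
coords = rng.random((30, 2))
x = rng.random((30, 8))
adj = knn_adjacency(coords, k=4)
a_norm = normalise(adj)
x_masked, masked_idx = mask_spots(x, mask_rate=0.3, rng=rng)
w_enc = rng.normal(scale=0.1, size=(8, 4))
w_dec = rng.normal(scale=0.1, size=(4, 8))
recon_loss, z = masked_reconstruction_loss(x, x_masked, masked_idx, a_norm, w_enc, w_dec)
contrast_loss = neighbour_contrastive_loss(z, adj)
print(recon_loss, contrast_loss)
```

In a real model the two weight matrices would be trained by minimising a weighted sum of the two losses. The sketch only evaluates them once on random weights to show what each term measures.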
Authors
1. Min, Wenwen (ORCID 0000-0002-2558-2911): School of Information Science and Engineering, Yunnan University
2. Fang, Donghai (ORCID 0009-0000-9050-0700): School of Information Science and Engineering, Yunnan University
3. Chen, Jinyu: School of Mathematics, Statistics and Mechanics, Beijing University of Technology
4. Zhang, Shihua (ORCID 0000-0003-0192-7118): Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences
Copyright 2024, Posted by Cold Spring Harbor Laboratory
DOI 10.1101/2024.05.30.596562
Discipline Biology
EISSN 2692-8205
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
License This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at http://creativecommons.org/licenses/by/4.0
Notes Competing Interest Statement: The authors have declared no competing interest.
OpenAccessLink https://www.biorxiv.org/content/10.1101/2024.05.30.596562
PageCount 27
SubjectTerms Genomics