Bibliographic Details
| Title: | SAGE-FM: A lightweight and interpretable spatial transcriptomics foundation model |
| Authors: | Zhan, Xianghao; Xu, Jingyu; Zheng, Yuanning; Good, Zinaida; Gevaert, Olivier |
| Publication Year: | 2026 |
| Collection: | ArXiv.org (Cornell University Library) |
| Subject Terms: | Machine Learning, Genomics, Quantitative Methods |
| Description: | Spatial transcriptomics enables spatial gene expression profiling, motivating computational models that capture spatially conditioned regulatory relationships. We introduce SAGE-FM, a lightweight spatial transcriptomics foundation model based on graph convolutional networks (GCNs) trained with a masked central spot prediction objective. Trained on 416 human Visium samples spanning 15 organs, SAGE-FM learns spatially coherent embeddings that robustly recover masked genes, with 91% of masked genes showing significant correlations (p < 0.05). The embeddings generated by SAGE-FM outperform MOFA and existing spatial transcriptomics methods in unsupervised clustering and preservation of biological heterogeneity. SAGE-FM generalizes to downstream tasks, enabling 81% accuracy in pathologist-defined spot annotation in oropharyngeal squamous cell carcinoma and improving glioblastoma subtype prediction relative to MOFA. In silico perturbation experiments further demonstrate that the model captures directional ligand-receptor and upstream-downstream regulatory effects consistent with ground truth. These results demonstrate that simple, parameter-efficient GCNs can serve as biologically interpretable and spatially aware foundation models for large-scale spatial transcriptomics. (26 pages, 5 figures) |
| Document Type: | text |
| Language: | unknown |
| Relation: | http://arxiv.org/abs/2601.15504 |
| Availability: | http://arxiv.org/abs/2601.15504 |
| Accession Number: | edsbas.931A0B4 |
| Database: | BASE |