Abstract
With the rapid generation of spatial transcriptomics (ST) data, integrative analysis of multiple ST datasets from different conditions, technologies and developmental stages is becoming increasingly important. Here we present a graph attention neural network called STAligner for integrating and aligning ST datasets, enabling spatially aware data integration, simultaneous spatial domain identification and downstream comparative analysis. We apply STAligner to ST datasets of the human cortex slices from different samples, the mouse olfactory bulb slices generated by two profiling technologies, the mouse hippocampus tissue slices under normal and Alzheimer’s disease conditions, and the spatiotemporal atlases of mouse organogenesis. STAligner efficiently captures the shared tissue structures across different slices, the disease-related substructures and the dynamical changes during mouse embryonic development. In addition, the shared spatial domain and nearest-neighbor pairs identified by STAligner can be further considered as corresponding pairs to guide the three-dimensional reconstruction of consecutive slices, achieving more accurate local structure-guided registration than the existing method.
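The last step in the abstract, using nearest-neighbor pairs as correspondences to register consecutive slices, can be illustrated with a minimal sketch. This is not the STAligner implementation: the function names `mutual_nearest_pairs` and `rigid_transform_2d` are hypothetical, the embeddings and coordinates are synthetic, and the rigid fit is a standard SVD-based least-squares estimate chosen here for illustration. It assumes each slice already has a low-dimensional embedding per spot and 2D spot coordinates.

```python
import numpy as np


def mutual_nearest_pairs(emb_a, emb_b):
    """Return an (m, 2) array of index pairs (i, j) where spot i of slice A and
    spot j of slice B are each other's nearest neighbor in embedding space."""
    # Pairwise squared Euclidean distances, shape (n_a, n_b).
    dist = ((emb_a[:, None, :] - emb_b[None, :, :]) ** 2).sum(axis=2)
    nn_of_a = dist.argmin(axis=1)  # nearest B spot for every A spot
    nn_of_b = dist.argmin(axis=0)  # nearest A spot for every B spot
    idx_a = np.arange(emb_a.shape[0])
    mutual = nn_of_b[nn_of_a] == idx_a  # keep only pairs that agree both ways
    return np.column_stack([idx_a[mutual], nn_of_a[mutual]])


def rigid_transform_2d(src, dst):
    """Least-squares rotation r and translation t with dst ~ src @ r.T + t."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    cov = (src - src_mean).T @ (dst - dst_mean)
    u, _, vt = np.linalg.svd(cov)
    # Flip the last axis if needed so that r is a rotation, not a reflection.
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    r = vt.T @ np.diag([1.0, sign]) @ u.T
    t = dst_mean - r @ src_mean
    return r, t


# Synthetic demo (assumption): slice B is slice A rotated by 30 degrees,
# shifted, and stored in shuffled spot order with slightly noisy embeddings.
rng = np.random.default_rng(0)
n_spots = 60
coords_a = rng.uniform(0, 100, size=(n_spots, 2))
emb_a = rng.normal(size=(n_spots, 8))

theta = np.deg2rad(30.0)
true_r = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
true_t = np.array([12.0, -7.0])
perm = rng.permutation(n_spots)
coords_b = (coords_a @ true_r.T + true_t)[perm]
emb_b = emb_a[perm] + rng.normal(scale=1e-3, size=(n_spots, 8))

# Correspondences come from the embeddings; the transform from the coordinates.
pairs = mutual_nearest_pairs(emb_a, emb_b)
r_a_to_b, t_a_to_b = rigid_transform_2d(coords_a[pairs[:, 0]],
                                        coords_b[pairs[:, 1]])
aligned_a = coords_a @ r_a_to_b.T + t_a_to_b
print("max residual:", np.abs(aligned_a[pairs[:, 0]] - coords_b[pairs[:, 1]]).max())
```

In this sketch the correspondences are found purely in embedding space, so the fit does not depend on the initial relative placement of the two slices. Restricting the pairs to a shared spatial domain, as the abstract describes, would amount to filtering `pairs` by a domain label before calling `rigid_transform_2d`.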
Data availability
Source data for Figs. 2–6 is available with this paper. The datasets analyzed in this study are all from publicly available datasets (Supplementary Table 1). Specifically, the human DLPFC dataset can be accessed in the spatialLIBD package (http://spatial.libd.org/spatialLIBD). The mouse olfactory bulb tissue data generated by Stereo-seq and Slide-seqV2 platforms can be accessed from https://github.com/JinmiaoChenLab/SEDR_analyses and https://singlecell.broadinstitute.org/single_cell/study/SCP815, respectively. The mouse sagittal posterior and anterior brain data can be accessed at https://support.10xgenomics.com/spatial-gene-expression/datasets/1.0.0/V1_Mouse_Brain_Sagittal_Posterior and https://support.10xgenomics.com/spatial-gene-expression/datasets/1.0.0/V1_Mouse_Brain_Sagittal_Anterior, respectively. The mouse embryo data can be accessed at https://db.cngb.org/stomics/mosta/. The human embryo data can be accessed at https://heoa.shinyapps.io/code/. The human breast cancer data can be accessed at https://cf.10xgenomics.com/samples/spatial-exp/1.0.0/V1_Breast_Cancer_Block_A_Section_1/V1_Breast_Cancer_Block_A_Section_1_web_summary.html, https://cf.10xgenomics.com/samples/spatial-exp/1.2.0/V1_Human_Invasive_Ductal_Carcinoma/V1_Human_Invasive_Ductal_Carcinoma_web_summary.html and https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/CytAssist_FFPE_Human_Breast_Cancer/CytAssist_FFPE_Human_Breast_Cancer_web_summary.html. The normal and Alzheimer’s disease mouse hippocampus data can be accessed at https://singlecell.broadinstitute.org/single_cell/study/SCP815 and https://singlecell.broadinstitute.org/single_cell/study/SCP1663, respectively. The Slide-seq 3D mouse hippocampus slice can be accessed at https://singlecell.broadinstitute.org/single_cell/study/SCP354/slide-seq-study. The annotation images from the Allen Mouse Brain Atlas can be accessed at https://mouse.brain-map.org/static/atlas (Supplementary Table 2).
Code availability
An open-source Python implementation of the STAligner package is available at https://github.com/zhanglabtools/STAligner and https://doi.org/10.5281/zenodo.8315415 (ref. 56).
References
Rao, A., Barkley, D., França, G. S. & Yanai, I. Exploring tissue architecture using spatial transcriptomics. Nature 596, 211–220 (2021).
Ståhl, P. L. et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 353, 78–82 (2016).
Moses, L. & Pachter, L. Museum of spatial transcriptomics. Nat. Methods 19, 534–546 (2022).
Rodriques, S. G. et al. Slide-seq: a scalable technology for measuring genome-wide expression at high spatial resolution. Science 363, 1463–1467 (2019).
Stickels, R. R. et al. Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2. Nat. Biotechnol. 39, 313–319 (2021).
Chen, A. et al. Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays. Cell 185, 1777–1792.e1721 (2022).
Maynard, K. R. et al. Transcriptome-scale spatial gene expression in the human dorsolateral prefrontal cortex. Nat. Neurosci. 24, 425–436 (2021).
Cable, D. M. et al. Cell type-specific inference of differential expression in spatial transcriptomics. Nat. Methods 19, 1076–1087 (2022).
Fu, H. et al. Unsupervised spatially embedded deep representation of spatial transcriptomics. Preprint at bioRxiv https://doi.org/10.1101/2021.06.15.448542 (2021).
Hu, J. et al. SpaGCN: integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nat. Methods 18, 1342–1351 (2021).
Zhao, E. et al. Spatial transcriptomics at subspot resolution with BayesSpace. Nat. Biotechnol. 39, 1375–1384 (2021).
Dong, K. & Zhang, S. Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder. Nat. Commun. 13, 1739 (2022).
Sun, S., Zhu, J. & Zhou, X. Statistical analysis of spatial expression patterns for spatially resolved transcriptomic studies. Nat. Methods 17, 193–200 (2020).
Zhang, C., Dong, K., Aihara, K., Chen, L. & Zhang, S. STAMarker: determining spatial domain-specific variable genes with saliency maps in deep learning. Preprint at bioRxiv https://www.biorxiv.org/content/10.1101/2022.11.07.515535v1 (2022).
Shao, X. et al. Knowledge-graph-based cell-cell communication inference for spatially resolved transcriptomic data with SpaTalk. Nat. Commun. 13, 4429 (2022).
Chen, S. et al. Spatially resolved transcriptomics reveals unique gene signatures associated with human temporal cortical architecture and Alzheimer’s pathology. Acta Neuropathol. Commun. 10, 188 (2022).
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e1821 (2019).
Zeira, R., Land, M., Strzalkowski, A. & Raphael, B. Alignment and integration of spatial transcriptomics data. Nat. Methods 19, 567–575 (2022).
Villacampa, E. G. et al. Genome-wide spatial expression profiling in formalin-fixed tissues. Cell Genomics 1, 100065 (2021).
Yao, Z. et al. A taxonomy of transcriptomic cell types across the isocortex and hippocampal formation. Cell 184, 3222–3241.e3226 (2021).
Kadowaki, K. et al. Phosphohippolin expression in the rat central nervous system. Mol. Brain Res. 125, 105–112 (2004).
Zacharias, D. A. & Kappen, C. Developmental expression of the four plasma membrane calcium ATPase (Pmca) genes in the mouse. Biochim. Biophys. Acta Gen. Subj. 1428, 397–405 (1999).
Mamoor, S. The α1 subunit of the γ-aminobutyric acid receptor, Gabra1, is differentially expressed in the brains of patients with schizophrenia. Preprint at OSF https://doi.org/10.31219/osf.io/m93ya (2020).
Richardson, L. et al. EMAGE mouse embryo spatial gene expression database: 2014 update. Nucleic Acids Res. 42, D835–D844 (2014).
Savolainen, S. M., Foley, J. F. & Elmore, S. A. Histology atlas of the developing mouse heart with emphasis on E11.5 to E18.5. Toxicol. Pathol. 37, 395–414 (2009).
Haddad-Tóvolli, R., Szabó, N.-E., Zhou, X. & Alvarez-Bolado, G. Genetic manipulation of the mouse developing hypothalamus through in utero electroporation. J. Vis. Exp. 77, e50412 (2013).
Díaz-Guerra, E., Pignatelli, J., Nieto-Estévez, V. & Vicario-Abejón, C. Transcriptional regulation of olfactory bulb neurogenesis. Anat. Rec. 296, 1364–1382 (2013).
Chen, V. S. et al. Histology atlas of the developing prenatal and postnatal mouse central nervous system, with emphasis on prenatal days E7.5 to E18.5. Toxicol. Pathol. 45, 705–744 (2017).
Xu, Y. et al. A single-cell transcriptome atlas profiles early organogenesis in human embryos. Nat. Cell Biol. 25, 604–615 (2023).
Siddiqui, T. J. et al. An LRRTM4–HSPG complex mediates excitatory synapse development on dentate gyrus granule cells. Neuron 79, 680–695 (2013).
Elmore, M. R. et al. Colony-stimulating factor 1 receptor signaling is necessary for microglia viability, unmasking a microglia progenitor cell in the adult brain. Neuron 82, 380–397 (2014).
Zhang, B. et al. Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer’s disease. Cell 153, 707–720 (2013).
Zhao, Y. et al. TREM2 is a receptor for β-amyloid that mediates microglial function. Neuron 97, 1023–1031.e1027 (2018).
Hong, S. et al. Complement and microglia mediate early synapse loss in Alzheimer mouse models. Science 352, 712–716 (2016).
Salceda, S. et al. The immunomodulatory protein B7-H4 is overexpressed in breast and ovarian cancers and promotes epithelial cell transformation. Exp. Cell Res. 306, 128–141 (2005).
Zheng, Z.-Y. et al. Wild-type N-Ras, overexpressed in basal-like breast cancer, promotes tumor formation by inducing IL-8 secretion via JAK2 activation. Cell Rep. 12, 511–524 (2015).
Schelker, M. et al. Estimation of immune cell content in tumour tissue using single-cell RNA-seq data. Nat. Commun. 8, 2032 (2017).
Zeisel, A. et al. Molecular architecture of the mouse nervous system. Cell 174, 999–1014.e1022 (2018).
Tzingounis, A. V., Kobayashi, M., Takamatsu, K. & Nicoll, R. A. Hippocalcin gates the calcium activation of the slow afterhyperpolarization in hippocampal pyramidal cells. Neuron 53, 487–493 (2007).
Yan, C., Costa, R., Darnell, J. E. Jr, Chen, J. & Van Dyke, T. Distinct positive and negative elements control the limited hepatocyte and choroid plexus expression of transthyretin in transgenic mice. EMBO J. 9, 869–878 (1990).
Strange, B. A., Witter, M. P., Lein, E. S. & Moser, E. I. Functional organization of the hippocampal longitudinal axis. Nat. Rev. Neurosci. 15, 655–669 (2014).
Zeidman, P. & Maguire, E. A. Anterior hippocampus: the anatomy of perception, imagination and episodic memory. Nat. Rev. Neurosci. 17, 173–182 (2016).
Rood, J. E. et al. Toward a common coordinate framework for the human body. Cell 179, 1455–1467 (2019).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
Salehi, A. & Davulcu, H. Graph attention auto-encoders. Preprint at https://arxiv.org/abs/1905.10715 (2019).
Simon, L. M., Wang, Y.-Y. & Zhao, Z. Integration of millions of transcriptomes using batch-aware triplet neural networks. Nat. Mach. Intell. 3, 705–715 (2021).
Dong, Z. et al. Registration of large-scale terrestrial laser scanner point clouds: a review and benchmark. ISPRS J. Photogramm. Remote Sens. 163, 327–342 (2020).
Umeyama, S. Least-squares estimation of transformation parameters between two point patterns. IEEE Trans. Pattern Anal. Mach. Intell. 13, 376–380 (1991).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
Clevert, D.-A., Unterthiner, T. & Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). Preprint at https://arxiv.org/abs/1511.07289 (2015).
Fey, M. & Lenssen, J. E. Fast graph representation learning with PyTorch Geometric. Preprint at https://arxiv.org/abs/1903.02428 (2019).
Scrucca, L., Fop, M., Murphy, T. B. & Raftery, A. Mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. R Journal 8, 205–233 (2016).
Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics 16, 284–287 (2012).
Tran, H. T. N. et al. A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Genome Biol. 21, 12 (2020).
Zhou, X., Dong, K. & Zhang, S. Integrating spatial transcriptomics data across different conditions, technologies, and developmental stages. Zenodo https://doi.org/10.5281/zenodo.8315415 (2023).
Acknowledgements
This work has been supported by the National Key Research and Development Program of China (no. 2019YFA0709501 to S.Z.), the Strategic Priority Research Program of the Chinese Academy of Sciences (no. XDPB17), the National Natural Science Foundation of China (no. 12126605), the Key-Area Research and Development of Guangdong Province (no. 2020B1111190001) and the CAS Project for Young Scientists in Basic Research (no. YSBR-034 to S.Z.).
Author information
Contributions
S.Z. conceived and supervised the project. X.Z. developed and implemented the STAligner algorithm. X.Z., K.D. and S.Z. validated the methods and wrote the paper. All authors read and approved the final paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Qin Ma and Juexin Wang for their contribution to the peer review of this work. Primary Handling Editor: Fernando Chirigati, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Notes, Figs. 1–15, and Tables 1 and 2.
Source data
Source Data Fig. 2
Raw numerical data and cell labels behind the UMAP plots, all of the data points behind box plots.
Source Data Fig. 3
Raw numerical data and cell labels behind the UMAP plots.
Source Data Fig. 5
Raw numerical data and cell labels behind the UMAP plots.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhou, X., Dong, K. & Zhang, S. Integrating spatial transcriptomics data across different conditions, technologies and developmental stages. Nat Comput Sci 3, 894–906 (2023). https://doi.org/10.1038/s43588-023-00528-w
DOI: https://doi.org/10.1038/s43588-023-00528-w