Abstract
Spatially resolved genomic technologies have allowed us to study the physical organization of cells and tissues, and promise an understanding of local interactions between cells. However, it remains difficult to precisely align spatial observations across slices, samples, scales, individuals and technologies. Here, we propose a probabilistic model that aligns spatially resolved samples onto a known or unknown common coordinate system (CCS) with respect to phenotypic readouts (for example, gene expression). Our method, Gaussian Process Spatial Alignment (GPSA), consists of a two-layer Gaussian process: the first layer maps observed samples’ spatial locations onto a CCS, and the second layer maps from the CCS to the observed readouts. Our approach enables complex downstream spatially aware analyses that are impossible or inaccurate with unaligned data, including an analysis of variance, creation of a dense three-dimensional (3D) atlas from sparse two-dimensional (2D) slices or association tests across data modalities.
Main
Spatially resolved genomic technologies hold the promise to understand the spatial organization, variation and local effects of cellular morphology, gene expression, protein expression and other cellular phenotypes^{1,2,3,4,5,6,7,8,9,10}. As new technologies have been developed, several computational models and analysis pipelines have been proposed for processing and downstream analyses of single-slice data^{11,12,13,14,15,16,17}.
Although these technologies and methods have enabled scientific discoveries, it remains difficult to jointly analyze multiple phenotypic readouts from these technologies due to inevitable spatial warping and biological variation across slices, samples and individuals. Furthermore, the various spatial genomic platforms range widely in field of view, spatial resolution and number of phenotypic readouts that they measure. The standard analysis, in which each slice is analyzed separately, reduces the statistical power of the analyses or prohibits these analyses entirely. Thus, there remains a need for tools that enable a joint analysis across slices, samples, modalities and technologies.
The problem of integrating disparate spatially resolved samples arises in several fields. Spatial alignment has been well studied in the context of functional magnetic resonance imaging (fMRI) brain data^{18,19,20}. An fMRI scan produces measurements on a 3D grid across the brain over time, where the continuous level of blood flow is measured at each point (‘voxel’) in the grid. Given multiple scans across days or individuals, the alignment problem is to warp the spatial coordinates of each voxel in a scan so that the (x, y, z) voxels in each patient refer to approximately the same functional voxel in the brain.
Two major types of fMRI alignment have emerged: template-based registration and hyperalignment. Template-based registration methods seek to align scans from different individuals to a predefined CCS. This CCS is typically defined as a single individual’s scan or as the average across multiple manually aligned scans. The most popular approach uses a ‘template brain’ developed at the Montreal Neurological Institute^{21,22}. Next, fMRI samples’ voxel coordinates are warped such that the new samples’ voxel coordinates match this template in terms of both relative location in the brain and voxel behavior across time. Hyperalignment approaches seek to align different individuals’ data without a predefined template. In particular, hyperalignment methods compute scan-specific transformations of voxel space using the centroid of all scans as the CCS. Both linear^{23} and nonlinear^{24} hyperalignment approaches have been developed.
Alignment methods for fMRI data are not easily extensible to spatial genomic and histology images for three reasons. First, curated anatomical CCSs are not available for the diversity of tissue types, developmental stages and species that are studied using spatial genomics. Second, while the readout at each location for fMRI data is a single number representing blood flow, the readout in spatial genomics often has 10^{2}–10^{5} sparse features. Finally, while the spatial resolution of fMRI scans tends to be one of a few standard resolutions, there is a wide diversity of spatial technologies, each with their own resolution and field of view. Thus, there is a need for spatial genomic-specific alignment methods.
In spatial genomics, we are aware of four approaches for aligning samples’ spatial coordinates. Probabilistic alignment of spatial transcriptomic (ST) experiments (PASTE)^{25} was developed to align adjacent tissue slices in ST data^{1}. PASTE uses an optimal transport framework to identify mappings between the spatial locations of adjacent slices. Its objective function trades off transcriptional similarity and proximity of spatial locations. While PASTE is robust and fast, it is limited to linear alignments, which are often insufficiently expressive for complex distortions of data. While alignment is not the focus of Splotch^{26}, the method uses a linear autocorrelation model to shift ST slices to align specific tissue regions.
Two landmark-based approaches to spatial alignment have also been proposed. Effortless generic Gaussian process (GP) landmark transfer (Eggplant) was developed using GP regression^{27}. Eggplant projects gene expression values of each misaligned slice onto a given CCS. Eggplant requires the user to identify a set of landmarks on each misaligned sample and the template sample. Eggplant performs this template transfer independently for each slice and each gene, ignoring any correlation between them. ST imaging framework (STIM) borrows techniques from computer vision to register ST data into a CCS^{28}. For both Eggplant and STIM, the identification of shared landmark locations may be difficult across slices from tissues without canonical structures, such as tumors.
In this work, we present a probabilistic model that aligns the coordinates of spatial genomic samples across tissue slices, individuals and data modalities. We apply our model to ST data and, in one experiment, paired histology data, at subcellular, cellular and supercellular resolutions from different platforms. Given a set of unaligned slices, our approach iteratively estimates a robust CCS (shared spatial locations capturing the full breadth of the slices) and maps the local coordinates from each slice onto the CCS. Our model, which leverages similarities in both spatial structure and phenotypic readouts between slices, enables the creation of a CCS onto which heterogeneous slices may be mapped and then analyzed jointly with respect to the shared CCS. The automated creation of a CCS is itself a contribution; few CCSs exist because of the challenges in creating them^{29,30}.
Our proposed generative model uses two stacked GPs to align spatial slices across samples and technologies in a 2D, 3D or potentially four-dimensional spatiotemporal coordinate system. Given a location in a slice, the first layer maps this location to the corresponding location in the CCS. The second layer generates the distribution of phenotypic readouts at that location (for example, the distribution of gene expression values). Together, the first layer representing a CCS and the second layer representing a map from each location in the CCS to estimates of phenotypic readouts represent an atlas. Our approach opens the door to de novo creation of large tissue atlases using collections of tissue samples. Our model allows for straightforward downstream analyses on the aligned slices, including imputation of sparse measurements, analysis of variation and joint mapping of slices with distinct modalities from different technologies.
Results
GPSA
Our GPSA is a Bayesian model for aligning spatial genomic and histology samples with spatial coordinates that are distorted or on different systems. Each slice is assigned its own warping function in the first layer of GPSA, which accounts for slice-specific deformations. In the second layer, GP functions model phenotypic readouts at each location in the CCS. Inference is guided by two competing objectives: retain the current position of each spot in a slice while warping each spot to ensure that the readouts within each warped slice match as closely as possible with the readout distributions encoded in the second layer of the deep GP (DGP). See Methods for details.
Although readouts are typically high dimensional, the readout features tend to be correlated, and this structure may be captured by a low-dimensional manifold^{31,32,33,34}. Thus, GPSA models the readouts as a weighted linear combination of a small number of GPs through a linear model of coregionalization (LMC^{35}; see Methods for details).
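As a rough illustration of the LMC idea, the sketch below draws a small number of latent GP functions over a set of 2D locations and mixes them with a weight matrix to obtain p correlated readout features. The RBF kernel, the sizes, the noise level and all names (`rbf_kernel`, `sample_lmc`, `W`) are assumptions made for illustration and are not taken from the GPSA implementation.

```python
import numpy as np

def rbf_kernel(X1, X2, variance=1.0, lengthscale=1.0):
    """Isotropic RBF kernel between two sets of locations (rows are points)."""
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def sample_lmc(X, n_latent, n_features, noise_var=0.01, seed=0):
    """Draw readouts Y = F W + noise, where the columns of F are independent GP draws."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    K = rbf_kernel(X, X) + 1e-6 * np.eye(n)            # jitter for numerical stability
    chol = np.linalg.cholesky(K)
    F = chol @ rng.standard_normal((n, n_latent))      # n x L latent GP functions
    W = rng.standard_normal((n_latent, n_features))    # L x p mixing weights (assumed Gaussian)
    noise = np.sqrt(noise_var) * rng.standard_normal((n, n_features))
    return F @ W + noise, F, W

# 50 hypothetical spots, 3 latent GPs, 20 readout features
X = np.random.default_rng(1).uniform(0, 10, size=(50, 2))
Y, F, W = sample_lmc(X, n_latent=3, n_features=20)
```

Because the noiseless readouts F W have rank at most the number of latent GPs, the 20 features share a 3-dimensional spatial structure, which is the kind of low-dimensional correlation the LMC is meant to capture.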
GPSA allows for joint modeling of multiple types of readout modalities. For example, many experiments collect both spatial expression profiles and histology images for each slice^{1,36}. These modalities contain complementary information, and it is of interest to analyze both modalities across multiple slices jointly. To do this, we augment our model of the phenotypic readouts (Equation (1)) to include a separate likelihood for each modality, allowing for straightforward multimodal alignment. See Methods for details.
De novo and template-based CCSs
We propose two methods for aligning slices using GPSA: de novo alignment and template-based alignment. A de novo alignment estimates a CCS from scratch using the slices while simultaneously projecting these slices onto the CCS. Alternatively, if a CCS exists for a tissue and context of interest, a template-based alignment maps the samples to this given CCS. This is accomplished by fixing the warping function of the CCS to the identity. In practice, to avoid extreme warps in de novo alignment, we recommend arbitrarily choosing one of the input samples to fix as the CCS.
Simulations
We first validate the accuracy and robustness of our model using synthetic data generated under a variety of settings.
Recovery of true latent common coordinates
First, we generated synthetic spatial expression data for two slices from a known CCS, and we began with a one-dimensional CCS to study and visualize the behavior of GPSA. We sampled spatial coordinates for n = 100 locations in the interval (0, 10). We then generated observed spatial coordinates for S = 2 slices by applying a GP warp (see Methods for details). We sampled synthetic gene expression y_{ij} for gene j using a GP:
\[{y}_{ij}={f}_{j}({x}_{i}^{\star })+{\epsilon }_{ij},\quad {f}_{j} \sim {\mathcal{GP}}(0,k),\quad {\epsilon }_{ij} \sim {\mathcal{N}}(0,{\sigma }^{2}),\qquad (1)\]
where \({x}_{i}^{\star }\) is the location of the ith sample, 𝜖_{ij} is the local Gaussian error, f_{j} is a random nonlinear function generated from a GP with covariance function k, and σ^{2} is the variance term for the Gaussian error. We set k to be the radial basis function (RBF) with hyperparameter τ^{2} = 0.1 (Methods). We fit GPSA to this dataset using a de novo alignment and extracted the aligned coordinates for each slice. We found that the warped coordinates were well aligned between the two slices and that the relationship between spatial coordinates and gene expression was well preserved (Fig. 1a). The mean squared error (MSE) for the aligned coordinates was 0.000134 (where an MSE of 0 indicates perfect performance), while the MSE of the original spatial coordinates was 0.0345. This result suggests that GPSA is able to align distorted and disparate samples accurately.
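A minimal sketch of how such a one-dimensional synthetic dataset could be generated is given below. It assumes that the GP warp adds a smooth, zero-mean GP displacement to the true locations and that the noise variance is 0.01; both choices, along with the helper names `rbf_kernel` and `sample_gp`, are illustrative assumptions rather than the exact procedure described in Methods.

```python
import numpy as np

def rbf_kernel(x1, x2, variance=1.0, lengthscale=1.0):
    """RBF kernel for one-dimensional inputs."""
    sq_dists = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def sample_gp(x, rng, variance=1.0, lengthscale=1.0, jitter=1e-6):
    """Draw one zero-mean GP function evaluated at the points x."""
    K = rbf_kernel(x, x, variance, lengthscale) + jitter * np.eye(len(x))
    return np.linalg.cholesky(K) @ rng.standard_normal(len(x))

rng = np.random.default_rng(0)
n, n_slices, noise_var = 100, 2, 0.01           # noise_var is an assumed value

x_star = np.sort(rng.uniform(0, 10, size=n))    # true CCS locations
f = sample_gp(x_star, rng)                      # shared expression function for one gene

slices = []
for s in range(n_slices):
    # assumed warp: observed location = true location + smooth GP displacement
    x_obs = x_star + sample_gp(x_star, rng, variance=0.1)
    y = f + np.sqrt(noise_var) * rng.standard_normal(n)
    slices.append((x_obs, y))

# coordinate mismatch between the two slices before any alignment
mse_before = np.mean((slices[0][0] - slices[1][0]) ** 2)
```

An alignment method would then be judged by how far it reduces this mismatch while preserving the relationship between location and expression.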
Next, we extended this experiment to a more realistic setting in which the spatial coordinates are 2D. Here, the CCS was a 15 × 15 grid containing n = 225 spatial locations. The observed spatial coordinates for the first slice were kept at the original grid, and the observed coordinates for the second slice were generated by randomly warping the CCS with a GP warp (Methods) using an isotropic RBF with length scale ℓ = 10 and spatial variance τ^{2} = 0.5. We sampled synthetic expression values from a GP with the true spatial coordinates as inputs (Equation 1). We fit the model twice with this dataset: once using a de novo alignment and once using a template-based alignment with the first slice as the template. For both alignments, the warped coordinates were aligned with minimal distortion using the latent CCS (Fig. 1b). The MSEs for the de novo and template-based aligned coordinates were 0.000537 and 0.00725, respectively, while the MSE of the original spatial coordinates was 0.733. These results indicate that GPSA is a viable model for aligning distorted spatially resolved slices using both de novo and template-based alignment.
Robustness of GPSA to observation noise
We next tested GPSA’s robustness to the number of readout features, the magnitude of distortion between the slices relative to the CCS and the noise variance in the readouts. To do so, we generated synthetic datasets with 2D spatial coordinates using the same approach as before. We varied the number of readout features p ∈ {1, 20, 50}. To vary the magnitude of distortion between slices, we varied the spatial variance τ^{2} of the RBF kernel of the GP warp (Methods; Equation 10), where a larger τ^{2} corresponds to a larger distortion between slices. We fit GPSA to each of these datasets using a template-based alignment with the first slice as the template, repeating the experiment five times for each condition.
For comparison, we ran PASTE^{25} and extracted the aligned coordinates. Every pair of simulated slices contains spots at identical locations. We measured the error between the warped locations for each of these spots between every pair of slices: \(\frac{1}{Sn}{\sum }_{s < {s}^{\prime }}\mathop{\sum }\nolimits_{i = 1}^{n}\parallel {\widetilde{{\bf{x}}}}_{i}^{s}-{\widetilde{{\bf{x}}}}_{i}^{{s}^{\prime }}{\parallel }_{2}^{2},\) where \({\widetilde{{\bf{x}}}}_{i}^{s}\) denotes the aligned location of spot i in slice s. We interpret a lower error to indicate superior performance in these experiments.
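The pairwise error above can be computed directly from the aligned coordinates. The sketch below assumes the aligned locations are stored in a single array of shape (S, n, D) in which spot i refers to the same true location in every slice; the function name `pairwise_alignment_error` is hypothetical.

```python
import numpy as np
from itertools import combinations

def pairwise_alignment_error(aligned):
    """Sum of squared distances between matching spots over all slice pairs, divided by S * n.

    aligned: array of shape (S, n, D); spot i is the same true location in every slice.
    """
    aligned = np.asarray(aligned, dtype=float)
    S, n, _ = aligned.shape
    total = 0.0
    for s, s_prime in combinations(range(S), 2):   # all pairs with s < s'
        diff = aligned[s] - aligned[s_prime]
        total += np.sum(diff ** 2)
    return total / (S * n)

# two slices, two spots: only the first spot differs, by one unit along x
example = [[[0.0, 0.0], [1.0, 1.0]],
           [[1.0, 0.0], [1.0, 1.0]]]
error = pairwise_alignment_error(example)   # 1 / (2 * 2) = 0.25
```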
In this simulation, GPSA’s alignment error decreased with more readout features and increased with greater distortion (Fig. 1c–e). GPSA achieved a substantially lower error than PASTE in all settings. This difference in error is largely due to the fact that PASTE applies a linear transformation and is unable to account for local nonlinear distortions. Furthermore, GPSA also outperforms PASTE with much larger numbers of spatial locations (Supplementary Fig. 2). These results imply that our model is robust to nonlinear warpings, distortions of different magnitudes and differences in the number of readout features.
Assessing alignment via readout prediction
Our experiments thus far have tested whether GPSA can align the spatial coordinates of distorted samples. However, we expect that similar expression patterns across aligned slices should colocalize within the estimated CCS. To test this, we attempted to predict held-out readout values using the posterior estimates of expression values localized within the CCS. In particular, we repeated the 2D experiment in the previous section but held out 20% of the readout values from one of the slices. We then fit template-based GPSA and sampled the predicted expression values from the variational posterior predictive distribution at each CCS location:
where g captures the CCS and f captures the noiseless distribution of the readouts at each location of the CCS. Because the posterior mean is not available in closed form, we approximate the posterior mean across T = 10 samples, \({\widehat{\rm{f}}}_{ij}=\frac{1}{T}\mathop{\sum }\nolimits_{t = 1}^{T}{\widehat{\rm{f}}}_{ij}^{{\,}(t)}\). We compute the MSE between the predictions and the true values. We compared GPSA to two baseline approaches: we fit a GP to each slice separately (‘separate’ GP), and we fit a GP to a concatenation of the slices (‘union’ GP). We found that GPSA achieves lower prediction error than the two baseline methods (Fig. 1f). This result suggests that GPSA estimates readout value distributions within a CCS that have excellent predictive capabilities.
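The Monte Carlo approximation of the posterior mean and the resulting prediction error can be sketched as follows. The sampler here is a toy stand-in (noisy draws around a known truth) rather than GPSA’s variational posterior, and the names `monte_carlo_mean`, `mse` and `draw` are hypothetical.

```python
import numpy as np

def monte_carlo_mean(draw_sample, n_samples=10):
    """Average n_samples draws; draw_sample(t) returns an array of predicted readouts."""
    draws = np.stack([draw_sample(t) for t in range(n_samples)])
    return draws.mean(axis=0)

def mse(pred, truth):
    """Mean squared error between predicted and true readouts."""
    return float(np.mean((np.asarray(pred) - np.asarray(truth)) ** 2))

# toy stand-in for a fitted model: each draw is the truth plus independent noise
rng = np.random.default_rng(0)
truth = rng.standard_normal((30, 5))               # 30 held-out spots, 5 features
draw = lambda t: truth + 0.1 * rng.standard_normal(truth.shape)

f_hat = monte_carlo_mean(draw, n_samples=10)       # T = 10, as in the text
prediction_error = mse(f_hat, truth)
```

Averaging over T draws reduces the Monte Carlo variance of the estimate by roughly a factor of T, which is why a modest T can suffice when individual draws are not too noisy.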
Estimating CCSs for ST
Having validated GPSA as a viable model for robust spatial alignment of high-dimensional observations, we next applied GPSA to spatially resolved genomics data. Below, we present analyses of data collected from three technologies: ST^{1}, the Visium platform^{37} and Slide-seqV2 (ref. ^{3}). We also performed analyses with images of hematoxylin and eosin (H&E) stains jointly with the spatial genomics data (Supplementary Table 1).
For all datasets, we removed mitochondrial genes and spatial locations with low counts, normalized readouts at each spatial location by the total number of counts at that location and log transformed, centered and standardized gene counts. We further filtered the data to include genes with spatial variability (see Methods for details). The spatial locations for each slice were normalized such that both coordinates were in the interval (0, 10), as all slices were produced with approximately the same field of view.
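The preprocessing steps above could be sketched as follows. The mitochondrial-gene prefix (`mt-`), the low-count threshold, the use of the median total count as a scale factor before the log transform and the per-axis rescaling of coordinates are all assumptions made for illustration; the filter for spatially variable genes is omitted.

```python
import numpy as np

def preprocess_counts(counts, gene_names, min_total=1.0):
    """counts: (n_spots, n_genes) raw counts. Returns standardized log-normalized readouts."""
    counts = np.asarray(counts, dtype=float)
    gene_names = np.asarray(gene_names, dtype=str)
    # drop mitochondrial genes (assumed to carry an 'mt-' / 'MT-' prefix)
    is_mito = np.char.startswith(np.char.lower(gene_names), "mt-")
    counts = counts[:, ~is_mito]
    # drop spots with low total counts
    totals = counts.sum(axis=1)
    keep_spots = totals >= min_total
    counts, totals = counts[keep_spots], totals[keep_spots]
    # normalize by per-spot totals (rescaled by the median total, an assumption), then log
    logged = np.log1p(np.median(totals) * counts / totals[:, None])
    # center and standardize each gene
    centered = logged - logged.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0                      # leave constant genes at zero
    return centered / std, keep_spots, gene_names[~is_mito]

def rescale_coords(X, upper=10.0):
    """Min-max rescale each spatial axis to the interval [0, upper]."""
    X = np.asarray(X, dtype=float)
    mins, maxs = X.min(axis=0), X.max(axis=0)
    return upper * (X - mins) / (maxs - mins)
```

The returned spot mask should also be applied to the spatial coordinates so that readouts and locations stay matched.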
Aligning ST profiles of breast cancer samples
We tested GPSA on an ST^{1} dataset made up of four slices of a breast cancer tumor (Supplementary Fig. 4).
We first validated GPSA on the ST data by perturbing the samples with an artificial warp and examining whether the CCS estimated by GPSA approximately removed the perturbation. For this experiment, we analyzed each of the four samples separately. We applied a synthetic GP warp (Methods) with τ^{2} = 0.5 and ℓ^{2} = 10 to each of the slices and ran de novo GPSA on these misaligned samples. For comparison, we ran PASTE and visualized the aligned coordinates for each method.
GPSA was able to recover the CCS (Fig. 2). Moreover, GPSA corrected the local distortions in the spatial coordinates. By contrast, PASTE’s global correction did not correct these distortions. To quantify the alignments, we computed the MSE between the aligned coordinates and the true coordinates for three types of synthetic warps: GP, linear and polar warps. We ran ten repetitions of each experiment. GPSA outperformed PASTE under the GP and polar warps and performed roughly the same as PASTE under linear warps (Fig. 2). This result suggests that GPSA robustly corrects local distortions on spatial coordinates.
Estimating expression variability across spatial locations
We next asked whether we could estimate the variability of gene expression at each spatial location. An accurate estimate of the variance would allow for useful analyses, such as understanding the spatial heterogeneity of gene expression in a particular tissue region and analyzing changes in expression along the z axis. Better alignment of the slices will lead to more precise quantification of variance.
We used the ST breast cancer data^{1} to quantify expression variation across slices. Here, we aligned all four slices using a 2D template-based alignment, using the second slice as the template. We then transferred each slice’s gene expression onto the template slice’s spatial coordinates by assigning each point in the template slice the average of its nearest neighbors in the corresponding aligned slice. Using these four slices within the CCS, we then computed the variance for each gene (Fig. 2e), representing a combination of experimental, biological and z-axis variability.
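The transfer and variance steps can be sketched as below. The number of neighbors k is not stated in the text and is an assumed parameter here, as are the function names; distances are computed by brute force, which is adequate for slices with a few hundred spots.

```python
import numpy as np

def transfer_to_template(template_xy, aligned_xy, aligned_expr, k=3):
    """For each template spot, average the expression of its k nearest aligned spots."""
    template_xy = np.asarray(template_xy, dtype=float)
    aligned_xy = np.asarray(aligned_xy, dtype=float)
    aligned_expr = np.asarray(aligned_expr, dtype=float)
    sq_dists = ((template_xy[:, None, :] - aligned_xy[None, :, :]) ** 2).sum(axis=-1)
    nn_idx = np.argsort(sq_dists, axis=1)[:, :k]      # (n_template, k)
    return aligned_expr[nn_idx].mean(axis=1)          # (n_template, n_genes)

def per_spot_variance(expr_stack):
    """expr_stack: (S, n_template, n_genes). Sample variance across slices per spot and gene."""
    return np.var(np.asarray(expr_stack, dtype=float), axis=0, ddof=1)

# tiny example: two template spots, four aligned spots, one gene
template = [[0.0, 0.0], [10.0, 10.0]]
aligned = [[0.1, 0.0], [0.0, 0.2], [9.9, 10.0], [10.0, 10.1]]
expr = [[1.0], [3.0], [5.0], [7.0]]
transferred = transfer_to_template(template, aligned, expr, k=2)   # [[2.0], [6.0]]
```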
We found substantial variability in expression for several genes (Fig. 2f) that are related to tumor progression, including PRSS23 (ref. ^{38}) and CST4 (ref. ^{39}). To further investigate potential biological implications of this variation, we performed a gene set enrichment analysis, testing genes’ spatial variance (Fig. 2g) for overrepresentation or underrepresentation of specific gene modules. We found that ‘MYC targets’ and ‘upward KRAS signaling’, known to be associated with expression profiles in patients with breast cancer^{40,41,42,43}, were enriched in genes with high estimated spatial variance. Thus, variability of expression across aligned slices at single locations highlights biologically informative markers.
Aligning samples in 3D space to create an atlas
Our analyses of the ST data thus far have ignored the 3D nature of the contiguous slices. Thus, we asked whether we could infer the third dimension (the z axis) to create a CCS plus localized expression distributions for the 3D tumor, what we would call a ‘3D tumor atlas’^{30}.
To create a 3D tumor atlas, we fit GPSA on the four ST breast cancer slices, but we set the number of spatial dimensions to D = 3. We initialized the four slices’ z-axis coordinates as (0, 1, 2, 3) and allowed the model to warp these coordinates. Importantly, we used the same covariance function parameters for the warping GP across all spatial dimensions, which allows the alignment along the z axis to be informed by inferred spatial relationships along the x and y axes.
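The initialization of the third coordinate can be sketched in a few lines: each 2D slice receives a constant z value equal to its position in the stack. The function name `add_z_axis` and the random placeholder coordinates are hypothetical.

```python
import numpy as np

def add_z_axis(slices_xy, z_init=None):
    """Append a constant z column to each (n_s, 2) coordinate array."""
    if z_init is None:
        z_init = range(len(slices_xy))       # 0, 1, 2, ... as in the text
    out = []
    for xy, z in zip(slices_xy, z_init):
        xy = np.asarray(xy, dtype=float)
        out.append(np.column_stack([xy, np.full(len(xy), float(z))]))
    return out

# four hypothetical slices with different numbers of spots
rng = np.random.default_rng(0)
slices_2d = [rng.uniform(0, 10, size=(n, 2)) for n in (250, 260, 240, 255)]
slices_3d = add_z_axis(slices_2d)            # initial z values 0, 1, 2, 3
```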
To perform the alignment, we used a two-step procedure. In the first step, we performed a template-based alignment with the second slice as the fixed template. In the second step, we fixed aligned coordinates from warped slices 1, 3 and 4 as the template and fit GPSA again, warping the second slice’s coordinates. This process resulted in a 3D CCS for the tumor, where we have an estimate of gene expression at each location in the 3D CCS. The aligned z axis showed substantial adjustments from the original positions (Fig. 3); we hypothesize that GPSA’s second-layer GP identified regions of the tumor that were functionally similar in terms of gene expression, and thus GPSA’s first-layer GP warped those spatial locations to be near one another within the CCS.
We imputed a dense 3D model of gene expression within the estimated CCS (Supplementary Fig. 5). We found that expression of specific genes varied smoothly across this space, with substantial variation along the z axis (Fig. 3). These findings suggest that GPSA is a feasible model for creating 3D atlases using spatial genomic slices from sequential samples from a single tissue.
Aligning Visium profiles of the mouse cortex
Next, we applied GPSA to data collected using the Visium platform from 10x Genomics^{37}. These data (two adjacent slices) were collected from a cross-section of the sagittal–posterior region of the mouse brain. The slices contain measurements at 3,355 and 3,289 spatial locations. We again filtered the data, keeping spatially variable genes (Supplemental Methods 2.12) and leaving 135 genes. We fit template-based GPSA, designating the first sample as the template. In the original data, there was a small spatial mismatch in the cerebellar folds of the two slices (Fig. 4a,c). Examining the aligned coordinates, we found that GPSA was able to correct this distortion by adjusting the second slice downward to match the first slice (Fig. 4b,d).
To quantify the alignment, we fit GPSA using all of the spots from the first slice and 80% of the spots from the second slice, reserving the remaining 20% of the spots for testing. We then made predictions for expression levels at the held-out spots using two strategies: (1) naively stacking the two slices and making predictions using a GP (the union GP) and (2) using our estimated CCS and localized expression estimates to predict expression values. We computed R^{2} for the masked predictions, repeating this experiment five times for random train–test splits. Predictions using the aligned coordinates from GPSA outperformed those using the original coordinates (Fig. 4e).
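The evaluation protocol can be sketched as a random split of the second slice’s spots followed by an R^{2} computation on the held-out readouts. The sketch pools all genes into a single R^{2}, which is an assumption; the function names are hypothetical and the prediction step itself (the union GP or GPSA) is not shown.

```python
import numpy as np

def train_test_split_spots(n_spots, test_frac=0.2, seed=0):
    """Randomly split spot indices into train and test sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_spots)
    n_test = int(round(test_frac * n_spots))
    return perm[n_test:], perm[:n_test]

def r_squared(y_true, y_pred):
    """Coefficient of determination, pooled over all genes (assumed pooling)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

# split the 3,289 spots of the second Visium slice into 80% train and 20% test
train_idx, test_idx = train_test_split_spots(3289, test_frac=0.2, seed=0)
```

Repeating the split with different seeds gives the repeated random train–test splits described above.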
We next asked whether downstream analyses of these data could be strengthened following alignment with GPSA. To do this, we tested our ability to identify spatially correlated genes before and after alignment by computing Moran’s I score for each gene. We found that the scores were consistently higher following alignment with GPSA, and scores for several genes were statistically significant only after alignment (false discovery rate ≤ 0.1; Fig. 4f), indicating improved statistical power. Under the union alignment, we identified 2,644 (of 4,260) genes with spatial autocorrelation, while, under the GPSA alignment, we identified 2,945 genes with spatial autocorrelation (Benjamini–Hochberg (BH)-adjusted P < 0.1). Together, these findings imply that aligned coordinates strengthen downstream analyses of variation.
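For reference, Moran’s I and the BH adjustment can be sketched as follows. The binary k-nearest-neighbor spatial weights and the value of k are assumptions, since the weighting scheme is not specified here, and the computation of P values for each gene (for example, by permutation) is not shown.

```python
import numpy as np

def knn_weights(coords, k=4):
    """Binary spatial weights: w_ij = 1 if j is among the k nearest neighbors of i."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    sq_dists = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dists, np.inf)               # a spot is not its own neighbor
    idx = np.argsort(sq_dists, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], idx] = 1.0
    return W

def morans_i(values, W):
    """Moran's I = (n / sum(W)) * (z' W z) / (z' z), with z the centered values."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

def benjamini_hochberg(pvals):
    """BH-adjusted P values, returned in the original order."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]   # enforce monotonicity
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted
```

Genes would then be called spatially autocorrelated when their adjusted P value falls below the chosen threshold (0.1 in the analysis above).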
Aligning Slide-seqV2 profiles of the mouse hippocampus
We next leveraged a set of two tissue slices collected from the hippocampus region of two mice using Slide-seqV2^{3}. These samples are not immediately comparable due to major shifts in the field of view (Supplementary Fig. 7). Thus, as a preprocessing step, we first applied a coarse manual rotation and translation to put the samples approximately within the same field of view. We fit GPSA to these slices using a template-based alignment with the first slice as the template.
The aligned coordinates showed correspondence between the two slices for multiple major landmark regions. In particular, we found that the dentate gyrus and the CA1–CA3 pyramidal layer were well aligned (Fig. 5). Due to differences in the field of view and in the structure of the brains of two different mice, we did not expect to achieve a perfect one-to-one matching of the spatial coordinates. In particular, we observed that the choroid plexus was a prominent marker in the first slice but not in the second (Supplementary Fig. 8), and several other structures were not present in the second slice (Supplementary Fig. 9). The flexibility of GPSA allows for these distinctions. Moreover, a user could manually correct a deformation in the latent CCS if it were known to be incorrect (Discussion).
To further validate this CCS, we computed the distance between three landmark locations (two endpoints of the dentate gyrus and an edge of area CA3 (Supplementary Fig. 10)) in both slices before and after alignment. For all three landmarks, the distance between the slices decreased after alignment (Fig. 5i). This suggests that known structural landmarks are well aligned in GPSA even when hidden from the model.
Next, we fit GPSA while holding out a fraction of the spots and tested how well we could predict the expression values at the held-out spots before and after alignment. Again, GPSA outperformed a concatenation of slices in prediction, and this improved prediction was largely consistent across genes (Supplementary Fig. 11).
Multimodal alignment: incorporating histology images
We jointly aligned spatially resolved gene expression and histology images. The histology images contain measurements for three color channels and thus are low dimensional relative to gene expression values; however, these images often contain interpretable features of anatomy widely used by pathologists, leading to their availability alongside spatial gene expression profiling. We hypothesize that including histology images in the alignment procedure may produce better alignments and enhance the interpretability of alignments in downstream analyses, such as enabling straightforward CCS annotation.
To test this hypothesis, we used mouse brain data from the Visium platform. For each slice, a histology image is prealigned to a Visium slice. We again fit template-based GPSA using the S = 2 slices with the first slice as the template, but this time we also included the histology images as phenotypic readout features and spatial locations.
GPSA successfully aligned these multimodal samples (Fig. 6). In the original histology images, there was a slight misalignment in one of the cerebellar folds (Fig. 6c). After fitting GPSA, we observed that the alignment had been corrected (Fig. 6d). Examining gene expression in the corresponding region, we found that the darker histology region corresponded to higher levels of expression in the genes CAMK2A (BH-adjusted P value ≤ 1.0 × 10^{−5}) and MTCO1 (BH-adjusted P value ≤ 5.0 × 10^{−3}; Supplementary Fig. 12).
We computed a vector field showing the displacement of each spatial coordinate after the alignment. Substantial nonlinear warping was necessary to align the histology stains (Fig. 6e). These results suggest that GPSA may be used to align multimodal data including spatial gene expression and histology measurements, broadening its potential applications.
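A displacement field of this kind is simply the difference between aligned and original coordinates. The sketch below also includes one possible check for nonlinearity, namely the residual left after fitting the best least-squares affine map from original to aligned coordinates; this check is an illustrative assumption and is not an analysis reported here. All function names are hypothetical.

```python
import numpy as np

def displacement_field(original, aligned):
    """Per-spot displacement vectors and their magnitudes."""
    disp = np.asarray(aligned, dtype=float) - np.asarray(original, dtype=float)
    return disp, np.linalg.norm(disp, axis=1)

def affine_residual(original, aligned):
    """RMS residual after the best least-squares affine map original -> aligned.

    A residual near zero means the warp is (close to) affine; a large residual
    indicates a nonlinear component.
    """
    original = np.asarray(original, dtype=float)
    aligned = np.asarray(aligned, dtype=float)
    A = np.hstack([original, np.ones((len(original), 1))])   # add intercept column
    coef, *_ = np.linalg.lstsq(A, aligned, rcond=None)
    return float(np.sqrt(np.mean((A @ coef - aligned) ** 2)))
```

The displacement vectors can be drawn as arrows at the original locations to produce a plot similar in spirit to a vector field figure.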
Discussion
We have presented GPSA, a Bayesian two-layer GP model for aligning multiple spatial genomic and histology slices into a known or an unknown CCS. We have shown that our model can flexibly align samples from multiple spatial sequencing technologies, fields of view and data modalities. Current approaches such as PASTE^{25} and Splotch^{26} rely on linear transformations of spatial coordinates. We showed the necessity of allowing nonlinear warpings. We imagine a two-stage strategy for building tissue atlases: (1) running PASTE to find a coarse alignment and (2) running GPSA to tune the coarse alignment and produce a CCS with localized measurement distributions across the space. Given GPSA’s flexible assumptions, our model is applicable to many other spatial sequencing technologies (both present and future) with varying levels of resolution, fields of view and profiling.
Several future directions could be pursued. Although GPSA finds more accurate alignments than competing approaches, this accuracy comes at the cost of computation time. Our variational inducing point inference reduces the time complexity from O(n^{3}) to O(nm^{2}), where n and m are the number of readouts and inducing points, respectively. However, a large number of inducing points are often required for high-resolution technologies. Nonetheless, we find that the time required to fit GPSA to the data presented here is not prohibitive (Supplementary Fig. 13). Moreover, researchers only need to align their samples once with GPSA. Including known anatomical landmarks could speed up the alignment and lead to more biologically interpretable coordinate systems. To incorporate prior anatomical knowledge, structural landmarks could easily be included in the GPSA framework by fixing the annotated landmark locations in the CCS. However, an attractive feature of GPSA is its reliance on almost no prior knowledge about the structure of the tissue. Future work may also use landmarks inferred from the histology images to annotate the CCS in an automated manner. Finally, there remains an opportunity for a deeper theoretical study of GPSA and DGPs in general. Studying the posterior consistency of the kernel parameters and the latent variable G could lead to theoretical guarantees for the resulting CCSs.
Methods
Problem definition and notation
Here, we formalize the problem that we wish to solve. We define a spatially resolved slice from a spatial genomic or histology technology as a set of pairs \({\{({{{{\bf{x}}}}}_{i},{{{{\bf{y}}}}}_{i})\}}_{i = 1}^{n}\), where \({{{{\bf{x}}}}}_{i}\in {{\mathbb{R}}}^{D}\) is a vector of spatial coordinates encoding a single slice’s relative location in a D-dimensional space, and \({{{{\bf{y}}}}}_{i}\in {{\mathbb{R}}}^{p}\) is a vector of measured readout features at this location. Typically, D ∈ {2, 3, 4} in biomedical applications, where D = 4 corresponds to the spatiotemporal setting. We focus on D ∈ {2, 3} in this paper. Following convention, we refer to a single location x_{i} as a spot, which may refer to a single cell, a subcellular location, a collection of cells or a single pixel depending on the technology. We arrange the observations from a slice into two data matrices: the spots’ relative locations \({{{\bf{X}}}}\in {{\mathbb{R}}}^{n\times D}\) and the phenotypic readouts associated with those spots \({{{\bf{Y}}}}\in {{\mathbb{R}}}^{n\times p}\).
To give concrete examples, in ST applications, x_{i} encodes a spatial location on a single tissue slice, and y_{i} is a vector of RNA transcript counts at this location for each of p genes. In a histology setting, x_{i} is a pixel location, and y_{i} is a vector containing the p image color channel readouts.
We assume that we have S spatially resolved slices collected from the same tissue type and similar tissue region. Often, these slices will be adjacent slices from a single tissue, but, as we showed in our results, our approach is extensible to datasets collected from different tissue samples or individuals. Suppose slice s (s ∈ {1, …, S}) contains n_{s} spots, and let \({{{{\bf{X}}}}}^{s}={[{{{{\bf{x}}}}}_{1}^{s},{{{{\bf{x}}}}}_{2}^{s},\ldots ,{{{{\bf{x}}}}}_{{n}_{s}}^{s}]}^{\top }\) denote its spatial locations. Similarly, let \({{{{\bf{Y}}}}}^{s}={[{{{{\bf{y}}}}}_{1}^{s},{{{{\bf{y}}}}}_{2}^{s},\ldots ,{{{{\bf{y}}}}}_{{n}_{s}}^{s}]}^{\top }\) be the sth readout of feature values. We denote the total number of spots across slices as \(N=\mathop{\sum }\nolimits_{s = 1}^{S}{n}_{s}\). We note that, in our framework, the slices may have different total numbers of spots and may be on different scales.
Our goal is to align these S slices’ spatial coordinates by creating a CCS such that the matching anatomical, structural and functional regions of each slice are mapped to the same absolute locations in the CCS. To do this, we seek correspondences between both the spatial coordinates and phenotypic readouts of each slice. Specifically, we seek S vector-valued warping functions g^{1}, g^{2}, …, g^{S}, with \({g}^{s}:{{\mathbb{R}}}^{D}\to {{\mathbb{R}}}^{D}\), each of which maps a slice’s observed relative spatial coordinates into a shared CCS. Let \({{{{\bf{g}}}}}_{i}^{s}={g}^{s}({{{{\bf{x}}}}}_{i}^{s})\) denote evaluation of the sth warping function at spatial location \({{{{\bf{x}}}}}_{i}^{s}\). We call \({{{{\bf{g}}}}}_{i}^{s}\) the ‘aligned spatial location’ of this spot, and let the full set of aligned spatial locations be denoted as \({{{\bf{G}}}}=[{{{{\bf{g}}}}}_{1}^{1},{{{{\bf{g}}}}}_{2}^{1},\ldots ,{{{{\bf{g}}}}}_{i}^{s},{{{{\bf{g}}}}}_{i+1}^{s},\ldots ,{{{{\bf{g}}}}}_{{n}_{S}}^{S}]\).
We estimate these warping functions \({\{{g}^{s}\}}_{s = 1}^{S}\) such that any two samples mapped to nearby points in the CCS, \({{{{\bf{g}}}}}_{i}^{s}\approx {{{{\bf{g}}}}}_{{i}^{{\prime} }}^{{s}^{{\prime} }}\), are structurally and functionally similar to one another. We consider three approaches to alignment that illustrate how our probabilistic model behaves under uncertainty and censored information. First, we treat the multiple slices as biological replicates and leverage both spatial information and the measured readouts for alignment. Second, we consider multiple data modalities of the same biological system, assuming that the data come from approximately the same location in the absolute coordinate system, and leverage the spatial locations and all modalities jointly. Third, we use multiple slices and infer their relationship along an unobserved z axis, assuming that the measured readouts vary smoothly across the z axis. The flexibility of our GP framework accommodates each of these three approaches.
Gaussian processes
A GP is a stochastic process defined as a collection of random variables in which any finite subset follows a multivariate Gaussian distribution. Specifically, y_{1}, y_{2}, … constitute a GP if, for any finite set of indices i_{1}, i_{2}, …, i_{n}, it holds that
$${({y}_{{i}_{1}},{y}_{{i}_{2}},\ldots ,{y}_{{i}_{n}})}^{\top } \sim {\mathcal{N}}({\boldsymbol{\mu }},{\boldsymbol{\Sigma }}),$$
where μ is a mean vector and Σ is a positive definite covariance matrix. GPs are widely used in functional data analysis, machine learning and spatial statistics due to their flexibility and expressiveness in modeling complex dependent data^{44,45,46,47,48}. For example, in nonparametric regression models, GPs are commonly used to model unknown arbitrary functions; in Bayesian contexts, they act as priors over functions^{49}.
GPs are often used as prior distributions over functions, as in this paper. In this case, for a function f defined on the domain \({{\mathbb{R}}}^{D}\), we denote a GP prior as
$$f \sim {\rm{GP}}(\mu ,k),$$
where \(\mu :{{\mathbb{R}}}^{D}\to {\mathbb{R}}\) is a mean function and \(k:{{\mathbb{R}}}^{D}\times {{\mathbb{R}}}^{D}\to {\mathbb{R}}\) is a positive definite covariance function (also known as a kernel function or covariogram). For noisy responses from the noiseless function f, we include Gaussian noise: \(y \sim {{{\mathcal{N}}}}(f({{{\bf{x}}}}),{\sigma }^{2})\), where σ^{2} is often referred to as the ‘nugget’.
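To make the GP prior and the nugget concrete, the following is a minimal sketch of drawing noisy observations from a GP at a finite set of inputs. It assumes a squared-exponential covariance function, and the helper names (`rbf_kernel`, `cholesky`, `sample_gp`) and the small diagonal jitter are illustrative choices made here, not part of the GPSA code base.

```python
import math
import random


def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two points (sequences of floats)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return variance * math.exp(-0.5 * sq_dist / lengthscale ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def sample_gp(X, mean_fn, kernel, noise_var=0.0, jitter=1e-6, rng=random):
    """Draw y_i = f(x_i) + eps_i with f ~ GP(mean_fn, kernel), eps_i ~ N(0, noise_var)."""
    n = len(X)
    # Covariance matrix of f at the inputs; jitter keeps the factorization stable.
    K = [[kernel(X[i], X[j]) + (jitter if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # f = mean + L z has covariance L L^T = K.
    f = [mean_fn(X[i]) + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    noise_sd = math.sqrt(noise_var)
    return [fi + (rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0) for fi in f]
```

For example, `sample_gp([[float(i)] for i in range(10)], lambda x: 0.0, rbf_kernel, noise_var=0.1)` returns ten noisy values of one zero-mean draw; the dense factorization used here scales cubically in the number of inputs, so it is only practical for small examples.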
Deep Gaussian processes
DGPs were developed to further extend the expressivity of GPs^{50,51}. DGPs are a composition of functions, each of which is drawn from a GP. In the univariate case, the function drawn from an L-layer DGP is given by
$$y={f}_{L}({f}_{L-1}(\cdots {f}_{1}(x))),$$
where, for each ℓ = 1, …, L, we have \({f}_{\ell } \sim {{{\rm{GP}}}}({\mu }_{\ell },{k}_{\ell })\), and y_{ℓ} = f_{ℓ}(f_{ℓ−1}( ⋯ f_{1}(x))) is the output of the ℓth layer for an input sample x. In this work, we use two-layer DGPs, that is, L = 2.
Gaussian Process Spatial Alignment
First layer: warping functions
GPSA places GP priors on the warping functions g^{1}, …, g^{S} that map the observed spatial coordinates onto a CCS. Focusing on the case with D = 2 spatial dimensions for demonstration, GPSA assumes that
$${g}_{1}^{s} \sim {\rm{GP}}({\mu }_{g1},{k}_{g}),\quad {g}_{2}^{s} \sim {\rm{GP}}({\mu }_{g2},{k}_{g}),\quad {g}^{s}({\bf{x}})={[{g}_{1}^{s}({\bf{x}}),{g}_{2}^{s}({\bf{x}})]}^{\top },$$
where \({g}_{d}^{s}\) is the warping function for slice s for which the output is the dth spatial dimension, \({\mu }_{gd}:{{\mathbb{R}}}^{D}\to {\mathbb{R}}\) is a mean function, and \({k}_{g}:{{\mathbb{R}}}^{D}\times {{\mathbb{R}}}^{D}\to {\mathbb{R}}\) is a positive definite covariance function. We specify the mean of the aligned spatial location to be equal to the observed location, μ_{gd}(x) = x_{d}, which encourages the aligned coordinate for a given spatial location to be centered around the observed location. This assumption is useful to avoid extreme warps that drastically shift the mean of each observed location.
Second layer: modeling phenotypic readouts
GPSA then posits another set of functions \({\{{f}_{j}\}}_{j = 1}^{p}\) that describe the spatial organization of each phenotypic readout (for example, gene expression values) within the CCS. We place a GP prior on these functions as well. Letting \({y}_{ij}^{s}\) denote the value for feature j in spot i from slice s, GPSA assumes that
$${y}_{ij}^{s}={f}_{j}({g}^{s}({\bf{x}}_{i}^{s}))+\epsilon ,\quad {f}_{j} \sim {\rm{GP}}({\mu }_{f},{k}_{f}),$$
where \(\epsilon \sim {{{\mathcal{N}}}}(0,{\sigma }^{2})\) is a noise term, \({\mu }_{f}:{{\mathbb{R}}}^{D}\to {\mathbb{R}}\) is a mean function, and \({k}_{f}:{{\mathbb{R}}}^{D}\times {{\mathbb{R}}}^{D}\to {\mathbb{R}}\) is a positive definite covariance function. Let \({{{{\rm{f}}}}}_{ij}^{s}\in {\mathbb{R}}\) be the evaluation of f_{j} at input \({{{{\bf{g}}}}}_{i}^{s}\). We specify μ_{f} = 0, as we assume that the phenotypic readouts have been centered. Furthermore, let \({{{\bf{F}}}}=[{{{{\rm{f}}}}}_{11}^{1},{{{{\rm{f}}}}}_{21}^{1},\ldots ,{{{{\rm{f}}}}}_{ij}^{s},{{{{\rm{f}}}}}_{(i+1)j}^{s},\ldots ,{{{{\rm{f}}}}}_{{n}_{s}p}^{S}]\) denote the full set of function evaluations.
The above model results in a two-layer DGP where, for each slice s, the DGP is made up of a composition of two functions, f∘g^{s}.
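The two-layer structure can be illustrated by simulating from it. The sketch below draws one warp per slice with mean equal to the observed coordinate, stacks the aligned locations, and then draws zero-mean readout functions that are shared by all slices. It is a toy generative simulation, not the inference procedure; the function names, the dense Cholesky sampler and the kernel settings (for example, a warp variance of 0.05) are assumptions chosen only to keep the example small.

```python
import math
import random


def rbf(a, b, lengthscale, variance):
    """Squared-exponential covariance between two points."""
    d2 = sum((u - v) ** 2 for u, v in zip(a, b))
    return variance * math.exp(-0.5 * d2 / lengthscale ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def draw_gp(X, mean, lengthscale, variance, rng, jitter=1e-6):
    """One GP draw with the given mean vector, evaluated at the rows of X."""
    n = len(X)
    K = [[rbf(X[i], X[j], lengthscale, variance) + (jitter if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


def simulate(slices, n_features, noise_sd=0.1, seed=0):
    """Simulate aligned locations G and readouts Y from the two-layer model."""
    rng = random.Random(seed)
    D = len(slices[0][0])
    # First layer: for each slice and output dimension d, g_d^s has mean x_d.
    G = []
    for X in slices:
        cols = [draw_gp(X, [x[d] for x in X], 10.0, 0.05, rng) for d in range(D)]
        G.extend([list(row) for row in zip(*cols)])
    # Second layer: zero-mean f_j, shared across slices, evaluated on stacked G.
    cols = [draw_gp(G, [0.0] * len(G), 2.0, 1.0, rng) for _ in range(n_features)]
    Y = [[f + rng.gauss(0.0, noise_sd) for f in row] for row in zip(*cols)]
    return G, Y
```

Because every f_{j} is drawn once over the stacked aligned locations, spots from different slices that land close together in the CCS receive similar readouts, which is the property the alignment exploits.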
Posterior inference for the common coordinate system
We have two statistical objectives with the GPSA model: estimating the CCS, as represented by the latent variable G, and estimating the warped, denoised and localized values for the phenotypic readouts for each slice, represented by F. The CCS gives us an atlas of the system; the warped and smoothed readouts may be used for downstream analysis of the aligned slices. Thus, in our Bayesian GPSA framework, the posterior distribution of interest is
where the vector Θ contains the parameters for the mean and covariance functions and Z = p(X, Y∣Θ) is a normalizing constant. However, Z is analytically intractable in DGPs^{50}. Thus, we use stochastic variational inference with inducing variables to approximate the posterior distribution over G and F (ref. ^{52}).
Stochastic variational inference for GPSA
Although closed-form posterior distributions are available in GPs, this is not the case in DGPs. To perform approximate inference, we leverage a sparse GP framework using inducing points^{51,53,54}. Because GPSA is a two-layer DGP, we include inducing points at each of the two layers. In particular, suppose we have a set of M^{s} < n_{s} inducing locations (also known as pseudo-inputs) for each slice \({\widetilde{{{{\bf{X}}}}}}^{1},\ldots ,{\widetilde{{{{\bf{X}}}}}}^{S}\in {{\mathbb{R}}}^{{M}^{s}\times D}\) and another set of M < N inducing locations in the CCS layer \(\widetilde{{{{\bf{G}}}}}\in {{\mathbb{R}}}^{M\times D}\). We then denote the associated set of inducing values (pseudo-outputs) for the two layers as \({{{{\bf{U}}}}}^{{G}^{1}},\ldots ,{{{{\bf{U}}}}}^{{G}^{S}}\in {{\mathbb{R}}}^{{M}^{s}\times D}\) and \({{{{\bf{U}}}}}^{F}\in {{\mathbb{R}}}^{M\times p}\), respectively. The joint model (omitting dependence on the covariance function parameters Θ) is then
Note that p(F∣U^{F}, G) and p(G∣U^{G}, X) have closed forms because they are conditional multivariate Gaussians. If a Gaussian noise model is assumed, then p(Y∣F) also has a closed form. However, inference in this model scales cubically with the number of spots; therefore, we seek a faster variational approach.
We now specify a variational model Q, with parameters that we will optimize to approximate the exact posterior (equation (3)). Following earlier work^{50}, we use the following form for the approximate posterior:
where q(U^{F}) and q(U^{G}) are chosen to be multivariate normal distributions. We denote the variational parameters collectively as ϕ. Because all distributions are Gaussian, we can analytically marginalize out the pseudo-outputs U^{F} and U^{G} (ref. ^{51}). See Appendix for details.
The optimization problem is then to minimize the Kullback–Leibler (KL) divergence between the approximate posterior (equation (5)) and the exact posterior (equation (3)) with respect to the variational parameters. This is equivalent to maximizing a lower bound on the log marginal likelihood \({{{\mathcal{L}}}}\le \log p({{{\bf{Y}}}})\) (the evidence lower bound or ELBO). The variational parameters ϕ are made up of the parameters of the variational distributions for the pseudo-outputs, q(U^{F}) and q(U^{G}), and optionally the inducing locations \({\widetilde{{{{\bf{X}}}}}}^{1},\ldots ,{\widetilde{{{{\bf{X}}}}}}^{S},\widetilde{{{{\bf{G}}}}}\). More precisely, our optimization problem is
$$\widehat{\phi }=\arg \mathop{\max }\limits_{\phi }\,{\mathcal{L}}(\phi ).$$
We provide a complete derivation and explanation of this lower bound in the next section. Although this lower bound cannot be evaluated in closed form, we can efficiently sample from it and use these samples to maximize with respect to the variational parameters ϕ.
Maximizing the ELBO in GPSA
Recall that the ELBO for a generic model with observed data x, latent variable z and approximating distribution q is given by
$$\log p(x)\ge {{\mathbb{E}}}_{q(z)}\left[\log \frac{p(x,z)}{q(z)}\right],$$
where p(x, z) is the joint model density, and q(z) is the variational distribution.
Plugging in our GPSA model, the ELBO is given by
We can split equation (6) into a term containing the expected log likelihood and two terms that are KL divergences:
Because we let q(U^{F}) and q(U^{G}) be multivariate Gaussians, the KL divergences have closed forms, and the only remaining term to estimate is the expected log likelihood (the first term in equation (8)). We estimate this term with a Monte Carlo approximation. Given T samples of F, our estimate is
$${{\mathbb{E}}}_{Q}[\log p({\bf{Y}}\mid {\bf{F}})]\approx \frac{1}{T}\mathop{\sum }\limits_{t=1}^{T}\log p({\bf{Y}}\mid {\widehat{{\bf{F}}}}_{t}),$$
where \({\widehat{{{{\bf{F}}}}}}_{1},\ldots ,{\widehat{{{{\bf{F}}}}}}_{T} \sim Q\). We use a two-stage procedure to obtain these samples. First, we draw samples of \(\widehat{{{{\bf{G}}}}}\) from \(p({{{\bf{G}}}}\mid {{{\bf{X}}}},\widetilde{{{{\bf{X}}}}})=\int\,p({{{\bf{G}}}}\mid {{{{\bf{U}}}}}^{G},{{{\bf{X}}}},\widetilde{{{{\bf{X}}}}})p({{{{\bf{U}}}}}^{G}\mid \widetilde{{{{\bf{X}}}}})d{{{{\bf{U}}}}}^{G}.\) Second, we draw samples of \(\widehat{{{{\bf{F}}}}}\) from \(p({{{\bf{F}}}}\mid \widehat{{{{\bf{G}}}}},\widetilde{{{{\bf{G}}}}})=\int\,p({{{\bf{F}}}}\mid {{{{\bf{U}}}}}^{F},{{{\bf{G}}}},\widetilde{{{{\bf{G}}}}})p({{{{\bf{U}}}}}^{F}\mid \widetilde{{{{\bf{G}}}}})d{{{{\bf{U}}}}}^{F}.\) We can write each of these distributions in closed form.
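For reference, the factorizations discussed in this section can be written out explicitly. The following is a sketch assembled from the conditional distributions named in the text and from the standard doubly stochastic construction for DGPs^{51}; it is an assumption about the form of the joint model, the variational family and the resulting bound rather than a verbatim restatement of the numbered equations, and the dependence on Θ is omitted. Take the joint model and the variational family to be

$$p({\bf{Y}},{\bf{F}},{\bf{G}},{{\bf{U}}}^{F},{{\bf{U}}}^{G}\mid {\bf{X}},\widetilde{{\bf{X}}},\widetilde{{\bf{G}}})=p({\bf{Y}}\mid {\bf{F}})\,p({\bf{F}}\mid {{\bf{U}}}^{F},{\bf{G}},\widetilde{{\bf{G}}})\,p({{\bf{U}}}^{F}\mid \widetilde{{\bf{G}}})\,p({\bf{G}}\mid {{\bf{U}}}^{G},{\bf{X}},\widetilde{{\bf{X}}})\,p({{\bf{U}}}^{G}\mid \widetilde{{\bf{X}}}),$$

$$q({\bf{F}},{\bf{G}},{{\bf{U}}}^{F},{{\bf{U}}}^{G})=p({\bf{F}}\mid {{\bf{U}}}^{F},{\bf{G}},\widetilde{{\bf{G}}})\,q({{\bf{U}}}^{F})\,p({\bf{G}}\mid {{\bf{U}}}^{G},{\bf{X}},\widetilde{{\bf{X}}})\,q({{\bf{U}}}^{G}).$$

The conditional terms for F and G then cancel inside the logarithm of the ELBO, which leaves one expected log likelihood and two KL divergences:

$${\mathcal{L}}={{\mathbb{E}}}_{q}\left[\log \frac{p({\bf{Y}},{\bf{F}},{\bf{G}},{{\bf{U}}}^{F},{{\bf{U}}}^{G}\mid {\bf{X}},\widetilde{{\bf{X}}},\widetilde{{\bf{G}}})}{q({\bf{F}},{\bf{G}},{{\bf{U}}}^{F},{{\bf{U}}}^{G})}\right]={{\mathbb{E}}}_{q}[\log p({\bf{Y}}\mid {\bf{F}})]-{\rm{KL}}\left(q({{\bf{U}}}^{F})\,\parallel \,p({{\bf{U}}}^{F}\mid \widetilde{{\bf{G}}})\right)-{\rm{KL}}\left(q({{\bf{U}}}^{G})\,\parallel \,p({{\bf{U}}}^{G}\mid \widetilde{{\bf{X}}})\right).$$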
Let \(q({{{{\bf{u}}}}}_{d}^{Gs})={{{\mathcal{N}}}}({{{{\bf{m}}}}}_{d}^{Gs},{{{{\bf{S}}}}}_{d}^{Gs}).\) The marginal for G is given by
with
Let \(q({{{{\bf{u}}}}}_{j}^{F})={{{\mathcal{N}}}}({{{{\bf{m}}}}}_{j}^{F},{{{{\bf{S}}}}}_{j}^{F}).\) The marginal for F is given by
with
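For reference, marginals of this kind have a well-known closed form in sparse variational GPs^{51,52}. The expressions below are a sketch under that standard construction, using the identity mean for the first layer and the zero mean for the second; the symbols \(\widetilde{{\boldsymbol{\mu }}}\), \(\widetilde{{\boldsymbol{\Sigma }}}\) and A are notation introduced here and may differ from the original equations. For slice s and output dimension d, writing \({{\bf{x}}}_{d}^{s}\) and \({\widetilde{{\bf{x}}}}_{d}^{s}\) for the dth columns of X^{s} and \({\widetilde{{\bf{X}}}}^{s}\),

$$q({{\bf{g}}}_{d}^{s})={\mathcal{N}}({\widetilde{{\boldsymbol{\mu }}}}_{d}^{Gs},{\widetilde{{\boldsymbol{\Sigma }}}}_{d}^{Gs}),\quad {\widetilde{{\boldsymbol{\mu }}}}_{d}^{Gs}={{\bf{x}}}_{d}^{s}+{{\bf{A}}}^{s}({{\bf{m}}}_{d}^{Gs}-{\widetilde{{\bf{x}}}}_{d}^{s}),\quad {\widetilde{{\boldsymbol{\Sigma }}}}_{d}^{Gs}={{\bf{K}}}_{{{\bf{X}}}^{s}{{\bf{X}}}^{s}}-{{\bf{A}}}^{s}({{\bf{K}}}_{{\widetilde{{\bf{X}}}}^{s}{\widetilde{{\bf{X}}}}^{s}}-{{\bf{S}}}_{d}^{Gs}){({{\bf{A}}}^{s})}^{\top },\quad {{\bf{A}}}^{s}={{\bf{K}}}_{{{\bf{X}}}^{s}{\widetilde{{\bf{X}}}}^{s}}{{\bf{K}}}_{{\widetilde{{\bf{X}}}}^{s}{\widetilde{{\bf{X}}}}^{s}}^{-1}.$$

For feature j, given sampled aligned locations \(\widehat{{\bf{G}}}\),

$$q({{\bf{f}}}_{j})={\mathcal{N}}\left({{\bf{A}}}^{F}{{\bf{m}}}_{j}^{F},\;{{\bf{K}}}_{\widehat{{\bf{G}}}\widehat{{\bf{G}}}}-{{\bf{A}}}^{F}({{\bf{K}}}_{\widetilde{{\bf{G}}}\widetilde{{\bf{G}}}}-{{\bf{S}}}_{j}^{F}){({{\bf{A}}}^{F})}^{\top }\right),\quad {{\bf{A}}}^{F}={{\bf{K}}}_{\widehat{{\bf{G}}}\widetilde{{\bf{G}}}}{{\bf{K}}}_{\widetilde{{\bf{G}}}\widetilde{{\bf{G}}}}^{-1}.$$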
We then maximize the ELBO with respect to the variational parameters, as well as the covariance function parameters. If the covariance function parameters are optimized, one can regularize the covariance function parameters to avoid unrealistic warping functions. Under our Bayesian framework, we can place a prior distribution on the covariance function parameters to limit the warps to be small and stable.
This procedure is also amenable to stochastic optimization algorithms^{52}, which makes GPSA scalable in terms of memory consumption. In particular, stochastic optimization makes it possible to scale GPSA to datasets with millions of spots by using a subset of the spots (a ‘minibatch’) on each iteration of optimization. The required memory thus scales with the chosen minibatch size, which can be made as small as a user’s memory constraints require.
Multivariate correlated outcomes
In its simplest form, GPSA assumes that feature readouts are independent of one another by modeling each with a separate GP-distributed function f_{j}. However, given that our phenotypic readouts of interest (gene expression, for example) are often highly correlated between features, we would like to leverage the correlation between readouts to fit f. There are several approaches to accounting for this correlation^{35,55,56}.
We choose to leverage the linear model of coregionalization (LMC)^{35}. Rather than allowing p separate GPs, the LMC assumes that there are L < p latent GPs and that the observed readouts are a linear combination of the outputs of these latent GPs. To incorporate this into our registration model, we assume the following model for the second-layer GP:
where \({{{\bf{W}}}}\in {{\mathbb{R}}}^{p\times L}\) is a loading matrix, \({{{\bf{F}}}}\in {{\mathbb{R}}}^{L\times N}\) is a matrix containing latent factors and I is the identity matrix. Given a set of warped coordinates G, our likelihood is then
Including the warp model, our entire joint model becomes
In our applications, we may not be interested in directly estimating the latent factors F. We can marginalize these out^{57} and write the likelihood as
where
If the latent covariance functions k_{1}, …, k_{L} are the same, the covariance simplifies as Σ = K_{GG} ⊗ WW^{⊤}.
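The simplification for shared latent covariance functions can be checked numerically. The sketch below assumes the usual LMC form of the marginal covariance, a sum over latent GPs of the Kronecker product of each latent kernel matrix with the outer product of the corresponding column of W; that general form is an assumption about the omitted equation, and the helper names are hypothetical.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def lmc_covariance(kernels, W):
    """Sum over latent GPs l of kron(K_l, w_l w_l^T), with w_l the lth column of W.

    kernels: list of L kernel matrices, each N x N.
    W: p x L loading matrix. Returns an (N p) x (N p) matrix.
    """
    p, n_latent = len(W), len(W[0])
    n = len(kernels[0])
    total = [[0.0] * (n * p) for _ in range(n * p)]
    for l in range(n_latent):
        w = [W[j][l] for j in range(p)]
        outer = [[wi * wj for wj in w] for wi in w]
        term = kron(kernels[l], outer)
        for i in range(n * p):
            for j in range(n * p):
                total[i][j] += term[i][j]
    return total
```

When every entry of `kernels` is the same matrix K, the result equals `kron(K, W W^T)`, because the outer products of the columns of W sum to W W^T.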
Multimodal outcomes
We may sometimes have access to several sets of measurements for each slice, each collected with a different modality. For example, we may have an ST sample and a histology image for each slice. While both of these modalities lie in a 2D spatial coordinate system, they have different response values. In this example, the ST readouts will be \({{{{\bf{Y}}}}}^{1}\in {{\mathbb{R}}}^{n\times p}\), where p is the number of genes, while the histology image readouts will be \({{{{\bf{Y}}}}}^{2}\in {{\mathbb{R}}}^{m\times q}\), where q is the number of color channels.
Our model can easily accommodate this setting. We assume that the different modalities are already aligned within each slice, which is a reasonable assumption in practice. Instead of computing the likelihood for only one set of phenotypic readouts, we compute it for each modality’s phenotypic readouts. For example, the likelihood becomes
where M is the number of modalities, p_{m} is the number of readout features in modality m, Y^{m} is the set of readout features for modality m, and σ^{2} is the variance of the Gaussian noise of each readout.
Non-Gaussian likelihoods
We can accommodate non-Gaussian likelihoods in this model. In particular, we can specify the likelihood in equation (4), p(Y∣F), to be any suitable data likelihood. In the setting of sequencing data, the measurements often come in the form of nonnegative integer counts, for which a Poisson likelihood is often a reasonable choice.
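As an illustration of swapping in a count likelihood, the sketch below evaluates a Poisson log likelihood for observed counts given latent function values. It assumes a log link, so that the Poisson rate is exp(f); the link and the function name are choices made for this example and are not specified in the text.

```python
import math


def poisson_log_likelihood(counts, f):
    """Sum over spots of log Poisson(y | rate = exp(f)), assuming a log link."""
    total = 0.0
    for y, fi in zip(counts, f):
        # log p(y | f) = y * f - exp(f) - log(y!)
        total += y * fi - math.exp(fi) - math.lgamma(y + 1)
    return total
```

In a Monte Carlo estimate of the expected log likelihood, this function would take the place of the Gaussian log density evaluated at each sampled set of function values.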
Imputing dense spatial readouts under GPSA
The second layer of GPSA represents a mapping from the CCS to the observed phenotypic readouts. Thus, for any location in the CCS (regardless of whether a sample location is mapped to this point in the first layer or not), we can compute the variational parameters for the phenotypic readouts at this location (equations (9) and (10)). This allows for querying across a dense grid of locations in the CCS, yielding a distribution over the phenotypic readouts at these locations.
Model settings and preprocessing for experiments
In our experiments, we normalize all spatial coordinates so that the minimum x and y coordinate values are 0, and the maximum coordinate values are 10.
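A minimal sketch of this normalization follows. It assumes that each coordinate axis is rescaled independently with a min-max transform, which is one reading of the description above; the function name is hypothetical.

```python
def rescale_coordinates(X, upper=10.0):
    """Min-max rescale each coordinate axis of X (a list of rows) to [0, upper]."""
    D = len(X[0])
    mins = [min(row[d] for row in X) for d in range(D)]
    maxs = [max(row[d] for row in X) for d in range(D)]
    if any(hi == lo for lo, hi in zip(mins, maxs)):
        raise ValueError("every axis needs at least two distinct coordinate values")
    return [[upper * (row[d] - mins[d]) / (maxs[d] - mins[d]) for d in range(D)]
            for row in X]
```

Rescaling each axis separately changes the aspect ratio of a non-square slice; a single shared scale factor would preserve it, at the cost of one axis not reaching the upper bound.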
For all experiments, we specify the mean function of the GP prior for the warping functions to be the identity function. This choice is motivated by our expectation that most distortions in tissue samples will be relatively small and local, with large translations between slices being uncommon. We use the RBF covariance function for the first-layer GPs. The RBF covariance function is given by
$$k({\bf{x}},{{\bf{x}}}^{{\prime} })={\tau }^{2}\exp \left(-\frac{\parallel {\bf{x}}-{{\bf{x}}}^{{\prime} }{\parallel }^{2}}{2{\ell }^{2}}\right),$$
where ℓ is the length scale parameter, and τ^{2} is the spatial variance parameter. Intuitively, ℓ controls how rapidly the warping function varies locally, and τ^{2} controls the overall magnitude of the warping function (Supplementary Fig. 5). For the second layer of the multi-output GP, with an LMC covariance function, we infer the covariance function parameters using maximum likelihood. Model parameters, including covariance function parameters, are fitted during training by maximizing a lower bound on the log marginal likelihood of the data. For the first-layer GP (the warp GP) in de novo alignments, we fix the covariance function parameters before model fitting. Specifically, we fix the length scale as ℓ = 10 and the spatial variance as τ^{2} = 1 to ensure smooth and minimal warps. We found that these choices are relatively robust within a range (Supplementary Fig. 18). Our empirical tests show that the model’s performance tends to be stable for higher values of τ^{2} and ℓ. This is likely due to the fact that the model is more constrained with lower values of τ^{2} and ℓ (that is, it is difficult for the model to accommodate distortions with large magnitude under these parameter settings).
For our applications to spatial genomics data, we filter the readout features to features that show spatial correlation. Specifically, for each readout feature, we compute Moran’s I statistic^{58} (Supplementary Fig. 19) and retain features in the top 5% of I scores. We find that this approach identifies genes with high spatial variability (Supplementary Figs. 16 and 17) and identifies genes that were identified in previous work^{15}. More complicated procedures to identify spatially variable genes could be used^{15}, but this is not the primary focus of our work. This step increases the efficiency of GPSA not only by reducing the dimension of the readout features but also by removing features that are not correlated across space and would not aid a spatial alignment (Supplementary Fig. 8).
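The filtering step can be sketched as follows. The example computes Moran’s I with binary weights that connect spots within a fixed radius and keeps the top-scoring fraction of features; the weighting scheme, the radius and the function names are assumptions made for illustration, and the original analysis may use a different neighborhood definition.

```python
import math


def morans_i(coords, values, radius=1.5):
    """Moran's I with binary weights: w_ij = 1 if 0 < dist(i, j) <= radius."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    num = 0.0
    w_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= radius:
                num += dev[i] * dev[j]
                w_sum += 1.0
    if denom == 0.0 or w_sum == 0.0:
        return 0.0  # constant feature or no neighbors: no spatial signal to score
    return (n / w_sum) * (num / denom)


def top_spatial_features(coords, Y, fraction=0.05, radius=1.5):
    """Indices of the columns of Y with the highest Moran's I scores."""
    p = len(Y[0])
    scores = [morans_i(coords, [row[j] for row in Y], radius) for j in range(p)]
    n_keep = max(1, math.ceil(fraction * p))
    return sorted(range(p), key=lambda j: scores[j], reverse=True)[:n_keep]
```

On a regular grid with a radius that links only adjacent spots, a smooth gradient scores positively and a checkerboard pattern scores −1, so the gradient-like feature is retained. The double loop is quadratic in the number of spots and is meant only for small examples.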
Synthetic warps
Throughout our experiments, we apply three different types of random warps, which we describe here.
1. Linear warp: this warp applies a linear transformation to the observed spatial coordinates for each slice X^{s} such that \({\widetilde{x}}_{d}^{s}={({{{{\bf{x}}}}}^{s})}^{\top }{{{{\boldsymbol{\beta }}}}}_{d}^{s}+{\beta }_{d_0}^{s}+\epsilon\) for d ∈ {1, …, D}, where \({{{{\boldsymbol{\beta }}}}}_{d}^{s}\in {{\mathbb{R}}}^{D},{\beta }_{d_0}^{s}\in {\mathbb{R}}\) are the slope and intercept, respectively, and \(\epsilon \sim {{{\mathcal{N}}}}(0,{\sigma }^{2})\) is a noise term.
2. Polar warp: for a single spatial location represented as \({{{\bf{x}}}}={[{x}_{1},{x}_{2}]}^{\top }\), this function is defined as
$${g}^{s}(x;\theta )=\left[\begin{array}{c}{x}_{1}+r\cos \phi \\ {x}_{2}+r\sin \phi \end{array}\right],$$
where θ = {r, ϕ}. We further parametrize θ to allow for location-specific distortions. Thus, θ is implicitly a function of x as well,
$$\left[\begin{array}{c}{r}_{x}\\ {\phi }_{x}\end{array}\right]=\theta (x)={{{\bf{B}}}}{{{\bf{x}}}},$$
where B is a 2 × 2 coefficient matrix. The full warping function can then be written as
$$\begin{array}{rcl}{g}^{s}(x;\theta )=\left[\begin{array}{c}{x}_{1}+{b}_{11}{x}_{1}\cos ({b}_{12}{x}_{1})\\ {x}_{2}+{b}_{21}{x}_{1}\sin ({b}_{22}{x}_{2})\end{array}\right].\end{array}$$
3. GP warp: this warp applies a transformation function that is drawn from a GP:
$${\widetilde{{{{\bf{x}}}}}}_{d}^{s}={f}_{d}^{\,s}({{{{\bf{x}}}}}_{d}^{s})+\epsilon ,\quad {f}_{d}^{\,s}({{{{\bf{x}}}}}_{d}^{s}) \sim {{{\rm{GP}}}}({{{{\bf{x}}}}}_{d}^{s},{{{{\bf{K}}}}}_{{{{{\bf{x}}}}}_{d}^{s}{{{{\bf{x}}}}}_{d}^{s}}).$$(12)
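The first two warps can be sketched in a few lines. The example below is an illustration with hypothetical function names, not the experiment code. The polar warp follows the definition in which the radius and angle are given by θ(x) = Bx and are then substituted into g^{s}(x; θ); it does not reproduce the expanded expression term by term. The GP warp is omitted because it additionally requires drawing from a multivariate Gaussian.

```python
import math
import random


def linear_warp(X, slopes, intercepts, noise_sd=0.0, rng=random):
    """Apply x_tilde_d = x^T beta_d + beta_d0 + eps to every row x of X.

    slopes[d] is the coefficient vector beta_d and intercepts[d] is beta_d0.
    """
    warped = []
    for x in X:
        row = []
        for beta, beta0 in zip(slopes, intercepts):
            eps = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
            row.append(sum(xk * bk for xk, bk in zip(x, beta)) + beta0 + eps)
        warped.append(row)
    return warped


def polar_warp(X, B):
    """Shift each 2D point by radius r in direction phi, where [r, phi] = B x."""
    warped = []
    for x1, x2 in X:
        r = B[0][0] * x1 + B[0][1] * x2
        phi = B[1][0] * x1 + B[1][1] * x2
        warped.append([x1 + r * math.cos(phi), x2 + r * math.sin(phi)])
    return warped
```

With identity slopes and zero intercepts the linear warp returns the input unchanged, and with B set to zero the polar warp does the same, which gives a quick sanity check before applying random coefficients.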
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The following data are available: (1) ST data were obtained from the PASTE code repository: https://github.com/raphael-group/paste. All four layers from the ‘sample_data/’ directory were used. (2) Visium data were obtained from the 10x Genomics website. Data for the two slices were downloaded from the ‘Datasets’ page. Specifically, spatial gene expression and hematoxylin and eosin stains were downloaded from the following links: mouse brain serial section 1 (sagittal–posterior; https://www.10xgenomics.com/resources/datasets/mouse-brain-serial-section-1-sagittal-posterior-1-standard-1-1-0) and mouse brain serial section 2 (sagittal–posterior; https://www.10xgenomics.com/resources/datasets/mouse-brain-serial-section-2-sagittal-posterior-1-standard-1-1-0). (3) Slide-seqV2 data were downloaded from the Broad Institute’s Single Cell Portal: https://singlecell.broadinstitute.org/single_cell/study/SCP815/highly-sensitive-spatial-transcriptomics-at-near-cellular-resolution-with-slide-seqv2. Two pucks corresponding to the mouse hippocampus were used: Puck_191204_01 and Puck_200115_08.
Code availability
Code for the model and experiments is available at https://github.com/andrewcharlesjones/spatial-alignment. All experiments in the paper can be run with the Python scripts in the ‘experiments/’ directory of the GitHub repository.
References
Ståhl, P. L. et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 353, 78–82 (2016).
Rodriques, S. G. et al. Slide-seq: a scalable technology for measuring genome-wide expression at high spatial resolution. Science 363, 1463–1467 (2019).
Stickels, R. R. et al. Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2. Nat. Biotechnol. 39, 313–319 (2021).
Lee, Y. et al. XYZeq: spatially resolved single-cell RNA sequencing reveals expression heterogeneity in the tumor microenvironment. Sci. Adv. 7, eabg4755 (2021).
Zhao, T. et al. Spatial genomics enables multimodal study of clonal heterogeneity in tissues. Nature 601, 85–91 (2021).
Lubeck, E. & Cai, L. Single-cell systems biology by super-resolution imaging and combinatorial labeling. Nat. Methods 9, 743–748 (2012).
Eng, C.-H. L. et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH+. Nature 568, 235–239 (2019).
Goltsev, Y. et al. Deep profiling of mouse splenic architecture with CODEX multiplexed imaging. Cell 174, 968–981 (2018).
Keren, L. et al. MIBI-TOF: a multiplexed imaging platform relates cellular phenotypes and tissue structure. Sci. Adv. 5, eaax5851 (2019).
Thornton, C. A. et al. Spatially mapped single-cell chromatin accessibility. Nat. Commun. 12, 1274 (2021).
Velten, B. et al. Identifying temporal and spatial patterns of variation from multimodal data using MEFISTO. Nat. Methods 19, 179–186 (2022).
Townes, F. W. & Engelhardt, B. E. Nonnegative spatial factorization applied to spatial genomics. Nat. Methods 20, 229–238 (2022).
Atta, L. & Fan, J. Computational challenges and opportunities in spatially resolved transcriptomic data analysis. Nat. Commun. 12, 5283 (2021).
Verma, A. & Engelhardt, B. E. A Bayesian nonparametric semi-supervised model for integration of multiple single-cell experiments. Preprint at bioRxiv https://doi.org/10.1101/2020.01.14.906313 (2020).
Svensson, V., Teichmann, S. A. & Stegle, O. SpatialDE: identification of spatially variable genes. Nat. Methods 15, 343–346 (2018).
Dries, R. et al. Giotto: a toolbox for integrative analysis and visualization of spatial expression data. Genome Biol. 22, 78 (2021).
Palla, G. et al. Squidpy: a scalable framework for spatial single cell analysis. Nat. Methods 19, 171–178 (2022).
Brett, M., Christoff, K., Cusack, R. & Lancaster, J. et al. Using the Talairach atlas with the MNI template. NeuroImage 13, 85 (2001).
Klein, A. et al. Evaluation of 14 nonlinear deformation algorithms applied to human brain MRI registration. NeuroImage 46, 786–802 (2009).
Lancaster, J. L. et al. Automated Talairach atlas labels for functional brain mapping. Hum. Brain Mapp. 10, 120–131 (2000).
Evans, A. C. An MRI-based stereotactic atlas from 250 young normal subjects. Society of Neuroscience Abstracts 18, 408 (1992).
Collins, D. L., Neelin, P., Peters, T. M. & Evans, A. C. Automatic 3D intersubject registration of MR volumetric data in standardized Talairach space. J. Comput. Assist. Tomogr. 18, 192–205 (1994).
Haxby, J. V. et al. A common, high-dimensional model of the representational space in human ventral temporal cortex. Neuron 72, 404–416 (2011).
Lorbert, A. & Ramadge, P. J. Kernel hyperalignment. Adv. Neural Inf. Process. Syst. 25, 1790–1798 (2012).
Zeira, R., Land, M. & Raphael, B. Alignment and integration of spatial transcriptomics data. Preprint at bioRxiv https://doi.org/10.1101/2021.03.16.435604 (2021).
Äijö, T. et al. Splotch: robust estimation of aligned spatial temporal gene expression data. Preprint at bioRxiv https://doi.org/10.1101/757096 (2019).
Andersson, A. et al. A landmark-based common coordinate framework for spatial transcriptomics data. Preprint at bioRxiv https://doi.org/10.1101/2021.11.11.468178 (2021).
Preibisch, S., Karaiskos, N. & Rajewsky, N. Image-based representation of massive spatial transcriptomics datasets. Preprint at bioRxiv https://doi.org/10.1101/2021.12.07.471629 (2021).
Sunkin, S. M. et al. Allen Brain Atlas: an integrated spatio–temporal portal for exploring the central nervous system. Nucleic Acids Res. 41, D996–D1008 (2012).
Rozenblatt-Rosen, O. et al. The Human Tumor Atlas Network: charting tumor transitions across space and time at single-cell resolution. Cell 181, 236–249 (2020).
Linderman, G. C. Dimensionality reduction of single-cell RNA-seq data. In RNA Bioinformatics 331–342 (Springer, 2021).
Zeisel, A. et al. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142 (2015).
Pierson, E. & Yau, C. ZIFA: dimensionality reduction for zero-inflated single-cell gene expression analysis. Genome Biol. 16, 241 (2015).
Ding, J., Condon, A. & Shah, S. P. Interpretable dimensionality reduction of single cell transcriptome data with deep generative models. Nat. Commun. 9, 2002 (2018).
Goulard, M. & Voltz, M. Linear coregionalization model: tools for estimation and choice of cross-variogram matrix. Math. Geol. 24, 269–286 (1992).
Vickovic, S. et al. High-definition spatial transcriptomics for in situ tissue profiling. Nat. Methods 16, 987–990 (2019).
10x Genomics. Mouse Brain Serial Sections (Sagittal–Posterior), Spatial Gene Expression Dataset by Space Ranger 1.1.0, 10x Genomics (2020). https://www.10xgenomics.com/resources/datasets/mouse-brain-serial-section-2-sagittal-posterior-1-standard-1-1-0
Chan, H.-S. et al. Serine protease PRSS23 is upregulated by estrogen receptor α and associated with proliferation of breast cancer cells. PLoS ONE 7, e30397 (2012).
Zhang, Y. Q., Zhang, J. J., Song, H. J. & Li, D. W. Overexpression of CST4 promotes gastric cancer aggressiveness by activating the ELFN2 signaling pathway. Am. J. Cancer Res. 7, 2290–2304 (2017).
Hwang, K.-T. et al. Prognostic role of KRAS mRNA expression in breast cancer. J. Breast Cancer 22, 548–561 (2019).
Jančík, S., Drábek, J., Radzioch, D. & Hajdúch, M. Clinical relevance of KRAS in human cancers. J. Biomed. Biotechnol. 2010, 150960 (2010).
Xu, J., Chen, Y. & Olopade, O. I. MYC and breast cancer. Genes Cancer 1, 629–640 (2010).
Fallah, Y., Brundage, J., Allegakoen, P. & Shajahan-Haq, A. N. MYC-driven pathways in breast cancer subtypes. Biomolecules 7, 53 (2017).
Rasmussen, C. E. & Williams, C. K. I. Gaussian Processes for Machine Learning 1st edn, Ch. 1 (MIT, 2005).
Stein, M. L. Interpolation of Spatial Data: Some Theory for Kriging (Springer Science & Business Media, 1999).
Gelfand, A. E., Diggle, P., Guttorp, P. & Fuentes, M. Handbook of Spatial Statistics (CRC, 2010).
Cressie, N. & Wikle, C. K. Statistics for Spatio–Temporal Data (John Wiley & Sons, 2011).
Banerjee, S., Carlin, B. P. & Gelfand, A. E. Hierarchical Modeling and Analysis for Spatial Data (CRC, 2014).
Ghosal, S. & Van der Vaart, A. Fundamentals of Nonparametric Bayesian Inference Vol. 44 (Cambridge University, 2017).
Damianou, A. & Lawrence, N. D. Deep Gaussian processes. In Proceedings of the Conference on Artificial Intelligence and Statistics (AISTATS) 207–215 (PMLR, 2013).
Salimbeni, H. & Deisenroth, M. Doubly stochastic variational inference for deep Gaussian processes. Adv. Neural Inf. Process. Syst. 30 (2017).
Hensman, J., Fusi, N. & Lawrence, N. D. Gaussian processes for big data. In Proceedings of Uncertainty in Artificial Intelligence (UAI; 2013).
Titsias, M. Variational learning of inducing variables in sparse Gaussian processes. In Proceedings of the Conference on Artificial Intelligence and Statistics (AISTATS) 567–574 (PMLR, 2009).
Snelson, E. & Ghahramani, Z. Sparse Gaussian processes using pseudo-inputs. Adv. Neural Inf. Process. Syst. 18, 1257 (2006).
Boyle, P. & Frean, M. Dependent Gaussian processes. Adv. Neural Inf. Process. Syst. 17, 217–224 (2005).
Gelfand, A. E., Schmidt, A. M., Banerjee, S. & Sirmans, C. Nonstationary multivariate process modeling through spatially varying coregionalization. Test 13, 263–312 (2004).
Kyzyurova, K. N. On linear model of coregionalization. Technical note (2019). http://kseniak.ucoz.net/Ksenia_LMC.pdf
Moran, P. A. Notes on continuous stochastic phenomena. Biometrika 37, 17–23 (1950).
Acknowledgements
This work was funded by Helmsley Trust grant AWD1006624, NIH NCI U2CCA233195, NIH NHLBI R01 HL133218 and NSF CAREER AWD1005627. D.L. was supported by NIH/NCATS award UL1 TR002489, NIH/NHLBI award R01 HL149683 and NIH/NIEHS award P30 ES010126.
Author information
Contributions
A.J., F.W.T., D.L. and B.E.E. designed the method. A.J. implemented the method and conducted data analysis. A.J., F.W.T. and B.E.E. analyzed the results. A.J. wrote the manuscript. A.J., D.L., F.W.T. and B.E.E. edited the manuscript.
Ethics declarations
Competing interests
B.E.E. is on the SAB of Creyon Bio, ArrePath and Freenome; B.E.E. consults for Neumora. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Stephan Preibisch, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Lin Tang, in collaboration with the Nature Methods team.
Supplementary information
Supplementary Figs. 1–23 and Table 1
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Jones, A., Townes, F.W., Li, D. et al. Alignment of spatial genomics data using deep Gaussian processes. Nat Methods 20, 1379–1387 (2023). https://doi.org/10.1038/s41592-023-01972-2
DOI: https://doi.org/10.1038/s41592-023-01972-2