Spatial Transcriptomics to define transcriptional patterns of zonation and structural components in the mouse liver

Reconstruction of heterogeneity through single cell transcriptional profiling has greatly advanced our understanding of the spatial liver transcriptome in recent years. However, global transcriptional differences across lobular units remain elusive in physical space. Here, we apply Spatial Transcriptomics to perform transcriptomic analysis across sectioned liver tissue. We confirm that the heterogeneity in this complex tissue is predominantly determined by lobular zonation. By introducing novel computational approaches, we enable transcriptional gradient measurements between tissue structures, including several lobules in a variety of orientations. Further, our data suggests the presence of previously transcriptionally uncharacterized structures within liver tissue, contributing to the overall spatial heterogeneity of the organ. This study demonstrates how comprehensive spatial transcriptomic technologies can be used to delineate extensive spatial gene expression patterns in the liver, indicating its future impact for studies of liver function, development and regeneration as well as its potential in pre-clinical and clinical pathology.


Supplementary figure 2:
Distribution of the number of unique transcripts across clusters. a The logarithmic number of unique transcripts (base10) for each cluster reveals equal numbers of unique transcripts for each cluster. b The log10 of unique transcripts exhibits a uniform distribution in the UMAP embedding of spatial data and across cluster annotations. Transcript numbers are ranging from low (light) to high (dark) values. Supplementary figure 3: Visualization of unsupervised clustering results of integrated sequencing data on spots under the tissue for a sample 1 of three sections of a part of a caudate lobe b sample 2 of three sections of a part of a caudate lobe and c sample 3 of two sections of a part of a right lobe.

Supplementary figure 4:
For spots belonging to each cluster identity (0-5) the most frequent cluster identities of the four spots within the closest proximity (1500 µm) were determined, ranging from fractions between 0 and 0.5 of all neighboring spots of each cluster. The cluster for which the neighboring fractions are determined is depicted in grey with a red dashed stroke. Dashed lines in magenta indicate the expected distribution of fractions among the neighboring clusters if the spots under the tissue were randomly assorted.
section 1 section 2 sample 3, right lobe Supplementary figure 5: Visualization of single cell data integration by stereoscope on spots under the tissue for all sections of this study. Spot-wise calculated proportion-values were scaled using quantile scaling between values of 0 to 1 for all 20 annotated cell types of the MCA single-cell data and each section. Proportion-values are visualized on spots under the tissue for section 1 (top) and section 2 (bottom) of the right lobe (sample 3). Supplementary figure 8: Spearman correlations between single cell periportal and pericentral cell-types and pericentral and periportally annotated spots in spatial transcriptomics data. a Spearman correlations between module score values of markers from spatial transcriptomics data (cluster 1 periportal scores, cluster 2 pericentral scores) and module score values of marker genes from single cell (periportal hepatocytes scores, pericentral hepatocyte scores) using spatial transcriptomics count-data. b Spearman correlations between module score values of markers from spatial transcriptomics data (cluster 1 periportal scores, cluster 2 pericentral scores) and module score values of marker genes from single cell (periportal hepatocytes scores, pericentral hepatocyte scores) using mouse cell atlas (MCA) data 39 .
Supplementary figure 10: Spatial autocorrelation of marker genes. Bar plot of genes with spatial autocorrelation above a threshold of 0.2 for spatial autocorrelation. Higher values indicate stronger spatial correlation of the gene, while lower values indicate a more random distribution of the gene across the tissue. The color refers to the identified cluster identity.
Supplementary Figure 11: a Heatmap depicting differentially expressed marker genes, associated with the GO-term "immune system process" (GO:0002376). Genes exhibiting highest elevation in either the portal area (cluster 1) or central area (cluster 2) are surrounded by a red box.These highlighted genes were selected to be analyzed with the bivariate expression by distance model. In b selected markers from a were subjected to bivariate expression analysis (methods Supplementary figure 12: Visualization of marker genes of periportal annotated cluster 1. a Sds expression b Cyp2f2 expression,and c module scores (see methods) for periportal markers of cluster 1 (bottom) on Hematoxylin-and Eosin-stained tissue for all sections of sample 1 to sample 3.
Descending expression values/scores is visualized by decreasing opacity of spots and a color gradient from dark blue for high expression to white for low expression. Supplementary figure 13: Visualization of marker genes of pericentral annotated cluster 2. a Glul expression b Cyp2e1 expression,and c module scores (see methods) for pericentral markers of cluster 2 (bottom) on Hematoxylin-and Eosin-stained tissue for all sections of sample 1 to sample 3.
Descending expression values/scores is visualized by decreasing opacity of spots and by a color gradient from dark red for high expression to white for low expression.
Supplementary Figure 14: Gene set enrichment of selected KEGG pathways for periportal and pericentral regions. Enrichment of established zonated metabolic pathways was determined and proportions of enriched genes sets were compared between the PP zone (cluster 1) and the PC zone (cluster 2) in our data. Negative values (red bars) represent enrichment in the central zone, while positive values (blue bars) represent enrichment in the portal area.

central markers portalmarkers Glul
Sds Oat Cyp2f2 Slc1a2 Hal Cyp2e1 Hsd17b13 Cyp2a5 Aldh1b1 Supplementary figure 15: Shortlist of five marker genes of cluster 1 (portal vein) and cluster 2 (central vein) identified by unsupervised clustering of ST data. The central and portal markers depicted here, exhibit the highest logFC for the respective cluster in the ST data and were used in computational analyses of expression by distance plots and the computational annotation of vein types in the tissue.

Aldh1a1
Cyp2a5 Cyp1a2 Cyp2c37 Cyp2d9 Cyp2e1 Gstm 3 Gulo Lect2 Oat Slc1a2 Slc22a1 Oat Cyp4a14 Glul Supplementary figure 16: Visualization of expression of pericentral marker genes identified by unsupervised clustering and DGEA of ST data across reconstructed spatial layers (1-9) of single cell data zonation matrix (Halpern et al., 2017). In combined plots, single cell expression data was scaled to the maximal expression value in the group of genes (see methods). Each pericentral marker gene is additionally plotted individually with original expression values of single cell data. Grey ribbons indicate standard error of gene expression across cells along layers.

Aldh1b1
Ctsc Cyp2f2 Supplementary figure 17: Visualization of expression of periportal marker genes identified by unsupervised clustering and DGEA of ST data across reconstructed spatial layers (1-9) of single cell data zonation matrix (Halpern et al., 2017). In combined plots, single cell expression data was scaled to the maximal expression value in the group of genes (see methods). Each periportal marker gene is additionally plotted individually with original expression values of single cell data. Grey ribbons indicate standard error of gene expression across cells along layers.  Supplementary figure 20: a ROC curve illustrating the performance of the expression-based vein type classifier. The blue line represents the average AUC taken over all the folds in the cross validation analysis. The red dashed line corresponds to the curve obtained from a completely random classifier. The gray shaded area represents the interval of the mean plus/minus one standard error. Av, AUC stands for average AUC and is the arithmetic mean taken across all folds. b Barplots illustrating the results of the prediction accuracy of the cross validation analysis between sections (experiments). The blue color denotes correct annotations while grey signifies incorrect annotations. The red dotted line denotes 50% classification accuracy. c Performance validation of binary vein classifier | For each sample prediction performance was evaluated by training according to an LOOCV. The performance is illustrated by the results of the accuracy of the cross validation analysis and area under the ROC curve (AUC). A value of 1 denotes highest accuracy, while a value of 0 denotes lowest accuracy.