Spatiotemporal single-cell RNA sequencing of developing hearts reveals interplay between cellular differentiation and morphogenesis

Single-cell RNA sequencing is a powerful tool to study developmental biology but does not preserve spatial information about cellular interactions and tissue morphology. Here, we combined single-cell and spatial transcriptomics with new algorithms for data integration to study the early development of the chicken heart. We collected data from four key ventricular development stages, ranging from the early chamber formation stage to the late four-chambered stage. We created an atlas of the diverse cellular lineages in developing hearts, their spatial organization, and their interactions during development. Spatial mapping of differentiation transitions revealed the intricate interplay between cellular differentiation and morphogenesis in cardiac cellular lineages. Using spatially resolved expression analysis, we identified anatomically restricted gene expression programs. Last, we discovered a stage-dependent role for the small secreted peptide, thymosin beta-4, in the coordination of multi-lineage cellular populations. Overall, our study identifies key stage-specific regulatory programs that govern cardiac development.


INTRODUCTION
The heart is the first fully functional organ to develop and is vital for embryogenesis [1]. 27 Cardiogenesis involves heterogeneous cell populations from multiple lineages that 28 spatiotemporally interact to drive cardiac fate decisions [2]. The heterogeneity of cell types 29 2 during cardiac development makes it difficult to study cardiac fate decisions using traditional 30 developmental biology techniques. Single-cell RNA-sequencing (scRNA-seq) has been used to 31 study the cellular mechanisms involved in driving heart development, but does not preserve 32 spatial information, and does not enable studies of the complex interplay between cellular 33 maturation and morphogenesis. Here, we combined spatially resolved RNA-seq with high-34 throughput scRNA-seq to study the spatiotemporal interactions and regulatory programs that 35 drive fetal development of the chicken heart. Current spatial transcriptomics approaches lack 36 single-cell resolution, which we addressed here using new approaches to integrate high-37 throughput spatial and single-cell transcriptomic data. 38 We used the chicken embryo as a model system to study cardiogenesis since the development of 39 the chick ex-utero in an egg allows unique access to early stages of development when the heart 40 consists of relatively few cells. The mature chick heart comprises four chambers with in-and 41 out-flow tracts, and despite some differences, the chick heart anatomy resembles the anatomy of 42 the human heart more closely than other non-mammalian vertebrate model organisms [3]. We lineages. In addition, we performed spatially resolved RNA-seq on a total of 12 heart tissue 47 sections collected at the same four stages. 48 As we demonstrate here, the combination of single-cell and spatial transcriptomics uniquely 49 enables to unravel cellular interactions that drive cardiogenesis. The data enabled us to 50 reconstruct a high-resolution, spatially resolved gene expression atlas of epi-, endo-, and 51 myocardial developmental lineages within cardiac tissue. We characterized and spatially 52 resolved progenitor and differentiated cell types, identified stage-specific transcriptional 53 programs and cellular interactions, reconstructed differentiation lineages, and delineated 54 important regulatory programs in cardiac development. We integrated scRNA-seq and spatial 55 RNA-seq data using an anchor-based method to predict cell type annotations for spatially 56 resolved transcriptomes. We used the cell-type predictions to construct proximity maps revealing 57 novel cellular interactions. Using the cell-type prediction scores, we uncovered local cellular 58 heterogeneity and spatially restricted regulatory programs in ventricular tissue and characterized 59 3 changes in cellular environments across ventricular spatial compartments. We furthermore 60 constructed a similarity map between single-cell and spatial transcriptomes, which enabled us to

69
Spatially resolved single-cell transcriptomics atlas of developing fetal chicken hearts 70 To study the complex interplay between differentiation and morphogenesis during cardiac 71 development, we combined single-cell and spatial transcriptomics. We profiled four key  To perform scRNA-seq (10x Genomics chromium platform), we enzymatically digested cardiac 82 ventricle tissue into single-cell suspensions (Fig. 1A, Methods). Pooling of cells from up to 72 83 fetal hearts enabled scRNA-seq on cardiac tissue at early stages of development. To perform 84 spatial transcriptomics (10X Genomics Visium platform), we cryosectioned 10 m coronal 85 tissue slices (chamber view) from fetal hearts at the same four stages (Fig. 1A, See Methods). 86 We generated single-cell transcriptomic data for 22,315 cells and spatial transcriptomics data for 87 12 tissue sections covering over 6,800 barcoded spots (Sup. Fig. 1A and 1B). We found that 88 4 spatial transcriptomes collected at the same developmental stage were strongly correlated 89 (Pearson correlation; R > 0.98, Sup. Fig. 1D), and that spatial transcriptomes and single-cell 90 transcriptomes collected at the same developmental stage were strongly correlated (Pearson 91 correlation; R 0.88-0.91, Sup. Fig. 1E). The combination of scRNA-seq and spatial 92 transcriptomics uniquely enabled us to spatially resolve cell-type specific gene expression in 93 cardiac tissue (see below).

94
To analyze single-cell transcriptomes, we filtered and preprocessed the data (Methods), 95 performed batch correction using scanorama[4] (Sup. Fig. 1C), performed dimensionality 96 reduction and cell clustering, and then visualized the data by Uniform Manifold Approximation 97 and Projection (UMAP, Methods). This analysis revealed 15 distinct cell type clusters (Fig. 1B). 98 We used marker gene and differential gene expression analysis to assign cell types to cell 99 clusters (Sup. Table 1, Fig. 1), and identified diverse cell clusters from myocardial, endocardial, 100 and epicardial cardiac lineages in the ventricles (Fig. 1C, Sup. Fig. 1F). In addition to cardiac 101 cell types, we detected a small number of circulating cell types including erythrocytes, 102 macrophages, and dendritic cells. Last, we identified a unique heterogeneous population of cells 103 that express high levels of thymosin beta-4 (TMSB4X, see below). A detailed overview of the 104 cell-types identified is provided in the supplement (Sup. Table 1). 105 Standalone analysis of the spatial transcriptomic data revealed anatomical regions with 106 differential transcriptional programs. To spatially resolve cell populations, the spatial 107 transcriptomics data was integrated with the scRNA-seq data using Seurat-v3 anchor-based 108 integration [5,6]. This approach first identifies anchors between datasets which represent pairwise 109 correspondences between elements in the two datasets that appear to originate from the same 110 biological state. The anchors are then used to harmonize the datasets by learning a joint structure 111 with canonical correlation analysis and to transfer annotation information from one dataset to the 112 other. Every spot in the spatial data could be considered a weighted mix of cell-types identified 113 by scRNA-seq. We used the prediction scores from label transfer to obtain weights for each of 114 the scRNA-seq-derived cell types for each spot (Fig. 1D, Sup. Fig. 2, Methods). In order to 115 understand the spatial organization of cell types in broad anatomical regions, spots were labeled 116 as cell types with maximum prediction score and visualized on H&E stained images of 117 respective stages (Fig. 1D). Cell-type prediction scores for spatial transcriptomes were further 118 5 used to estimate the abundance of pairs of specific cell types (Methods). As proximity is a 119 necessity for physical interactions between two or more cells, these cell-type proximity maps can 120 be used to guide the discovery of interactions between cell types from the same or different 121 lineages. We constructed proximity maps for all cardiac cell type pairs and visualized them as 122 chord diagrams (Fig. 1E). We found a significant number of cardiomyocytes colocalized with 123 myocardial progenitor cells and precursor cells in all stages, as expected. We furthermore found 124 a significant colocalization of myocardial cells with endocardial cells at day 7 and with vascular 125 endothelial and fibroblast cells at day 10 and day 14. This was also expected given that    Transition Embedding) to visualize differentiation trajectories because of its ability to learn and 151 conserve local and global structure in low dimensional space. We then spatially resolved sub-152 clusters of cell types and estimated the local PHATE1 dimension, a proxy for development time 153 (Methods, Fig. 2A). Utilizing this approach, we found that the epicardial lineage undergoes a 154 rich differentiation process while the endocardial and myocardial lineages mostly undergo a 155 maturation process across the timepoints investigated here.  cluster-C2 expressed bone morphogenetic protein 4 (BMP4), which is associated with an 174 epicardial progenitor-like phenotype [14], and lumican (LUM), which is known to be expressed  Fig. 3C). 199 We further employed lineage analysis to study maturation of the endocardial and myocardial  (Fig. 2J). Collectively, these results indicate that the endocardial and myocardial cells are 227 a differentiated phenotype as early as the day 4 heart stage and mature with development time 228 from day 4 to day 14. Gene ontology terms of the endocardial and myocardial lineage that 229 significantly correlated with pseudotime can be found in supplement (Sup. Fig. 3F-I). 230 Transferring the valve endocardial cell type labels from scRNA-seq to spatial transcriptomics 231 data confirmed that these cells are spatially restricted to atrioventricular heart valves at all four 232 stages in the spatial data (Sup. Fig. 2I). Therefore, day 4 and day 7 valve endocardial cells   (Fig. 3C, Sup. Fig. 2J). Overall, our analysis revealed region specific cell type heterogeneity 264 across the cardiac tissue and supports the idea of collective differentiation and morphogenesis.

265
To detect region specific markers, we performed differential gene expression analysis between 266 anatomical regions (Sup. Fig. 4A). Interestingly, we found stage dependent transcriptional  Fig. 4B). We observed differences in trabeculated versus compact 279 ventricular myocardium at day 7 and day 10 when the transition from trabeculated to compact 280 myocardium is underway. Myoglobin (MB) was spatially restricted to the developing compact 281 myocardial layers across developmental stages (Fig. 3G). The emergence of myoglobin in the

301
To categorize spatially variable genes (SVGs, genes that correlate with location within a tissue), 302 we used the "markvariogram" method[34] implemented in Seurat-v3 which models spatial 303 transcriptomics data as a mark point process and computes a 'variogram' to identify SVGs. This 304 method not only detects genes that are spatially restricted (i.e. gene markers specific to a 305 particular spatially restricted cell type) but also detects ubiquitously expressed genes whose 306 expression correlates with spatial location within anatomical regions. Using this approach, we

319
Because little is known about the spatiotemporal and cell-type specific expression profile of 320 thymosin beta-4 during cardiogenesis, we investigated the heterogeneity of cellular phenotypes 321 within the TMSB4X cluster in depth. We first collected and re-clustered cells from this cluster 322 and performed differential gene expression analysis to examine cell type composition (Fig. 4A,   323 Sup.  (Fig. 4A, right). Cluster 1 also contains 326 cardiomyocyte cells expressing MYH15 and ACTC1. A second subset (cluster 2) mainly 327 consisted of vascular smooth muscle-like cells that differentially express ACTA2 (Fig. 4A, 328 14 right). The last subset of these cells (cluster 3) mainly consisted of coronary vascular endothelial-329 like cells that differentially express FABP5 and TM4SF18 (Fig. 4A, right). Interestingly, most of  Fig. 5B). One other beta-thymosin, 337 TMSB15B, was expressed but TMSB15B expression did not change with stage (Sup. Fig. 5D). 338 To gain a functional understanding of the TMSB4X high cell cluster, we performed differential 339 gene expression analysis (Fig. 4B, Methods). This analysis revealed significant upregulation of  Fig. 5C). 351 We used spatially resolved RNA-seq to study TMSB4X expression across the tissue and to 352 characterize the TMSB4X enriched cells in a morphological context. TMSB4X was found to be 353 enriched in atrioventricular valves and the ventricular wall at later stages (Fig. 4C). Cell type 354 prediction scores revealed spots comprising cells from the 'TMSB4X high" cluster in cardiac 355 tissues across stages (Sup. Fig. 2K). Spots with a high prediction score for the "TMSB4X high  (Fig. 4D, Sup. Fig. 2K).

363
To further validate these observations, we performed independent immunostaining experiments vascular bundles) as well as lower intensity in the endocardium and epicardium (Fig. 4E). 375 Additional immunostaining images split by channel can be found in the supplement (Sup.  FindIntegrationAnchors command and then cell type labels were transferred to spatial data using 588 TransferData command. Cell type prediction values for Spatial RNA-seq spots were saved as an 589 assay and used for further analysis. Cell type colocalization values were calculated by counting 590 cell type pair abundances in spatial RNA-seq spots. Only cell types with top four prediction 591 scores in each spot were included. We also constructed a cell-spot similarity map by transferring 592 cell barcode IDs to spatial barcoded spots. The cell-spot similarity matrix containing scRNA-seq 593 cell similarity prediction for each spot in scRNA-seq data, which we further used to estimate 594 pseudotime for spatial RNA-seq spots.

595
Pseudotime analysis and trajectory construction. We used PHATE [7] (Potential of Heat-596 diffusion for Affinity-based Transition Embedding) to visualize developmental trajectories 597 because of its ability to learn and maintain local and global distances in low dimensional space. 598 We reclustered cells from individual lineages, performed PHATE dimension reduction on 599 scanorama integrated values, and used PHATE1 dimension as a proxy for development time.

600
PHATE reduction was performed using the phate command implemented in the R package: 601 phateR. We also used monocle-v2[20,21] to order the cells in epicardial, endocardial, and 602 24 myocardial lineages along pseudotime and reconstruct lineage trajectories. We filtered the genes 603 detected in our dataset and retained the top 2,000 highly variable genes calculated using Seurat-604 v3 in our monocle analysis. We further filtered these genes to genes differentially expressed in 605 cell type subclusters within the lineage using differentialGeneTest command and then reduced 606 the dimension of our data using the DDRTree method. We used the ReduceDimension function 607 in monocle-v2 to reduce the dimension to two DDRTree components, which is then used to 608 define a pseudotime scale. The cells were then ordered along pseudotime using monocle's 609 orderCells command, and the root of the trees was defined as the branch with maximum cells