Date: 2023-04-03

Ethics

This study was approved by the Ethics Commission of the Technical University of Munich Faculty of Medicine (447/17S, 384/15) as part of the European Research Council grant ERC 788381 to A.M. Authorization to use the human embryonic stem cell (hESC) line HES-3 (hPSCreg ESIBIe003) generated by ES Cell International in Singapore was granted by the Central Ethics Committee for Stem Cell Research of the Robert Koch Institute to A.M. (AZ 3.04.02/0131). The Regional Ethical Review Board in Stockholm (Regionala etikprövningsnämnden i Stockholm) approved the study protocol using human aborted embryos with ethical permission number Dnr 2015/1369-31/2 (ref. 62). Informed consent was obtained from all donors of cells and tissues.

Culture of hPSCs

hiPSCs were generated using the CytoTune-iPS 2.0 Sendai reprogramming kit (Invitrogen, A16157) as previously described63. The following hiPSC lines were used in differentiation experiments: hPSCreg MRIi003-A (hiPSC1), MRIi001-A (hiPSC2), MRIi003-A-6 (AAVS1-CAG-VSFP; hiPSC3), MRIi003-A-9 (AAVS1-CAG-FRT-flanked STOP-mKate2-HA) and MRIi025-A (PTPN11N308S/+). The HES-3 line (hPSCreg ESIBIe003; hESC) was generously provided by D. A. Elliott of the Murdoch Children’s Research Institute and Monash Immunology and Stem Cell Laboratories, Monash University64. hPSCs were cultured on Geltrex-coated plates (Gibco, A14133-02) in essential 8 medium (Gibco, A1517001) containing 0.5% penicillin/streptomycin (Gibco, 15140-122). Cells were passaged every 4 d with 0.5 mM EDTA (Invitrogen, AM92606) in PBS without Ca2+ or Mg2+ (PBS−/−; Gibco, 10010023).

Three-dimensional cardiac induction

On day −1, 30,000–40,000 hPSCs were seeded into poly-HEMA-coated (Sigma-Aldrich, P3932) U-shaped 96-well plates in essential 8 medium containing 2 µM thiazovivin. The basal differentiation medium was prepared by mixing 247.36 ml of DMEM/F-12 with GlutaMAX (Gibco, 31331028), 237.36 ml of IMDM (Gibco, 21980032), 5 ml of chemically defined lipid concentrate (Gibco, 11905031), 10 ml of IMDM containing 10% bovine serum albumin (BSA), 250 µl of transferrin (Roche, 10652202001) and 20 µl of α-monothioglycerol (Sigma-Aldrich, M6145). On day 0, essential 8 medium was replaced with basal medium supplemented with 10 ng ml–1 BMP4 (R&D, 314-BP), 50 ng ml–1 activin A (Sigma-Aldrich, SRP3003), 30 ng ml–1 bFGF (R&D, 233-FB-025/CF), 5 µM LY-294002 (Tocris, 1130) and 1.5 µM CHIR-99021 (Axon Medchem, 1386). On day 2, the medium was replaced with basal medium supplemented with 10 µg ml–1 insulin (Sigma-Aldrich, I9278), 10 ng ml–1 BMP4, 8 ng ml–1 bFGF, 5 µM IWP2 (Tocris, 3533) and, where indicated, 0.5 µM RA (Sigma-Aldrich, R2625). This medium was refreshed every 24 h until day 6, at which point the medium was replaced with basal medium supplemented with 10 µg ml–1 insulin, 10 ng ml–1 BMP4 and 8 ng ml–1 bFGF. This medium was refreshed 24 h later on day 7. On day 8, spheroids were embedded in a collagen I solution consisting of 2.17 mg ml–1 collagen I (Corning, 354249), 20% distilled water (Gibco, 15230162), 5% 10× DPBS (Gibco, 14080055) and 8.3 mM NaOH freshly added to medium consisting of DMEM/F-12 with 20% fetal bovine serum, 1% non-essential amino acids (Gibco, 11140050), 1% penicillin–streptomycin–glutamine (Gibco, 10378016) and 0.1 mM β-mercaptoethanol (Sigma-Aldrich, M7522). Gel sheets were transferred to maintenance medium consisting of basal medium supplemented with 10 µg ml–1 insulin and 0.5% penicillin–streptomycin, and plates were placed on a rocking shaker (Assistant) at 40 r.p.m. Where indicated, 100 ng ml–1 VEGF (R&D, 293-VE-010) was freshly added to the medium at each medium change from this point on. For long-term culture, maintenance medium was replaced every 2–3 d.

Cell culture treatments

In cell–cell interaction experiments, epicardioids were treated with 0.25 µM, 0.5 µM or 1 µM linsitinib (Tocris, 7652) or 200 µg ml–1 or 500 µg ml–1 NRP2 blocking antibody (R&D, AF2215) in maintenance medium on days 11, 12, 13 and 14. Spheroids differentiated without RA were treated with 25 ng ml–1, 50 ng ml–1 or 100 ng ml–1 recombinant human IGF2 (R&D, 292-G2) in maintenance medium on days 11, 12, 13 and 14. DMSO was used as a vehicle control.

To induce hypertrophy, day 30 epicardioids were treated with 25 nM or 50 nM ET1 (Sigma-Aldrich, E7764) in maintenance medium for 6 d, and the medium was replaced every day. Epicardioids were then either dissociated with papain for reseeding, as described later, dissociated with TrypLE Express (Gibco, 12605010) for 15 min at 37 °C for RNA extraction or fixed.

Lineage tracing

Generation of the AAVS1-CAG-FRT-flanked STOP-mKate2-HA reporter line

To construct the donor plasmid pAAVS1-CAG-FRT-flanked STOP-mKate2-HA-poly(A), the pCAFNF-green fluorescent protein (pCAFNF-GFP) plasmid (Addgene, 13772) was digested with SpeI and SalI, and the CAG-FRT-flanked STOP cassette (CAG promoter and neomycin resistance gene flanked by FRT sites) was cloned into the pAAVS1-Nst-MCS vector (Addgene, 80487), which was digested with SpeI and SalI. The simian virus 40 poly(A) (Sv40-poly(A)) signal was then amplified by PCR from the pCAFNF-GFP plasmid using primers containing PacI restriction sites at the 5′ end and EcoRI restriction sites at the 3′ end and introduced into the pAAVS1-CAG-FRT-flanked STOP plasmid, digested with PacI and EcoRI. The mKate2 coding sequence fused to an HA tag was amplified by PCR from the p3E-mKate2-HA no-pA plasmid (Addgene, 80810) as a template and inserted into SwaI–PacI sites on the pAAVS1-CAG-FRT-flanked STOP-poly(A) plasmid. Primers used for cloning and sequencing of the pAAVS1-CAG-FRT-flanked STOP-mKate2-HA-poly(A) construct are listed in Supplementary Table 7.

Healthy control hiPSCs (hPSCreg MRIi003-A; 1 × 10⁶) were nucleofected with 1 µg of pXAT2 plasmid (Addgene, 80494) containing sequences for an AAVS1 locus-specific single guide RNA (GGG GCC ACT AGG GAC AGG AT) and the Cas9 nuclease and 3 µg of donor construct (pAAVS1-CAG-FRT-flanked STOP-mKate2-HA-poly(A)) following the Lonza Amaxa 4D Nucleofector protocol for human stem cells. Cells were subsequently plated onto Matrigel-coated (BD, 354277) six-well plates (Nunclon, 150687) in mTeSR1 (Stemcell Technologies, 05854) with 10 μM thiazovivin. Twenty-four hours later, and every day afterward, the medium was replaced with fresh mTeSR1. Three days after nucleofection, 150 μg ml–1 neomycin (Gibco, 10131) was added into the mTeSR1 for selection for 2 weeks. When the hiPSC colonies were large enough, cells were dissociated with Accutase (Thermo Fisher Scientific, A11105-01) and replated for single-clone expansion at low density (1,000 cells per 10-cm Matrigel-coated dish). Single clones were then picked for PCR genotyping and further expansion into wells of a Matrigel-coated 96-well plate (Nunclon, 161093). The genotype of the selected clones was verified by PCR screening and confirmed by Sanger sequencing (Eurofins MWG Operon; primers listed in Supplementary Table 7).

Karyotype analysis after editing was performed at the Institute of Human Genetics of the Technical University of Munich using G-banding (20 metaphases counted). Three of ten potential off-target sites predicted by the CRISPOR tool (https://crispor.tefor.net) were amplified and verified by Sanger sequencing (primers are listed in Supplementary Table 7). To verify correct reporter expression, positive hiPSC clones (1 × 10⁶) were nucleofected with 3 µg of pCAGGS T2A FLPo plasmid (containing the coding sequence of puromycin in frame with FLPo; Addgene, 124835) and kept in culture as described above. Three days after nucleofection, antibiotic selection with 0.2 μg ml–1 puromycin (Calbiochem, 540411) was induced for 10 d. Cells were then fixed and immunostained with anti-HA tag as described later (antibodies are listed in Supplementary Tables 8 and 9).

Generation of lentiviral CDH1 and MAB21L2 promoter reporter constructs and lineage tracing of JCF and mesothelial epicardium

For the generation of the lentiviral transfer vector carrying an FLP under control of the human ∼1.37-kilobase (kb) CDH1 promoter, red fluorescent protein (RFP) from the lentiviral pHAGE-E-cadherin-promoter-RFP plasmid (Addgene, 79603) was replaced by an FLP from the plasmid pCAGS-T2A-FLP (Addgene, 123845). Lentiviral transfer vectors carrying a tamoxifen-inducible FLP under the control of the human ∼1.88-kb MAB21L2 promoter (chromosome 4: 150581151–150583029) were synthesized by VectorBuilder.

Lentiviruses were produced in HEK293T cells by transient cotransfection of the lentiviral transfer vector, the CMVΔR8.74 packaging plasmid and the VSV.G envelope plasmid using Fugene HD (Promega, E2311). Viral supernatants were collected after 48 h and used for infection of epicardioids derived from the AAVS1-CAG-FRT-flanked STOP-mKate2-HA reporter hiPSCs in the presence of 8 µg ml–1 polybrene (Sigma-Aldrich, 107689).

For lineage tracing of JCF cells, epicardioids were infected at day 3 with the MAB21L2-promoter-FLPERT2 lentivirus, and 2.5 µM 4-OHT (Sigma-Aldrich, H6278) was applied at days 4 and 5 or days 7 and 8 to induce FLP expression. Epicardioids were then collected at day 8 or day 12 for immunofluorescence analysis. For lineage tracing of mesothelial epicardial cells, epicardioids were infected at day 15 with the CDH1-promoter-FLP lentivirus and collected at day 18 or day 24 for immunofluorescence analysis.

Immunofluorescence analysis

Cryosections of spheroids were prepared as described by Lancaster and Knoblich, with some modifications65. Briefly, spheroids were washed with DPBS and fixed with 4% paraformaldehyde (Sigma-Aldrich, 158127) for 1 h at room temperature. After washing three times with DPBS, spheroids were kept in 30% sucrose at 4 °C overnight and embedded in a solution of 10% sucrose and 7.5% gelatin in DPBS before freezing in a 2-methyl-butane bath (Sigma-Aldrich, M32631) cooled with liquid nitrogen and transferring to −80 °C. Cryosections prepared with a Microm HM 560 cryostat (Thermo Fisher Scientific) were transferred onto poly-l-lysine slides (Thermo Fisher Scientific, J2800AMNT) and stored at −80 °C.

For immunostaining, samples were washed with DPBS and fixed with 4% paraformaldehyde at room temperature for 15 min (cells) or 10 min (cryosections). After washing three times with DPBS, samples were permeabilized with 0.25% Triton X-100 (Sigma-Aldrich, X100) in DPBS for 15 min at room temperature. After washing another three times with DPBS, samples were blocked with 3% BSA in DPBS + 0.05% Tween 20 (PBST; Sigma-Aldrich, P2287) for 1 h at room temperature. Primary antibodies (Supplementary Table 8) were then added at the indicated dilutions in 0.5% BSA in PBST and incubated overnight at 4 °C. After washing three times for 5 min (cells) or five times for 10 min (cryosections) with PBST, appropriate secondary antibodies (Supplementary Table 9) diluted 1:500 in 0.5% BSA (Sigma-Aldrich, A9647) in PBST were added for 1 h (cells) or 2 h (cryosections) at room temperature protected from light. After repeating the previous washing steps, Hoechst 33258 (Sigma-Aldrich, 94403) was added at a final concentration of 5 µg ml–1 in DPBS for 15 min at room temperature protected from light. Samples were mounted with fluorescence mounting medium (Dako, S3023) and stored at 4 °C until imaging with an inverted or confocal laser-scanning microscope (DMI6000B and TCS SP8, Leica Microsystems). Images were acquired and processed using the Leica Application Suite X software (v3.5.7.23225).

Cell preparation for single-cell sequencing

Epicardioids were dissociated to single cells using papain, as previously described66, by adapting the number of pooled epicardioids and dissociation time to the stage of development (Supplementary Table 10). Briefly, a 2× papain solution consisting of 40 U ml–1 papain (Worthington Biochemical, LS003124) and 2 mM l-cysteine (Sigma-Aldrich, C6852) in PBS−/− was incubated for 10 min at 37 °C to activate the papain before diluting 1:2 in PBS−/− to obtain the 1× solution. Spheroids were then removed from the collagen gel if necessary and washed twice with 2 mM EDTA in PBS−/−. Spheroids were then dissociated in 750 µl of 1× papain solution at 37 °C and 750 r.p.m. on a thermomixer (Eppendorf). The enzymatic reaction was stopped with 750 µl of stop solution consisting of 1 mg ml–1 trypsin inhibitor (Sigma-Aldrich, T9253) in PBS−/−. After pipetting up and down approximately 30 times to obtain a single-cell suspension, cells were passed through a 40-µm strainer and washed with 5 ml of 1% BSA (Gibco, 15260037) in PBS−/−. After centrifugation for 3 min at 200g, cells were resuspended in 500 µl of 0.5% BSA in PBS−/− for counting with trypan blue. For samples exceeding 15% cell death, dead cells were immediately depleted using a dead cell removal kit (Miltenyi Biotec, 130-090-101), according to the manufacturer’s instructions, before further processing. Cells from the same cell suspension were then used for scRNA-seq and scATAC-seq as described below.

scRNA-seq

After dissociation, samples were processed for scRNA-seq with a targeted cell recovery of 8,000. To generate Gel Bead-In-EMulsions (GEMs) and single-cell sequencing libraries, the Chromium Single Cell 3′ GEM Library & Gel Bead kit v3 (10x Genomics, 1000092), Chromium Chip B Single Cell kit (10x Genomics, 1000073) and Chromium i7 Multiplex kit v2 (10x Genomics, 120262) were used for samples from days 2 to 15, and the Chromium Next GEM Single Cell 3′ Library & Gel Bead kit v3.1 (10x Genomics, 1000128), Chromium Single Cell G Chip kit (10x Genomics, 1000127) and Single Index kit T set A (10x Genomics, 1000213) were used for the day 30 sample. Quality control of cDNA samples was performed on a Bioanalyzer (Agilent) using a high-sensitivity DNA kit (Agilent, 5067-4626). Library quantification was performed with the KAPA quantification kit (KAPA Biosystems, KK4824) following the manufacturer’s instructions. Libraries were pooled and sequenced using a NovaSeq S1 flow cell (Illumina) with 150-base pair (bp) paired-end reads with 28 cycles for read 1, 91 cycles for read 2, 8 cycles for i7 and 0 cycles for i5 and with a read depth of at least 25,000–30,000 paired-end reads per cell.

The Cell Ranger pipeline (v6.1.1) was used to perform sample demultiplexing and barcode processing and to generate the single-cell gene counting matrix. Briefly, samples were demultiplexed to produce a pair of FASTQ files for each sample. Reads containing sequence information were aligned using the reference provided with Cell Ranger (v6.1.1) based on the GRCh37 reference genome and ENSEMBL gene annotation. PCR duplicates were removed by matching the same unique molecular identifier (UMI), 10x barcode and gene and collapsing them to a single UMI count in the gene–barcode UMI count matrix. All the samples were aggregated using Cell Ranger with no normalization and treated as a single dataset. The R statistical programming language (v3.5.1) was used for further analysis.

The count data matrix was read into R and used to construct a Seurat object (v4.1.1). The Seurat package was used to produce diagnostic quality control plots and to select thresholds for further filtering. A filtering step was used to detect outliers and cells with high numbers of mitochondrial transcripts. These preprocessed data were then analyzed to identify variable genes, which were used to perform a principal-component analysis (PCA). Statistically significant PCs were selected by PC elbow plots and used for uniform manifold approximation and projection (UMAP) analysis. The clustering resolution parameter was set to 1 for the function FindClusters() in Seurat. For subclustering analysis, we used the clustree package (v0.4.3). All differentially expressed genes (DEGs) were obtained using a Wilcoxon rank-sum test with a P value threshold of ≤0.05. P values were adjusted by Bonferroni correction using all features in the dataset. For cell-type-specific analyses, single cells of each cell type were identified using the FindConservedMarkers function, as described within the Seurat pipeline. Cellular dynamics were inferred based on the kinetics of gene expression using RNA velocity21. Analysis of cell–cell interactions was performed with CellPhoneDB v2.1.7 (ref. 30). For all the gene signatures analyzed, we used a function implemented in the yaGST R package v2017.08.25 (https://rdrr.io/github/miccec/yaGST/)67.
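As a minimal sketch of the multiple-testing step described above (not the Seurat implementation), Bonferroni adjustment multiplies each raw P value by the number of features tested and caps the result at 1. The gene names, P values and feature count below are hypothetical, and applying the ≤0.05 threshold to the adjusted values is an assumption.

```python
def bonferroni_adjust(p_values, n_features):
    """Multiply each raw P value by the number of tested features, capped at 1."""
    return [min(1.0, p * n_features) for p in p_values]

# Hypothetical raw P values for three genes in a dataset with 20,000 features.
raw = {"GENE_A": 1e-9, "GENE_B": 2e-6, "GENE_C": 0.01}
adjusted = dict(zip(raw, bonferroni_adjust(raw.values(), n_features=20_000)))

# Assumption: the 0.05 threshold is applied to the adjusted P values.
significant = [gene for gene, p in adjusted.items() if p <= 0.05]
print(significant)
```

GENE_C illustrates why the correction matters: a raw P value of 0.01 no longer passes once all features are accounted for.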

For analysis of the 2D epicardium scRNA-seq dataset from Gambardella et al.11, we downloaded the raw data from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122827. Reads containing sequence information were aligned using the GRCh37 reference genome and ENSEMBL gene annotation, as used for the data generated in our study. The Seurat pipeline (v4.0.1) was used to produce diagnostic quality control plots and to select thresholds for further filtering to get the UMAP plot presented in Extended Data Fig. 6a.

To compare our dataset from day 15 and day 30 with a published scRNA-seq dataset of human embryonic heart development18, we downloaded the UMI counts of the Cui et al. dataset from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE106118. Identification of common genes between the Cui et al. dataset and ours was based on Homo sapiens gene symbols. Filtering of the data and annotating cell types were performed based on cell identity information provided in ref. 18. For the earliest epicardial population (referred to as proepicardial), no unique identifier was provided, and these cells were identified based on a de novo clustering of the Cui et al. dataset (Seurat pipeline with t-distributed stochastic neighbor embedding and standard settings), which allowed the identification of a distinct cluster of cells from the 5-week time point corresponding to the proepicardial transcriptional profile described in their manuscript. For correlation analysis, cell-type-specific genes were selected through differential expression analysis between the various cell types in the Cui et al. dataset (top 30 with the lowest adjusted P value; data were analyzed by Wilcoxon rank-sum test; adjusted P value of <0.01). We calculated the average log-normalized expression values for each cluster of the day 30 dataset and the various cell types of the Cui et al. dataset and then computed the Pearson correlation based on the above-mentioned cell-type-specific markers with the function cor() of the R package stats version 4.2.2. The results were plotted as a heat map showing Pearson correlation coefficients in pseudocolor.
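The correlation step can be illustrated with a short sketch. It assumes each cluster or cell type is summarized by its average log-normalized expression over the selected marker genes; the function names, gene order and values are hypothetical, and the analysis itself used cor() in R.

```python
from math import sqrt

def mean_profile(cells):
    """Average expression per marker gene across cells (rows = cells, columns = genes)."""
    n = len(cells)
    return [sum(column) / n for column in zip(*cells)]

def pearson(x, y):
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical log-normalized expression of four marker genes in two cells each.
organoid_cluster = [[2.0, 0.1, 1.5, 0.0], [2.2, 0.0, 1.7, 0.1]]
reference_cell_type = [[1.8, 0.2, 1.4, 0.0], [2.1, 0.1, 1.6, 0.2]]

r = pearson(mean_profile(organoid_cluster), mean_profile(reference_cell_type))
print(round(r, 3))
```

Repeating this for every cluster and reference cell type pair yields the matrix of coefficients that a heat map would display.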

scATAC-seq

After dissociation, nuclei isolation for scATAC-seq was performed following the recommendations of 10x Genomics. Briefly, ~500,000 cells from each sample were transferred to a 1.5-ml microcentrifuge tube and centrifuged at 300g for 5 min at 4 °C. The supernatant was removed without disrupting the cell pellet, and 100 μl of chilled lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween 20, 0.01% NP-40 substitute, 0.01% digitonin and 1% BSA) was added and mixed by pipetting ten times. Samples were then incubated on ice for 30–120 s (the optimal incubation time was determined in advance for each time point). Following lysis, 1 ml of chilled wash buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween 20 and 1% BSA) was added and mixed by pipetting. Nuclei were centrifuged at 500g for 5 min at 4 °C, the supernatant was removed without disrupting the pellet, and nuclei were resuspended in the appropriate volume of chilled diluted nuclei buffer (10x Genomics) to obtain a nuclei concentration suitable for a target nuclei recovery of 8,000.

Samples were then processed using the Chromium Next GEM Single Cell ATAC Library & Gel Bead kit v1.1 (10x Genomics, 1000175), Chromium Single Cell H Chip kit (10x Genomics, 1000161) and Chromium Single Index kit N, set A (10x Genomics, 1000212) to generate GEMs and scATAC-seq libraries. Libraries were pooled and sequenced using a NovaSeq S1 flow cell (Illumina) with 150-bp paired-end reads with 50 cycles for reads 1 and 2, 8 cycles for i7 and 16 cycles for i5 and with a read depth of at least 25,000–30,000 paired-end reads per cell.

Sequencing raw data were processed using 10x Genomics Cell Ranger ATAC 1.2.0. Before alignment to the human reference genome, the ATAC-seq sequences were quality checked using FastQC. The parameters evaluated were (1) total number of reads, (2) sequencing length distribution, (3) sequence quality per base and (4) duplication level. Metrics were homogeneous among all samples, with on average more than 91% of bases having a Q score of ≥30 and ≤15% duplicates. All samples were aggregated, and joint peak calling was performed using Cell Ranger ATAC aggr with no normalization.

R (v4.1.3) was used for further analysis of the count matrices using Signac68 (v1.7.0) and Seurat69 (v4.1.1). Quality control metrics (total number of fragments, transcription start site (TSS) enrichment score, nucleosome signal, the percentage of reads in peaks and the ratio of reads in genomic blacklist regions) were computed using Signac. Cells were filtered based on the following cutoffs: total number of fragments between 1,000 and 100,000 fragments per cell, TSS enrichment score between 2 and 10, nucleosome signal of <10, fraction of reads in peaks of >0.2 and blacklist ratio of <0.015. Doublets were detected and filtered out using AMULET70 v1.1, which finds cells that have significantly more regions with more than two aligned reads in one position than expected across the genome.
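The cell-level cutoffs listed above can be written as a single predicate. This is a sketch rather than the Signac code used in the study: the dictionary keys and example cells are hypothetical, and treating the "between" bounds as inclusive is an assumption.

```python
def passes_qc(cell):
    """Return True if a cell's metrics fall inside every cutoff stated in the text."""
    return (
        1_000 <= cell["n_fragments"] <= 100_000   # bounds assumed inclusive
        and 2 <= cell["tss_enrichment"] <= 10     # bounds assumed inclusive
        and cell["nucleosome_signal"] < 10
        and cell["frac_reads_in_peaks"] > 0.2
        and cell["blacklist_ratio"] < 0.015
    )

# Hypothetical per-cell metrics.
cells = [
    {"id": "cell_1", "n_fragments": 5_000, "tss_enrichment": 5.0,
     "nucleosome_signal": 1.2, "frac_reads_in_peaks": 0.45, "blacklist_ratio": 0.002},
    {"id": "cell_2", "n_fragments": 800, "tss_enrichment": 6.0,
     "nucleosome_signal": 1.0, "frac_reads_in_peaks": 0.50, "blacklist_ratio": 0.001},
    {"id": "cell_3", "n_fragments": 20_000, "tss_enrichment": 4.0,
     "nucleosome_signal": 0.9, "frac_reads_in_peaks": 0.15, "blacklist_ratio": 0.001},
]

kept = [cell["id"] for cell in cells if passes_qc(cell)]
print(kept)
```

Here cell_2 fails on fragment count and cell_3 on the fraction of reads in peaks.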

For downstream analysis, peak counts were normalized using the term frequency-inverse document frequency (tf-idf). Gene activities were calculated from the scATAC-seq data using Signac and log normalized with a normalization factor of 10,000.
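The two normalizations can be sketched as follows. The tf-idf function shows one common formulation (term frequency per cell multiplied by inverse peak frequency); the exact variant and any additional log scaling applied by Signac are not specified in the text, so this form is an assumption. The log normalization follows the usual log(1 + count / total × scale factor) convention with the stated factor of 10,000.

```python
from math import log1p

def tf_idf(counts):
    """counts: cells x peaks matrix as a list of lists. Returns TF x IDF values."""
    n_cells = len(counts)
    peak_totals = [sum(column) for column in zip(*counts)]
    result = []
    for cell in counts:
        depth = sum(cell)
        row = []
        for j, count in enumerate(cell):
            if depth == 0 or peak_totals[j] == 0:
                row.append(0.0)  # avoid division by zero for empty cells or peaks
            else:
                tf = count / depth                 # share of the cell's counts in this peak
                idf = n_cells / peak_totals[j]     # rarer peaks receive a larger weight
                row.append(tf * idf)
        result.append(row)
    return result

def log_normalize(cell_counts, scale_factor=10_000):
    """Log-normalize one cell's gene activity counts."""
    total = sum(cell_counts)
    return [log1p(count / total * scale_factor) for count in cell_counts]

# Hypothetical toy matrix: two cells, two peaks.
print(tf_idf([[1, 0], [1, 1]]))
print(log_normalize([1, 1]))
```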

Multiomic analyses

Integration of scRNA-seq and scATAC-seq data

The unmatched modalities were integrated using GLUE46 v0.2.3. The RNA modality input was preprocessed by first selecting the top 2,000 highly variable genes using scanpy71 (v1.9.1) with flavor ‘seurat_v3’. The features were then log normalized, and dimensionality reduction was performed using a PCA with 100 components. The PCA embedding was used as a first encoder transformation of the model. For the ATAC modality, we applied latent semantic indexing for dimension reduction as implemented in GLUE. GLUE takes a guidance graph as input that links both modalities. We used the default implementation that links an ATAC peak to a gene if it overlaps either the gene body or promoter region.

To match cells from both modalities, we performed minimum cost maximum flow bipartite matching on the joint embedding derived from GLUE as described and used previously50,72. The cost graph was inferred using get_cost_knn_graph with knn_k = 15, null_cost_percentile=99 and capacity_method = ‘uniform’. Using the bipartite matches, we matched each ATAC cell to an RNA cell. In cases where no ATAC match was found for an RNA cell, we used only the RNA information. The latent vector of the cell was calculated as the average latent vector of the matched cells. Gene activities were further denoised with MAGIC73 by smoothing over nearby cells in the joint embedding as proposed and benchmarked in ArchR74. The Python implementation of magic (v3.0.0) was used to smooth gene activities over the k-nearest neighbors graph of the joint embedding with k = 15 neighbors, decay = 1 and k-nearest neighbors autotune parameter ka = 4.

Clustering, DEGs and visualization

Leiden clustering75 was performed on the 15-nearest-neighbor graph that was calculated on the latent embedding from GLUE. We used the scanpy71 (v1.9.1) function scanpy.tl.leiden with the resolution set to 1. All DEGs were obtained with the Wilcoxon rank-sum test (scanpy.tl.rank_genes_groups) and corrected for multiple testing using the Benjamini–Hochberg method. We applied a significance threshold of 0.05 to the false discovery rate (FDR)-adjusted P values. For visualization, a 2D UMAP76 of integrated latent space was generated based on the 15-nearest-neighbor graph.
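For readers unfamiliar with the Benjamini–Hochberg procedure, a minimal stdlib sketch is given here; it is not the scanpy implementation, and the P values are hypothetical. Each P value is scaled by m/rank, and a running minimum from the largest rank downward enforces monotonicity.

```python
def benjamini_hochberg(p_values):
    """Return FDR-adjusted P values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest to largest P
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):                         # walk from the largest P downward
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.5]
fdr = benjamini_hochberg(raw_p)
is_deg = [q <= 0.05 for q in fdr]
print(fdr, is_deg)
```

In this toy case, two genes with raw P values below 0.05 no longer pass after adjustment.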

Inference of cell fate trajectories

Loom files containing raw spliced and unspliced counts were obtained by running the velocyto command line tool21. RNA velocity was calculated on the spliced and unspliced reads of the metacells using scVelo (v0.2.4)77. Moments were computed on the 2,000 highly variable features. The RNA velocity was inferred using the function scvelo.tl.velocity with mode = ‘dynamical’. Palantir78 was used with the default parameters to infer a pseudotime on the integrated dataset. The root cell was chosen based on the diffusion coefficient. We then used CellRank47 (v1.5.1) to compute lineages and absorption probabilities into terminal cell states. The transition matrix was constructed by combining a velocity kernel and a pseudotime kernel with weights of 0.3 and 0.7, respectively, to mainly capture the joint pseudotime. Terminal states were inferred using the compute_macrostates function with n_states = 15. Absorption probabilities for each of the terminal states were computed with the GPCCA estimator.
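The kernel combination amounts to a weighted sum of two cell-to-cell transition matrices. The sketch below uses plain lists and hypothetical 2 × 2 matrices; it does not call the CellRank API. It shows that with weights of 0.3 and 0.7 the result remains row-stochastic when both inputs are.

```python
def combine_kernels(velocity, pseudotime, w_velocity=0.3, w_pseudotime=0.7):
    """Weighted sum of two row-stochastic transition matrices (lists of rows)."""
    return [
        [w_velocity * v + w_pseudotime * p for v, p in zip(v_row, p_row)]
        for v_row, p_row in zip(velocity, pseudotime)
    ]

# Hypothetical transition probabilities between two cells.
velocity_kernel = [[0.5, 0.5], [0.2, 0.8]]
pseudotime_kernel = [[0.9, 0.1], [0.0, 1.0]]

combined = combine_kernels(velocity_kernel, pseudotime_kernel)
row_sums = [sum(row) for row in combined]
print(combined, row_sums)
```

Because the pseudotime kernel carries the larger weight, the combined transitions follow it more closely, in line with the stated aim of mainly capturing the joint pseudotime.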

GRN inference

We constructed a GRN for JCF cells using Pando50 (v1.0.1). Pando takes the integrated metacells with RNA and ATAC measurements and constructs a GRN based on four main steps50:

1. Filtering for candidate regulatory genomic regions.
2. Scanning regions for TF binding motifs.
3. Creating region–TF pairs for each target gene.
4. Inferring relevant TF–region interactions by fitting a regression model with region–TF pairs as variables to predict the expression of the target gene.

We only included peak regions that overlap with PhastCons conserved elements79 from the alignment of 30 mammals using the Pando function initiate_grn. The conserved elements are already included in Pando, and we lifted them to the hg19 reference genome using the R package liftOver (v1.18.0). Pando contains a curated motif database that consists of binding motifs from JASPAR (2020 release)80 extended by motifs from the CIS-BP database81. We considered all TFs and their motifs that were found in the top 4,000 highly variable genes to be relevant. Subsequently, selected peak regions were scanned for motifs using the Pando function find_motifs. We then used the Pando function infer_grn to fit a linear model for each target gene to infer interactions between TF binding site pairs and the gene. TF binding sites in peak regions were considered for a target gene if they overlapped the gene body or 100 kb upstream of the TSS.
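The rule for assigning peak regions to a target gene (overlap with the gene body or with the 100 kb upstream of the TSS) can be sketched as an interval check. Coordinates are hypothetical and treated as closed intervals; handling the upstream direction by strand is an assumption, as the text does not describe how Pando resolves it internally.

```python
UPSTREAM_BP = 100_000  # upstream extension stated in the text

def search_window(gene_start, gene_end, strand):
    """Gene body extended by 100 kb upstream of the TSS (strand-aware, assumed)."""
    if strand == "+":
        return max(0, gene_start - UPSTREAM_BP), gene_end  # TSS at gene_start
    return gene_start, gene_end + UPSTREAM_BP              # TSS at gene_end

def peak_is_candidate(peak_start, peak_end, gene_start, gene_end, strand):
    """True if the peak overlaps the search window of the gene."""
    window_start, window_end = search_window(gene_start, gene_end, strand)
    return peak_start <= window_end and peak_end >= window_start

# Hypothetical gene at 500,000-520,000 and two hypothetical peaks.
upstream_peak = (410_000, 410_500)
downstream_peak = (530_000, 530_500)
print(peak_is_candidate(*upstream_peak, 500_000, 520_000, "+"))
print(peak_is_candidate(*downstream_peak, 500_000, 520_000, "+"))
```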

Gene module construction

The inferred network was further pruned using the Pando function find_modules. Briefly, Pando assesses significance of the inferred coefficients using analysis of variance (ANOVA) and corrects for multiple testing using the Benjamini–Hochberg method. We applied a significance threshold of 0.05 to the FDR-adjusted P values. The inferred connections to target genes were then summarized into positive and negative modules of a TF. The module activity of a TF can be represented by the expression of the set of target genes that it regulates. We calculated the gene module activity with the Seurat function AddModuleScore with all genes included in the Pando model as the set of background genes.

Visualization of GRN

The GRN was visualized using the Pando functions get_network_graph and plot_network_graph with the option umap_method = ‘weighted’, which computes a UMAP embedding of the TFs in the graph based on coexpression and regulatory relationship as measured by the inferred coefficients. Nodes are sized by the PageRank centrality of each TF. To determine whether a TF is more important for the epicardial or the CM lineage, we computed an absorption probability weighted expression50. Specifically, we multiplied the z-scaled epicardial absorption probability by the expression of a TF in each cell and formed the average over all cells. This way, TFs that show a strong expression correlation with the epicardial absorption probabilities will have a positive weighted expression, while TFs that correlate with the CM lineage will have a negative weighted expression.
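A small sketch of the absorption-probability-weighted expression makes the sign convention concrete. The values are hypothetical, and the use of the sample standard deviation for z-scaling is an assumption.

```python
from statistics import mean, stdev

def z_scale(values):
    """Center to mean 0 and scale to unit (sample) standard deviation."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

def weighted_expression(tf_expression, epicardial_absorption):
    """Average over cells of TF expression times z-scaled absorption probability."""
    z = z_scale(epicardial_absorption)
    return mean([expr * weight for expr, weight in zip(tf_expression, z)])

# Hypothetical epicardial absorption probabilities for four cells.
absorption = [0.1, 0.2, 0.8, 0.9]
epicardial_like_tf = [0.0, 0.0, 2.0, 2.0]  # expressed where absorption is high
cm_like_tf = [2.0, 2.0, 0.0, 0.0]          # expressed where absorption is low

print(weighted_expression(epicardial_like_tf, absorption))
print(weighted_expression(cm_like_tf, absorption))
```

The first TF receives a positive score and the second a negative one, matching the interpretation given above.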

Branch-specific TF activity

We first clustered the JCF cell population into cells with more epicardial and more CM potential based on our previous CellRank analysis results. Using the absorption probabilities into both fates as features, we applied k-means clustering as implemented in the scikit-learn package (v1.1.1) with k = 2. Branch-specific TF activity was defined as the product of the mean TF expression per branch and Pando coefficient for all downstream targets.
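The branch assignment and activity score can be sketched as below. The study used scikit-learn; this stdlib version of Lloyd's algorithm with a deterministic initialization is only an illustration, and the cells, expression values and coefficient are hypothetical. The activity function reflects one reading of the definition (mean TF expression in a branch multiplied by a single TF-target coefficient).

```python
from math import dist

def two_means(points, n_iter=50):
    """Lloyd's k-means with k = 2 on tuples of absorption probabilities."""
    # Assumed initialization: the lexicographically smallest and largest points.
    centers = [min(points), max(points)]
    labels = []
    for _ in range(n_iter):
        labels = [0 if dist(p, centers[0]) <= dist(p, centers[1]) else 1 for p in points]
        new_centers = []
        for k in (0, 1):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center unchanged
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def branch_tf_activity(tf_expression, labels, coefficient):
    """Mean TF expression per branch multiplied by a TF-target coefficient."""
    activity = {}
    for k in sorted(set(labels)):
        values = [e for e, label in zip(tf_expression, labels) if label == k]
        activity[k] = sum(values) / len(values) * coefficient
    return activity

# Hypothetical (epicardial, CM) absorption probabilities for four JCF cells.
cells = [(0.9, 0.1), (0.8, 0.2), (0.2, 0.8), (0.1, 0.9)]
labels, centers = two_means(cells)
activity = branch_tf_activity([2.0, 1.0, 0.5, 0.5], labels, coefficient=0.4)
print(labels, activity)
```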

Subclustering of epicardial cells

To determine different lineages in the epicardial cells, we filtered all cells in clusters 17 and 14 originating from day 15 and recomputed the neighborhood graph on the metacell embedding with n_neighbors = 15. Leiden clustering75 was performed with a resolution of 0.7. We again used CellRank47 (v1.5.1) to get a more fine-grained set of terminal states. As for the inference of cell fate trajectories, the transition matrix was constructed by combining a velocity kernel and a pseudotime kernel with weights of 0.3 and 0.7, respectively. We used partition-based graph abstraction82 to infer the connectivity of the inferred clusters. The graph was further pruned to only contain edges with a connectivity score of >0.2. Imputed gene activities and gene expression were visualized along paths in the abstracted graph using the function scanpy.pl.paga_path.

Vibratome sectioning

To prepare live sections, spheroids were removed from the collagen gel and placed in 4% agarose (Biozym, 840004) in sterile DPBS+/+. Once the agarose had solidified, it was trimmed down to a block of approximately 1 cm × 1 cm × 1 cm with a scalpel, and 250-µm-thick slices were cut with a vibratome (VT1200S, Leica Biosystems) in a DPBS bath, following the manufacturer’s instructions. The spheroid slices were then kept in maintenance medium for 3–5 d before functional assays.

Optical action potential measurements

For optical action potential measurements, 250-µm-thick slices of spheroids derived from the AAVS1-CAG-VSFP hiPSC line29 (hPSCreg MRIi003-A-6) were transferred to Tyrode’s solution (135 mM NaCl, 5.4 mM KCl, 1 mM MgCl2, 10 mM glucose, 1.8 mM CaCl2 and 10 mM HEPES, pH 7.35) before imaging at 100 fps on an inverted epifluorescence microscope (DMI6000B, Leica Microsystems) equipped with a Zyla V sCMOS camera (Andor Technology). The VSFP was excited at 480 nm, and the emitted GFP and RFP fluorescence signals were separated using an image splitter (OptoSplit II, Cairn Research). The fluorescence of regions of interest relative to background regions was quantified in ImageJ (National Institutes of Health), and subsequent analysis was performed in RStudio83 using custom-written scripts to determine the action potential duration at 50% (APD50) or 90% repolarization (APD90). APD50 maps were generated by aligning the split image stacks with a custom algorithm in MATLAB (The MathWorks), denoising them with the CANDLE algorithm84 and calculating the ratio between the two. For each action potential, the APD was calculated directly based on the amplitude on each pixel.
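The custom analysis scripts are not reproduced here, but a simple way to derive APD values from a single fluorescence trace sampled at 100 fps can be sketched. The trace is hypothetical, and the choice of activation time (first sample reaching 50% of the amplitude on the upstroke) and the absence of sub-frame interpolation are assumptions.

```python
def apd(trace, fraction, fps=100.0):
    """Action potential duration in ms at the given repolarization fraction."""
    baseline = min(trace)
    peak_idx = max(range(len(trace)), key=trace.__getitem__)
    amplitude = trace[peak_idx] - baseline
    # Assumed activation time: first sample at or above half amplitude on the upstroke.
    half_level = baseline + 0.5 * amplitude
    start = next(i for i in range(peak_idx + 1) if trace[i] >= half_level)
    # Repolarization time: first sample after the peak at or below the target level.
    target = trace[peak_idx] - fraction * amplitude
    end = next(i for i in range(peak_idx, len(trace)) if trace[i] <= target)
    return (end - start) * 1000.0 / fps

# Hypothetical single action potential, one value per frame (10 ms per frame).
trace = [0.0, 0.0, 1.0, 0.9, 0.8, 0.6, 0.4, 0.2, 0.05, 0.0, 0.0]
apd50 = apd(trace, 0.5)
apd90 = apd(trace, 0.9)
print(apd50, apd90)
```

At 100 fps the temporal resolution of this approach is 10 ms, so interpolation between frames would be needed for finer estimates.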

Calcium imaging

Calcium imaging was performed as previously described, with some modifications85. Briefly, 250-µm-thick spheroid slices were loaded with 1 µM Fluo-4-AM (Thermo Fisher Scientific, F14201) in Tyrode’s solution (135 mM NaCl, 5.4 mM KCl, 1 mM MgCl2, 10 mM glucose, 1.8 mM CaCl2 and 10 mM HEPES, pH 7.35) containing 0.01% Pluronic F-68 (Gibco, 24040-032) for 50 min at 37 °C. The solution was replaced with Tyrode’s solution for 30 min at 37 °C for deesterification of the dye before imaging at 100 fps on an inverted epifluorescence microscope (DMI6000B, Leica Microsystems) equipped with a Zyla V sCMOS camera (Andor Technology). Pacing was performed with field stimulation electrodes (RC-37FS, Warner Instruments) connected to a stimulus generator (HSE Stimulator P, Hugo-Sachs Elektronik) providing depolarizing pulses at the indicated frequencies. The fluorescence of regions of interest relative to background regions was quantified in ImageJ (National Institutes of Health), and subsequent analysis was performed in RStudio83 using custom-written scripts to determine the transient duration at 50% or 90% decay.

Quantitative real-time PCR

Total RNA was isolated from cells using the Absolutely RNA Microprep kit (Agilent, 400805), and cDNA was prepared using the high-capacity cDNA RT kit (Applied Biosystems, 4368814) according to the manufacturers’ instructions. Quantitative real-time PCR was performed using Power SYBR Green PCR master mix (Applied Biosystems, 4368706; primers are listed in Supplementary Table 11) on a 7500 Fast real-time PCR instrument (Applied Biosystems). The mRNA expression levels of genes of interest were quantified relative to GAPDH expression using the cycle threshold (ΔCt) method.
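As a worked example of the ΔCt calculation, expression relative to GAPDH is commonly computed as 2 to the power of −ΔCt, where ΔCt is the Ct of the gene of interest minus the Ct of GAPDH. The sketch assumes 100% amplification efficiency, and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene (2^-dCt)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)

# Hypothetical Ct values: gene of interest at 25 cycles, GAPDH at 20 cycles.
rel = relative_expression(ct_target=25.0, ct_reference=20.0)
print(rel)
```

A target that crosses the threshold five cycles later than GAPDH is therefore estimated at 1/32 of the GAPDH level.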

Measurement of CM size

For cell size measurements, epicardioids were dissociated to single cells with papain as described above and reseeded at a density of 25,000 cells per cm2 on plates coated with 2 µg cm–2 fibronectin (Sigma-Aldrich, F1141). After 4 d, cells were fixed for immunofluorescence staining for cTnT and the desmosomal marker plakophilin-2 to visualize cell membranes, as described above. The area of CMs was quantified in ImageJ (National Institutes of Health).

Statistics

Statistical analysis was performed with GraphPad Prism version 9.1.0. Box-and-whiskers plots indicate the median and 25th and 75th percentiles, with whiskers extending to the 5th and 95th percentiles; bar graphs indicate the mean ± s.e.m. of all data points, unless otherwise indicated. Normally distributed data from two experimental groups were compared by Student’s t-test; otherwise a Mann–Whitney–Wilcoxon test was applied. Normally distributed data from more than two experimental groups were compared using one- or two-way ANOVA. In the case of multiple comparisons, an appropriate post hoc test was applied as indicated. A P value of <0.05 was considered statistically significant.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.