Descriptor : Single-cell transcriptome and epigenomic reprogramming of cardiomyocyte-derived cardiac progenitor cells

The molecular basis underlying the dedifferentiation of mammalian adult cardiomyocytes (ACMs) into myocyte-derived cardiac progenitor cells (mCPCs) during cardiac tissue regeneration is poorly understood. We present data integrating single-cell transcriptome and whole-genome DNA methylome analyses of mouse mCPCs to understand the epigenomic reprogramming governing their intrinsic cellular plasticity. Compared to parental cardiomyocytes, mCPCs display epigenomic reprogramming with many differentially-methylated regions, both hypermethylated and hypomethylated, across the entire genome. Correlating well with the methylome, our single-cell transcriptomic data show that the genes encoding cardiac structure and function proteins are remarkably down-regulated in mCPCs, while those for cell cycle, proliferation, and stemness are significantly up-regulated. In addition, implanting mCPCs into infarcted mouse myocardium improves cardiac function with augmented left ventricular ejection fraction. This dataset suggests that the cellular plasticity of mammalian cardiomyocytes is the result of a wellorchestrated epigenomic reprogramming and a subsequent global transcriptomic alteration. Understanding cardiomyocyte epigenomic reprogramming may enable the design of future clinical therapies that induce cardiac regeneration, and prevent heart failure.


Bi-transgenic mouse model
All experimental protocols were approved by the Institutional Biosafety Committee and Animal Care and Use Committee at Cedars-Sinai Medical Center. The methods were carried out in accordance with the approved guidelines. Bi-transgenic MerCreMer/ZEG mice were produced by cross-breeding the cardiomyocyte-specific αMHC-MerCreMer mice and the ZEG reporter mice (Jackson Laboratory) as described previously 6,[26][27][28] . To induce Cre-mediated gene recombination for GFP labeling specifically in cardiomyocytes, genotype-verified double heterozygous MerCreMer-Z/EG mice (2-month old) were treated with 4-OH-tamoxifen for 2 weeks followed by a waiting period of 2 weeks as described previously 6,9 . Cardiomyocytes were isolated using enzymatic dissociation method as described 6,9 .

Adult cardiomyocyte enrichment
Adult mouse cardiomyocytes were enriched as previously described 6 . Briefly,~3 month old bi-transgenic mouse hearts were enzymatically digested to dissociate ACMs. The non-cardiomyocyte cell types were removed from a total cell population by sequential sedimentation and Percoll gradient centrifugation. The enriched cardiomyocytes were then pre-plated in preparations of over 100,000 cells onto 22 mm culture glasses and assayed for positive and negative expression markers via confocal microscopy. RT-PCR amplification of cardiomyocyte expression markers was also conducted to confirm their enrichment.

mCPC enrichment
To maximize the viability of transgenic mouse myocytes and follow their dedifferentiation, modified cardiac explant culture techniques described previously were used to generate myocyte-derived cardiac progenitor cells (mCPCs) 6,29 . Briefly, tamoxifen-treated bitransgenic mouse hearts were partially digested in calcium-free Tyrode's physiological solution supplemented with 0.15 Wünsch unit ml À 1 of collagenase made from Liberase Blendzyme 4 (Roche Applied Science), followed by washing for 3 min in KB solution. The hearts were cut into small pieces in~0.1 mm 3 and rinsed in KB solution for 3 times. Tissues were transferred to laminin-coated 10 mm tissue culture dishes, with M199 medium containing 100 U mL À 1 penicillin, 100 μg mL À 1 streptomycin, 5% FBS (Invitrogen), 25 μM Blebbistatin, ITS (5 μg ml À 1 insulin and transferrin, 5 ng ml À 1 selenium), and 10 mM β-hydroxybutyric acid for the first two days of culture. Blebbistatin, ITS, and β-hydroxybutyric acid were replaced with bFGF 0.1 ng ml À 1 and TGF-β3 1 ng ml À 1 , and FBS increased to 20% in subsequent cell culture. Medium was partially replaced every 2-3 days. Loosely adherent mCPCs were harvested by gentle pipetting 3 times with a disposable transfer pipette. Flow cytometry was used to characterize the immuno phenotypes of cardiac explant-derived cells as described previously 30 . Dedifferentiated cells spontaneously flattened, lost their striation and inward rectifier potassium current (decreased electrical capacitance), and ultimately re-entered the cell cycle as GFP-expressing mCPCs.

Microfluidic capture of single cardiomyocytes and single cardiac progenitor cells
Freshly isolated adult (~3-month old) cardiomyocytes and mCPCs (derived from GFP-myocytes) were captured using microfludic chip (Fig. 1f in the accompanying paper) 31,32 . The adult cardiomyocytes were used as controls. Briefly, isolated adult cardiomyocytes were preserved in EGTA-free KB solution 6 , while mCPCs were kept in cold PBS during microfludic separation. Single-cells were isolated by encapsulation in droplet using a microfluidic device as described previously 32,33 . Once individual cells were isolated and verified under the microscope, each was lysed with 2 μl of a cell lysis buffer for cDNA amplification and single-cell whole transcriptome profiling. Alternatively, 5 μl Cell-to-Ct lysis buffer supplemented with DNase I (Ambion) was used to lyse each single cell for the validation of gene expression by Real-Time TaqMan single qPCR assays or by TLDA qPCR array.
cDNA synthesis and amplification from single cardiomyocytes and single cardiac progenitor cells Single-cell cDNA synthesis and amplification were performed using NuGen's Ovation One-Direct System. Briefly, the first strand cDNA was synthesized from single cell lysate using a unique first strand DNA/RNA chimeric primer mix and reverse transcriptase. The primers had a DNA portion that hybridizes either to the 5′ portion of the poly (A) sequence or randomly across the transcript. Reversetranscription extended the 3′ DNA end of each primer generating first strand cDNA. The resulting cDNA/mRNA hybrid molecule contained a unique RNA sequence at the 5′ end of the cDNA strand and subsequently, double stranded cDNA with DNA/RNA heteroduplex at one end was generated. Ribo-SPIA amplification of cDNA produced enough amount of cDNA for labeling and microarray hybridization. group, was used for two types of arrays 36,37 . Restriction enzyme McrBC was used to digest genomic DNA as it recognizes the site (A/G) m C(N 40-3000 )(A/G) m C, with an optimal separation of 55-103 bp, covering nearly half of all possible 5-methylcytosine nucleotides in the genome 38 . For each sample, half of the fragmented gDNA was subjected to McrBC digestion thereby methylated cytosines were cut into smaller fragments; the other half was not treated with McrBC enzyme. Both McrBC-treated and untreated portions were fractionated by 1% agarose gel according to the CHARM protocol. The McrBC-treated portion was methyl-depleted (MD) DNA and the untreated (UT) portion represented the total genomic DNA input. Recovered MD and UT DNA from agarose gel were subsequently amplified using a GenomePlex Complete Whole Genome Amplification (WGA2) kit (Sigma) according to manufacture protocol. Amplified MD (equivalent to experimental sample) and UT (equivalent to input/control sample) samples were labeled with Cy5 and Cy3, respectively, according to standard NimbleGen Array protocol. Relative DNA Methylation in cardiomyocytes and mCPCs was analyzed using Roche NimbleGen 2.1 M mouse genome CHARM 2.1M array and NimbleGen DNA Methylation CpG Islands Pus RefSeq Promoter array (3 × 720 K). Labeled DNA samples were hybridized to arrays according to NimbleGen Array User Guide, and arrays scanned using MS 200 Microarray Scanner (Roche NimbleGen) and features extracted by NimbleScan software.

Single-cell whole-transcriptomic profiling by microarray
Validation of gene expression by TaqMan RT-qPCR and TLDA at single-cell level cDNA synthesis from individual cells of additional single cardiomyocytes and mCPCs was performed using the Cell-to-Ct protocol and pre-amplification option (Ambion). The cardiomyocyte-specific gene (Myh6) and putative progenitor and stem cell genes (c-kit and Sox2) together with two endogenous controls (Actb and Gapdh) were validated by TaqMan RT-qPCR assays (Applied Biosystems) at singlecell level. Moreover, a panel of genes for stem cell pluripotency and differentiation included in TaqMan Low Density Array (TLDA, 96 genes including house-keeping genes such as GAPDH; Applied Biosystems) were profiled for a separated subset of single cells using our optimized streamline protocols including single-cell lysis, cDNA synthesis, and pre-amplification. Real-time qPCR was performed on a 7900HT Fast Real-Time PCR System (Applied Biosystems) and data was collected and analyzed using SDS 2.3 software suite. Ct values were normalized to endogenous controls, and comparative 2 − ΔΔCt method was used to evaluate the relative gene expression in mCPCs versus cardiomyocytes 6 . DataAssist 2.0 (Applied Biosystems) was used to analyze the expression changes. As for comparison, normalized intensity of probe(s) of specific genes was used to calculate gene expression fold changes detected by microarray.

Microarray data analysis
NimbleGen Mouse CHARM 2.1M and 3 × 720 K Arrays. The CHARM package in R/Bioconductor was used to generate QC reports for NimbleGen Mouse CHARM 2.1M Array to identify globally differentially methylated regions of the genome for each biological sample 36 . The differentially methylated regions were defined by a criterion of unadjusted P-value less than 0.05 in either type of array. The regions (genes) were defined as differentially methylated if a CHARM peak was within 2,000 bases pairs upstream and/or 500 base pairs downstream of the transcription start site.
Partek Genomics Suite software (Version 6.6; Partek, Inc., St Louis, MO) was used to analyze the NimbleGen Mouse 3 × 720 K Array. The Loess normalization method was used to preprocess the raw intensity data and obtain M-values. The ANOVA method was used to identify differentially methylated regions by criterion of P-values less than 0.001 and |MAT scores|>2.
Affymetrix Mouse 430 2.0. The Partek Genomics Suite was used to assess the quality of the spike-in control probes used in the Affymetrix Mouse Genome 430 2.0 Array. The raw intensity data from each biological sample was preprocessed using Partek's RMA method as a means to generate backgroundnormalized gene expression data. Differential gene expression was analyzed using the ANOVA method in the Partek Genomics Suite. Significant differentially expressed genes were selected if they had fold-change values greater than 2 plus P ≤ 0.05 with FDR value as ≤0.05.

Data Records
We deposited the raw data sets including two different DNA methylome microarray data sets (NimbleGen Mouse CHARM 2.1M and 3 × 720 K Arrays) in pair format, and an Affymetrix GeneChip Mouse Genome 430 2.0 in CEL format (see Data Citation 1,Data Citation 2, and Data Citation 3). Each dataset contains 6 samples in two groups, i.e., mCPC versus control cardiomyocytes, with 3 biological replicates in each group. We provided qPCR and TLDA data with descriptions in figshare as well (see Data Citation 4).

Technical Validation
Quality Metrics for NimbleGen Mouse CHARM 2.1M and 3 × 720 K Arrays QC metrics were generated using the CHARM R/Bioconductor package for the two types of NimbleGen tiling arrays which were used to analyze each biological sample's methylome. Figures 2a and 3a show the three quality metrics that were calculated for the NimbleGen Mouse CHARM 2.1M and 3 × 720 K Arrays. untreated channel signal probes among the background probes, indicating the quality of hybridization and scanning. In ideal cases, a score of 100 means all signal probes rank above all background probes. In the NimbleGen Mouse CHARM 2.1M and 3 × 720 K Arrays, the signal strength scores range from 88.8 to 93.2, indicating acceptable hybridization and scanning for each sample. The signal standard deviation of untreated (column 2 in Figs 2a and 3a) and methyl-depleted (column 3 in Figs 2a and 3a) channels indicates the level of spatial artifacts in each array. In our arrays, most of the standard deviations are around 0.2 and the maximum standard deviation is only 0.34, which suggests that the arrays had limited spatial artifacts.
Quality metrics for Affymetrix Mouse Genome 430 2.0 data QC metrics were generated based on CEL files using Partek Genomics Suite, which allow us to summarize intensity values of the Affymetrix control probe-sets. A summary plot with the control probe-sets and intensity data for all six samples is depicted in Fig. 4. Figure 4a shows the hybridization concentration order of four exogenous, pre-labeled hybridization probes, which is consistent with the concentrations of the four spike-in probes (BioBoBioC oBioDoCre). Figure 4b shows the 3′/5′ ratio of beta-actin with an acceptable ratio ( o3) for all six samples, suggesting that the input RNA and cDNA derived from each single-cell sample used to generate the array data were of overall high quality. Microarray preprocessing is the most pivotal step in gene expression analyses. In our analysis, we applied the RMA method to preprocess the data, which includes background correction, normalization, and summarization. Figure 4c is the box plots of the probe intensities prior to the RMA preprocessing. Figure 4d shows the comparable range of probe sets intensities after RMA preprocessing.
Biological replicates and reproducibility of single-cell transcriptome and population cell methylome data The reproducibility of transcriptome and DNA methylome for three biological replicates within the same sample group can be used as another metric for assessing the quality of the data sets. A Principal Component Analysis (PCA) (Figs 2b, 3b and 4e,) was performed to assess the quality of reproducibility of the biological replicates. Good separation between mCPC and control samples was observed after performing a PCA on single-cell transcriptome data and on the M-values of DNA methylome data using Partek Genomics Suite. Furthermore, we generated scatter plots and calculated the pair-wise Pearson correlation coefficient (R) between any two of the three biological replicates within the same group based on 2167126 features of NimbleGen Mouse CHARM 2.1M Array, 716168 features of NimbleGen Mouse 3 × 720 K Array, and 45101 genes of Affymetrix Mouse Genome 430 2.0 Array (Figs. 2c,d and 3c,d, Fig. 4f,g). The mean r-value and standard error (s.e.) were calculated per group, yielding 6 mean r-values and 6 s.e. values with a grand mean and s.e. of 0.847 and 0.029, respectively, indicating a high-level of measurement consistency and reproducibility among the biological replicates.

Usage Notes
Briefly, we used Partek Genomics Suite and R/Bioconductor to perform data analysis in 'Technical Validation' section. In quality metrics of 'Technical Validation' section, we applied Partek Genomics Suite QC module to analyze Affymetrix Mouse 430 2.0 data. We used 'readCharm' and 'qcReport' function in CHARM package in R/Bioconductor to load data and perform QC metrics for two NimbleGen methylation tiling arrays. In biological replicates and reproducibility validation of 'Technical Validation' section, Partek Genomics Suite was used to perform 3D PCA for all three data sets. The 'pairs' function in graphics package in R was used to generate Pearson correlation figures for all three data sets.