Extended Data Figure 4: Cell grouping, representative gene expression, and number of genes detected in single-cell RNA-seq data and population-based gene expression profiling during reprogramming. | Nature

From: Single-cell transcriptomics reconstructs fate conversion from fibroblast to cardiomyocyte

Related to Fig. 1a–h. a, PCA scree plot showing variance of top 10 PCs. Related to Fig. 1a–c. b, Violin plots showing the expression levels of representative cardiac, fibroblast and cell-cycle genes in the seven cell groups identified by hierarchical clustering and PCA (Fig. 1a–c). c, Gene ontology analysis from Fig. 1a showing the P value of each gene ontology term. d–g, Determination of the proliferation status of each single cell using genes periodically expressed during the cell cycle that were identified in a previous report (ref. 39). d, A 3D LLE plot calculated from the expression of these cell-cycle genes in each of the 454 cells. e, Frequency of cells in LLE component 3. The dark red plane in d and the red dotted line in e indicate the threshold separating proliferating (Pro) and non-proliferating (NP) cells. f, g, PCA plots as in Fig. 1c, but colour- and shape-coded for proliferating and non-proliferating (f) or CCA and CCI (g) cells. h, i, tSNE plots of all single cells colour- and shape-coded by hierarchical clustering and PCA cell groups (h) or proliferating and non-proliferating cell groups (i). The cells that were grouped as intermediate fibroblasts, pre-iCMs and iCMs constituted 30.6% (77 out of 252), 24.6% (62 out of 252) and 44.8% (113 out of 252) of all cells transduced with M + G + T, respectively. In contrast to previous population- and marker-based studies, our single-cell RNA-seq data suggest that the fate conversion from fibroblast to iCM occurs rapidly (approximately 3 days) with nearly 45% of the cardiac fibroblasts exhibiting transcriptomic signatures indicative of a cardiac fate. j, Live fluorescent images of day-5 MGT-transduced cardiac fibroblasts showing co-expression of α-MHC–GFP and Thy1 (surface labelling). Double-positive cells are labelled with an asterisk (*). All images were taken at 40× magnification. Scale bar, 100 μm.
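
The group proportions quoted for panels h and i can be re-derived from the cell counts given in the legend. The following is a minimal arithmetic check in Python, not part of the authors' analysis; the names `group_counts` and `percentages` are introduced here for illustration only.

```python
# Cell counts per group among the M + G + T-transduced cells, as quoted in the legend.
group_counts = {
    "intermediate fibroblast": 77,
    "pre-iCM": 62,
    "iCM": 113,
}

total = sum(group_counts.values())  # the legend states 252 cells in total


def percentages(counts):
    """Return each group's share of the total in percent, rounded to one decimal."""
    n = sum(counts.values())
    return {name: round(100 * c / n, 1) for name, c in counts.items()}


shares = percentages(group_counts)
for name, pct in shares.items():
    print(f"{name}: {group_counts[name]} out of {total} = {pct}%")
```

The three counts sum to 252 and reproduce the quoted 30.6%, 24.6% and 44.8%.
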
k, α-MHC–GFP+Thy1+ and α-MHC–GFP−Thy1+ cells were FACS-sorted from day-7 MGT-transduced cardiac fibroblasts and expression of representative cardiac (Myl4 and Actc1) and fibroblast (Col3a1 and Postn) markers was determined by qRT–PCR. Day-7 mock-transduced cells were included as control. Data are mean ± s.d. n = 4 samples. One-way ANOVA followed by Bonferroni correction (two-sided), **P < 0.01, ***P < 0.001; ns, not significant. Myl4 and Actc1 expression increased 80–100-fold and reached approximately the same level as Gapdh in α-MHC–GFP+Thy1+ cells compared to mock transduction. Expression of the fibroblast marker Postn was maintained at a high level in GFP+Thy1+ cells. For the other fibroblast marker, Col3a1, although its relative expression in GFP+Thy1+ cells was decreased compared to mock-transduced and GFP−Thy1+ cells, its absolute expression was still high compared to Gapdh (around 1.4-fold of Gapdh). The data strongly support the existence of cardiomyocyte- and fibroblast-marker double-positive pre-iCMs and suggest that the pre-iCM state represents an intermediate cell population that is transitioning from intermediate fibroblast to iCM or that is locked between intermediate fibroblast and iCM during reprogramming. l, To determine whether iCMs may be differentiated from rare cardiac stem/progenitor cells, we plotted the expression of cardiac stem/progenitor cell markers in each of the hierarchical clustering and PCA single-cell groups using violin plots. All of these markers were nearly undetectable in fibroblasts, intermediate fibroblasts, pre-iCMs and iCMs, suggesting the direct conversion from cardiac fibroblast to iCM without going through a stem/progenitor cell stage. m, Distribution of gene expression levels in single cells. Data are mean ± s.e.m. n = 454 cells. The limit of gene detection was set to 1 based on this plot. n, Distribution of the number of genes detected in all, CCI or CCA single cells.
Comparison of the distributions in CCI and CCA cells using a two-sample Kolmogorov–Smirnov test resulted in a one-sided P value of 5.248 × 10⁻¹¹, suggesting that the number of genes in CCI is significantly smaller than in CCA. On the basis of this result, only CCI cells were used in o. o, Distribution of the number of genes detected in each CCI cell group. Analysis using a one-sided, two-sample Kolmogorov–Smirnov test (P values: 0.00521 for intermediate fibroblasts versus fibroblasts, 0.00481 for pre-iCMs versus intermediate fibroblasts and 1.104 × 10⁻⁶ for iCMs versus pre-iCMs) suggests that the number of genes expressed decreased when the cells adopted the iCM fate. This observation demonstrates a dynamic re-patterning of transcription machinery during reprogramming and is consistent with the hierarchical clustering analysis and experimental evidence that pre-iCMs co-expressed both cardiac and fibroblast markers, further indicating that the pre-iCM state constitutes an intermediate population during iCM reprogramming. p–v, Population-based gene expression profiling of reprogramming cardiac fibroblasts at day 0, 3, 5, 7, 10 and 14. p, q, Results from PCA using all genes were similar to those using the top 400 genes (r–v). p, Scree plot of the top 10 PCs. q, A 3D PCA score plot. r–v, Analyses using the top 400 PCA genes. Related to Fig. 1g, h. r, Scree plot of top 10 PCs. s, PCA score plot using PC1 and PC3. t, Hierarchical clustering identified four major gene clusters: gradually upregulated during reprogramming (red, mainly cardiac genes), downregulated in MGT-transduced compared to LacZ-transduced (blue, mainly extracellular matrix (ECM) genes) and gradually upregulated (light grey) or downregulated (dark grey) in both LacZ and MGT cells (culture or viral effects, mainly ECM and immune-response genes). The results are consistent with the expression of representative genes selected from single-cell data (Fig. 1h), showing gradually increased expression of cardiomyocyte markers during reprogramming, first increased and then decreased expression of cell-cycle genes in both MGT and LacZ cells, and significantly lower fibroblast markers in MGT compared to LacZ cells at each time point. u, Heat map showing loading (weight) of the genes in t in PC1, 2 and 3. Upregulated (cardiac) genes are highly weighted in PC1, and the other three gene clusters are highly weighted in PC2 and PC3. The results are consistent with s and Fig. 1g. v, Gene ontology analysis of the four gene clusters in t, showing gene ontology terms and their corresponding P values (listed on the right).
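
Panels m–o combine two steps: counting the genes detected per cell against a detection limit of 1, and comparing the resulting distributions with a one-sided, two-sample Kolmogorov–Smirnov test. The sketch below illustrates both steps in plain Python under stated assumptions. The legend does not say which software was used, whether the P values were exact or asymptotic, or whether the limit is strict, so the strict `>` comparison, the textbook large-sample P-value approximation, the function names and the toy gene counts are all assumptions made here for illustration, not the authors' data or implementation.

```python
import math
from bisect import bisect_right

# Legend: "The limit of gene detection was set to 1". Strict ">" is an assumption.
DETECTION_LIMIT = 1.0


def genes_detected(expression_values, limit=DETECTION_LIMIT):
    """Count genes in one cell whose expression exceeds the detection limit."""
    return sum(1 for v in expression_values if v > limit)


def ks_one_sided(sample_a, sample_b):
    """One-sided two-sample K-S test of whether sample_a tends to be smaller.

    Returns (d_plus, p_approx), where d_plus is the maximum of
    F_a(x) - F_b(x) over x (empirical CDFs) and p_approx is the
    large-sample approximation exp(-2 * m*n/(m+n) * d_plus**2).
    """
    a, b = sorted(sample_a), sorted(sample_b)
    m, n = len(a), len(b)
    d_plus = 0.0
    # F_a - F_b only steps upward at points of sample_a, so checking those suffices.
    for x in a:
        f_a = bisect_right(a, x) / m
        f_b = bisect_right(b, x) / n
        d_plus = max(d_plus, f_a - f_b)
    p_approx = math.exp(-2.0 * (m * n / (m + n)) * d_plus ** 2)
    return d_plus, p_approx


# Toy, invented gene counts per cell (NOT values from the study).
toy_cci = [4100, 4300, 4500, 4700, 5300, 5500]
toy_cca = [5200, 5400, 5600, 5800, 6000, 6100]

d_plus, p_approx = ks_one_sided(toy_cci, toy_cca)
print(f"D+ = {d_plus:.3f}, approximate one-sided P = {p_approx:.3g}")
```

With only six toy cells per group the approximation is crude; for real data, an established statistics package with an exact or corrected P-value calculation would be the safer choice.
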
