Main

During development, cells progressively differentiate with phenotypically distinct fates. These cellular identities, established by cell type-specific gene expression programs, are sustained over cell divisions throughout an organism’s lifespan. However, this view of irreversible identity has been challenged by the discovery that somatic cells display a certain degree of plasticity in numerous contexts, including pluripotent reprogramming (hereafter called reprogramming) and oncogenic transformation (hereafter called transformation)1,2. Here, cellular plasticity is defined as the capability of cells to change identity outside normal development and tissue homeostasis3.

During reprogramming, the transcription factors (TFs) Oct4, Sox2, Klf4 and c-Myc (OSKM) trigger widespread reconfiguration of chromatin and TF occupancy, which orchestrates a gradual gain of cellular plasticity and a concomitant loss of cellular identity in mouse embryonic fibroblasts (MEFs)4. Activation of the pluripotent network occurs later on, leading to the generation of induced pluripotent stem cells (iPS cells)5,6,7,8,9,10. Despite the description of reprogramming roadmaps of diverse cell types6,9,11,12,13,14,15, the molecular mechanisms coordinating the stepwise gain of plasticity and loss of identity remain largely unknown, yet they are critical for acquiring pluripotency.

Transformation shares features with reprogramming: both processes are constrained by oncogenic barriers and subjected to significant latencies16,17,18,19,20. Moreover, premature termination of reprogramming facilitates cancer development21,22. Gain of plasticity and loss of identity are also critical in various transformation contexts1,20,23. Cancer formation frequently relies on the activation of developmental programs that increase cellular plasticity, thus fuelling tumour heterogeneity1,24. Recent findings point to a crucial role for the oncogenic variant of the K-ras gene, which harbors a substitution of glycine for aspartic acid at codon 12 (K-rasG12D), in triggering such changes. When combined with c-Myc exogenous expression and p53 depletion, K-rasG12D drives changes of cellular plasticity early during MEF transformation18,25,26. In the lung and pancreas, K-rasG12D alters the identity of specific cell types and increases their plasticity, fostering early tumorigenesis27,28,29.

Altogether, these findings indicate that coordinated changes of cellular plasticity and identity are crucial for reprogramming and transformation. However, the sequence of events that orchestrate these modifications, as well as their degree of interdependency, have never been comparatively dissected. Tackling these questions is instrumental for the safe design of regenerative strategies based on in vivo reprogramming30,31 but also to discover regulators of cancer cell plasticity1,24. Here, we combined a variety of single-cell, multiomics and phenotypic assays to address these questions. First, by defining the single-cell trajectories of reprogramming and transformation, we unveiled that both processes intersect early before they bifurcate. Next, we identified the TF B cell leukaemia/lymphoma 11B (Bcl11b) as a crucial regulator of reprogramming and transformation but also as a pertinent marker that, when combined with thymus cell antigen 1 theta (Thy1), delineates an ordered sequence of cellular intermediates emerging during reprogramming and transformation. Multiomics characterization led us to unveil a regulatory axis, centred on the TF atonal bHLH transcription factor 8 (Atoh8), that acts as a broad-range lock of cellular plasticity during reprogramming, neuron transdifferentiation and transformation.

Results

Deciphering and comparing reprogramming and transformation

We developed a mouse model, entitled repro-transformable, to conditionally induce reprogramming or transformation in the same population of cells (Fig. 1a). OSKM was selected as the prototypical cocktail of reprogramming4,30,31,32. The cooperation between K-rasG12D and c-Myc was chosen as it triggers MEF transformation18,25,26. R26rtTA;Col1a14F2A mice33 carrying an inducible OSKM cassette were crossed with LSL-K-rasG12D;R26cre-ERT2 mice harbouring an excisable K-rasG12D allele34 and MEFs were derived (Fig. 1a). Doxycycline treatment led to the formation of iPS colonies (15 days (d); efficiency = 0.21 ± 0.1%) expressing Nanog and Ssea1 (Fig. 1b) and capable of undergoing multilineage differentiation in teratomas (Fig. 1c). Tamoxifen treatment to induce K-rasG12D, combined with c-Myc expression (Fig. 1a), triggered transformation after serial passaging (30 d). Foci assays indicated clonal loss of contact inhibition (efficiency = 0.66 ± 0.3%) (Extended Data Fig. 1a). Soft agar assays revealed the acquisition of anchorage-independent growth potential (Extended Data Fig. 1b). Injection of transformed cells (TC) into mice led to the formation of liposarcoma-like tumours (Fig. 1d).

Fig. 1: Comparing single-cell trajectories of reprogramming and transformation.
figure 1

a, Schematic of the repro-transformable mouse model. Reprogramming (repro; doxycycline-induced OSKM expression) or transformation (transfo; tamoxifen-induced K-rasG12D expression combined with c-Myc overexpression) gave rise to iPS cells or transformed cells (TCs), respectively. b, Immunofluorescence staining of repro-induced iPS cells for Ssea1 and Nanog. Scale bar, 100 µm. c, Histological analysis of teratomas derived from iPS cells. Scale bar, 1 mm. d, Tumour generated by transformed cells injected into nude mice. Scale bar, 0.2 mm. e, Proliferation curves of MEFs upon induction of repro, transfo and repro plus transfo. The data from one representative experiment out of two are shown. f,g, T-SNE visualization of scRNA-Seq profiles integrating the replicate values of 30,146 preprocessed cells (individual dots), corresponding to two biological replicates run in one sequencing experiment. The cells are coloured by sample (f) or by cluster (g). h,i, Diffusion maps of scRNA-Seq profiles where the cells are coloured by sample (h) or by cluster (i). The trajectories defined by Slingshot are represented by red (repro) and blue lines (transfo). The intersection area is indicated by a red box. j, Composition of samples in the intersection area. k,l, Patterns of the MEF identity signature score using gene lists from Schiebinger et al.13, with the score represented on the diffusion map (k) or on the calculated pseudotime trajectories (l).

Source data

This model provides a unique opportunity to compare reprogramming and transformation in a genetically matched manner. We showed that MEF proliferation increased in response to both processes and this effect was cumulative (Fig. 1e). Next, we evaluated the impact of 3 d of reprogramming or transformation on DNA damage. As expected, K-rasG12D/c-Myc triggered the formation of γH2AX phosphorylation foci in 45.1 ± 10.0% of cells (Extended Data Fig. 1c,d) and similar results were obtained with other oncogenic events, including p53 depletion and H-rasG12V expression (Extended Data Fig. 1e). Conversely, reprogramming did not significantly induce γH2AX foci. Moreover, when both processes were simultaneously induced, OSKM significantly prevented γH2AX foci formation triggered by K-rasG12D/c-Myc (Extended Data Fig. 1c,d). A preventive effect of OSKM was also observed on the changes of cell cycle features but not on apoptosis induced by K-rasG12D/c-Myc (Extended Data Fig. 1f–i). Altogether, we identified similar and divergent responses to reprogramming and transformation, as well as a preventive action of OSKM on cell cycle and DNA damage induced by K-rasG12D/c-Myc.

Single-cell trajectories of reprogramming and transformation

Next, we compared the cellular trajectories of reprogramming and transformation. Single-cell RNA sequencing (scRNA-Seq) was conducted on MEFs either left untreated or induced for 5 or 10 d of reprogramming or transformation, as well as on fully reprogrammed (iPS) and transformed cells (TCs). After preprocessing 30,146 cells, principal component analysis (PCA) and t-distributed stochastic neighbour embedding (t-SNE) defined 12 clusters of cells (Extended Data Fig. 1j,k). To focus on the early dynamics, we defined eight clusters by excluding the iPS and transformed cell samples (Fig. 1f,g). Diffusion maps35 and Slingshot36 were used to establish pseudo-temporal ordering of cells in a high-dimensional gene expression space and to infer the cellular trajectories (Fig. 1h,i). This unveiled that single reprogramming and transforming cells (mainly from clusters 3, 4, 7 and 8) intersect within a reprogramming–transformation area before they bifurcate (Fig. 1i,j), suggesting the existence of shared transcriptomic features. Single-sample gene set enrichment analysis37 was used next to compute activity scores for different pathways. The use of two independent scores12,13 revealed a progressive decrease in MEF identity in cells progressing into reprogramming and transformation trajectories, as well as in iPS and transformed cells (Fig. 1k,l and Extended Data Fig. 1l–n). In contrast, proliferation was modulated mainly independently of the trajectories (Extended Data Fig. 1o). Collectively, we unveiled an early intersection between the trajectories of reprogramming and transformation that suggests the existence of molecular similarities in individual cells.

Bcl11b hinders reprogramming, transformation and transdifferentiation

The scRNA-Seq dataset constitutes a unique tool to identify somatic barriers. By computing marker genes of each cluster, we identified 150 genes expressed predominantly in MEFs (clusters 1 and 2 (C1 and C2, respectively)) (Fig. 2a and Supplementary Table 1). Gene set enrichment analysis (PantherDB) highlighted enrichment for embryo development and transcription regulation (Fig. 2b). Among them, we identified the glycoprotein Thy1, which has already been reported as a MEF marker during reprogramming6. We assessed whether Thy1 levels correlated with reprogramming and immortalization potential. For reprogramming, Thy1low and Thy1high cells were sorted by fluorescence-activated cell sorting (FACS) after 5 d of OSKM induction and replated at similar densities. Thy1low cells formed significantly more alkaline phosphatase-positive (AP+) iPS colonies than Thy1high cells, as reported previously6,15 (Extended Data Fig. 2a,b). For transformation, with a similar sorting (5 d post K-rasG12D/c-Myc induction), Thy1low cells formed fourfold more foci than Thy1high cells (Extended Data Fig. 2c,d). Even if the observed differences are limited, Thy1 can be used to slightly enrich fractions of cells prone to reprogramming and transformation.

Fig. 2: Bcl11b broadly constrains cell fate changes.
figure 2

a, Patterns of the gene signature score composed of 150 genes enriched in C1 and C2 on the diffusion map. The graph integrates the 30,146 preprocessed cells of two biological replicates that were run in one sequencing experiment. b, Statistical over-representation assays conducted with PantherDB on the gene signature. c, Patterns of Thy1 and Bcl11b transcript levels on the diffusion map. d, Western blot for Bcl11b in MEFs, Thy1low and Thy1high cells after 5 d of reprogramming and transformation. e, Top, experimental design. Bottom, pictures of soft agar colonies, representative of four independent experiments. f, Colony quantification (n = 4 independent experiments). g, Top, experimental design. Bottom, pictures of iPS colonies stained for AP. Dox, doxycycline. h, Colony quantification (n = 3 independent experiments). i, Pou5f1+ colony quantification (n = 3 independent experiments). j, Pictures depicting the histological analysis of teratomas. Two independent teratomas were analysed per cell line. Scale bars, 1 mm. k, Top, experimental design. Bottom, pictures depicting mouse T cells and iPS cells obtained following reprogramming. cKO, conditional knockout; Tamox, tamoxifen. Scale bars, 120 µm. l, Colony quantification (n = 2 independent experiments). m, Top, experimental design. Bottom, pictures of MAP2+ neural progenitors. Scale bars, 100 µm. n, MAP2+ cell quantification (n = 2 independent experiments). o, Venn diagram showing the numbers of differentially expressed genes in control versus Bcl11b KD MEFs (orange) and control versus Bcl11b OE MEFs (blue) (log2[FC] < −0.5 or >0.5; adjusted P value < 0.05). p, Distribution of endogenous Bcl11b peaks in relation to genes. UTR, untranslated region. q, Distribution of Bcl11b peaks in relation to the TSS. kb, kilobases. TSS, transcription start site. r, Most enriched DNA-binding motifs associated with Bcl11b derived from a de novo motif analysis (MEME). s, Graph presenting the distribution of Bcl11b peaks on genes deregulated by Bcl11b modulation in MEFs (differentially expressed genes in Bcl11b KD and Bcl11b OE versus control MEFs. t, Western blot depicting Bcl11b and ERK1/2 levels in Control and Bcl11b KD MEFs. In f, h and i, the data represent means ± s.d. Statistical significance was determined by Fisher’s exact two-sided test (b) or two-tailed Student’s t-test (f, h and i).

Source data

Among the identified candidates, we selected the TF Bcl11b, which was previously described as a cellular identity gatekeeper in haematopoiesis, for further investigation (Fig. 2c)38. We showed that Bcl11b expression is high in MEFs, specifically decreased in Thy1low cells during reprogramming and transformation, and silenced in iPS and transformed cells (Fig. 2d and Extended Data Fig. 2e). Interrogation of published datasets broadened Bcl11b downregulation to keratinocyte reprogramming (Extended Data Fig. 2f).

First, we investigated Bcl11b function during transformation. Bcl11b downregulation by RNA interference (Bcl11b knockdown (KD)) (Extended Data Fig. 2g,h), before the induction of transformation, significantly increased the efficiency of soft agar colony formation. In contrast, Bcl11b overexpression (Bcl11b OE) severely hindered the process, indicating that a tight Bcl11b level safeguards MEFs from transformation (Fig. 2e,f). Similar results were obtained in foci assays (Extended Data Fig. 2i,j). Next, we assessed Bcl11b function during reprogramming. Bcl11b KD significantly improved the efficiency of generation of AP+ but also Pou5f1-GFP+ iPS colonies (Fig. 2g–i). Similar results were obtained using Bcl11b conditional KO MEFs (Extended Data Fig. 2k,l). However, Bcl11b OE did not negatively impact the reprogramming efficiency (Fig. 2g,h). Of note, Bcl11b KD iPS cell lines were capable of forming three germ layers in teratoma (Fig. 2j), indicating that Bcl11b loss is compatible with the acquisition of multilineage differentiation potential. Because Bcl11b is expressed in T lymphocytes39, T cells isolated from mice conditional KO for Bcl11 were induced to reprogramme (Fig. 2k). Bcl11b depletion triggered the formation of twofold more AP+ iPS colonies (Fig. 2k,l). In addition, Bcl11b KD, before the induction of MEF transdifferentiation into neurons, significantly improved the efficiency of generation of MAP2+ cells (Fig. 2m,n)40.

Next, we combined RNA-Seq and chromatin immunoprecipitation sequencing (ChIP-Seq) assays to identify the gene regulatory network (GRN) controlled by Bcl11b. Transcriptomic analyses of Bcl11b KD and Bcl11B OE MEFs led, respectively, to the identification of 774 and 321 deregulated genes compared with control MEFs (adjusted P value < 0.05; log2[fold change (FC)] > 0.5 or <−0.5) (Fig. 2o). Bcl11b ChIP-Seq identified 7,430 specific peaks located mainly in the vicinity of genes (<10 kilobases from the transcription start site (TSS)) with enrichment for an Elk motif (Fig. 2p–r). Among the 979 genes deregulated by Bcl11b, 122 (12.4%) presented a Bcl11b-specific peak (Fig. 2s). Moreover, while MEF identity was not significantly impacted by Bcl11b deregulation (Extended Data Fig. 2m), we noticed that several Bcl11b targets were associated with the Mapk pathway, such as Calponin-1 and Bmf41,42. In line with this, we unveiled that Bcl11b constrains phospho-Erk1/2 levels in MEFs, potentially explaining its barrier role during reprogramming43 (Fig. 2t). Altogether, we demonstrated that Bcl11b regulates reprogramming, transformation and transdifferentiation, as well as a specific GRN and phospho-Erk1/2 levels.

Bcl11b faithfully indicates reprogramming and transforming potential

Next, we investigated whether Bcl11b could be used as a marker to track cells changing fate using Bcl11b-tdTomato reporter MEFs (Extended Data Fig. 2n)38. FACS analysis confirmed that the majority of MEFs expressed Bcl11b-tdTomato. However, after 5 d of reprogramming or transformation, a subset of Bcl11blow cells emerged (Extended Data Fig. 2o). Bcl11blow cells, sorted at day 5 of reprogramming, formed sevenfold more AP+ iPS colonies than Bcl11bhigh cells (Extended Data Fig. 2p,q). Bcl11blow cells, sorted at day 5 of transformation, formed immortalized foci with a tenfold higher efficiency than Bcl11bhigh cells (Extended Data Fig. 2r,s). Collectively, these results identify Bcl11b as a MEF marker whose downregulation faithfully reflects the ability of cells to engage into pluripotency or immortalization paths.

Capture of early cellular intermediates using Bcl11b and Thy1

Our scRNA-Seq analysis did not allow the interrogation of the functional features of individual cells4,18. Therefore, we attempted to design a strategy to isolate early cellular intermediates using FACS. Most reprogramming strategies combined the downregulation of a MEF marker with the activation of a pluripotent factor6,15,44,45,46. However, as we aimed to capture cells emerging during both reprogramming and transformation, the use of pluripotent markers was not possible. We noticed, in contrast, that the downregulation of Bcl11b and Thy1 was not occurring in the same cells during reprogramming and transformation (Fig. 2c) and in published OSK-mediated reprogramming dataset (Extended Data Fig. 3a)14. This finding was confirmed by the visualization of four subpopulations of cells (Bcl11bhigh/Thy1high (BHTH), Bcl11bhigh/Thy1low (BHTL), Bcl11blow/Thy1high (BLTH) and Bcl11blow/Thy1low (BLTL)) on the diffusion map or trajectories (Fig. 3a and Extended Data Fig. 3b). This result prompted us to investigate whether the combined downregulation of Bcl11b and Thy1 can be used to capture cellular intermediates by FACS. We profiled Bcl11b and Thy1 changes during reprogramming and transformation using Bcl11b-tdTomato MEFs (Fig. 3b). To begin with a homogeneous population, BHTH MEFs were FACS sorted to purity. In the absence of reprogramming or transformation, MEFs stably maintained a BHTH phenotype (Extended Data Fig. 3c,d). By day 17 of both processes, most cells displayed Bcl11b and Thy1 downregulation, as expected (Fig. 3b). However, rare BLTL cells emerged as early as 3 d after the induction of reprogramming (R-BLTL) and transformation (T-BLTL) (Fig. 3b). We demonstrated that R-BLTL and T-BLTL cells were respectively highly prone to forming iPS (Fig. 3c) or immortalized (Fig. 3d) colonies compared with R-BHTH and T-BHTH cells that remained heavily refractory. Next, we assessed the emergence of BLTL cells with alternative molecular cocktails that did not rely on c-Myc. BLTL cells emerged during reprogramming induced by Sall4-Nanog-Esrrb-Lin28 (ref. 5) or by Oct4-Sox2-Klf4-Wnt inhibitor IWP2 (ref. 47). BLTL cells also emerged during transformation induced by cyclin E, H-RasG12V and p53 depletion (Fig. 3e and Extended Data Fig. 3e). In addition, BLTL cells were found to emerge during reprogramming and transformation of mouse adult ear fibroblasts (Fig. 3f and Extended Data Fig. 3f) and to be more efficient at forming pluripotent colonies than BHTH cells (Fig. 3g,h).

Fig. 3: Sequence of intermediates during reprogramming and transformation.
figure 3

a, Representation of Bcl11b and Thy1 expression in single cells. The thresholds were as follows: Bcl11b < 1 and Thy1 < 2 for BLTL; Bcl11b > 2 and Thy1 < 2 for BHTL; Bcl11b < 1 and Thy1 > 4 for BLTH; and Bcl11b > 2 and Thy1 > 4 for BHTH. b, Expression of Bcl11b-tdTomato and Thy1 during reprogramming and transformation. KI: knock-in. c, Left, pictures of iPS colonies from a representative experiment. Right, quantification of AP+ colonies (n = 6 independent experiments). d, Left, pictures of foci assays from a representative experiment. Right, foci quantification (n = 5 independent experiments). e, Emergence of Bcl11b-tdTomatolow/Thy1low cells. The graph represents the distribution of BHTH, BLTH, BHTL and BLTL cells. IWP2, Wnt inhibitor; OSKM, Oct4, Sox2, Klf4, c-Myc; SNEL, Sall4, Nanog, Esrrb, Lin28. BHTH cells were FACS sorted before reprogramming/transformation. f, Emergence of BLTL cells from mouse adult ear fibroblasts. The settings were similar to those for e. g, Left, pictures of iPS colonies following reprogramming induced by OSK + IWP2, taken from a representative experiment. Right, quantification of AP+ colonies (n = 2 independent experiments). h, Left, pictures of iPS colonies from mouse adult ear fibroblasts, taken from of a representative experiment. Right, quantification of AP+ colonies (n = 3 independent experiments). i, Schematic of the experimental design. j, Left, brightfield images of tumours in a chick (as indicated by the dashed lines). Right, quantification of the tumours (n = 23 for T-BLTL cells; n = 27 for T-BHTH cells). k, Left, tumour growth curves. Right, survival curves of the mice (n = 6 animals per group). l, Left, FACS profiles. Cells harboring various levels of Bcl11b and Thy1 were FACS sorted at day 5 of reprogramming, plated back in culture and analyzed 2 days later. Right, quantification of AP+ colonies (n = 5 independent experiments). m, Left, FACS profiles. Cells harboring various levels of Bcl11b and Thy1 were FACS sorted at day 5 of transformation, plated back in culture and analyzed 2 days later. Right, quantification of foci (n = 6 independent experiments). n, Schematic of the sequence of intermediates. The corresponding efficiencies are indicated using arbitrary units. In h, j and the left panel of k, the data represent means ± s.d. Statistical significance was determined by two-tailed Student’s t-test (c, d, h and j), two-way ANOVA combined with Šidák’s multiple comparisons test (left panel in k), Gehan–Breslow–Wilcoxon test (right panel in k) or one-way ANOVA followed by Tukey’s post-hoc test (l and m).

Source data

Next, we assessed whether T-BLTL cells acquired increased aggressiveness compared with T-BHTH cells48,49 by comparing the functional features of transformed cell lines generated from these subsets of cells. Practically, T-BHTH and T-BLTL cells were FACS sorted 5 d after the induction of transformation, replated and serially passaged to establish independent polyclonal cell lines. While the cell lines presented similar growth curves when grown in two dimensions in vitro (Extended Data Fig. 3g), T-BLTL-derived lines formed sevenfold more soft agar colonies than T-BHTH-derived ones (Extended Data Fig. 3h,i). Next, we performed chick chorioallantoic membrane and mouse xenograft assays as in vivo models of tumorigenesis (Fig. 3i). The size of the tumours generated in chick embryos by T-BLTL-derived cells was significantly higher than by T-BHTH (Fig. 3j). An accelerated growth of T-BLTL-derived tumours and reduced survival were also observed in immunocompromised mice (Fig. 3k). These data indicate that Thy1 and Bcl11b loss broadly delineate early intermediates highly amenable to forming pluripotent or tumorigenic derivatives.

Sequence of intermediates during reprogramming and transformation

Next, we sought to characterize the sequential emergence of intermediates during reprogramming and transformation. To ensure that changes in Bcl11b/Thy1 (Fig. 3b) reflected the transition of individual cells from one stage to the next, and not merely the loss of one major population and expansion of another, each fraction was sorted after 5 d of reprogramming then replated for 48 h before FACS analysis. The progression of cellular intermediates revealed the routes induced by OSKM. First, we observed that R-BLTL cells were stable as they did not transit efficiently into other states. R-BHTH and R-BHTL cells generated R-BLTL cells at a very low rate while R-BLTH cells transited into R-BLTL cells efficiently (35%) (Fig. 3l), suggesting that Bcl11b downregulation is a rate-limiting step of reprogramming, in line with our previous results (Fig. 2). Importantly, these cellular progressions were correlated with the capacity of the intermediates to form AP+ colonies (Fig. 3l and Extended Data Fig. 3j). For transformation, the T-BLTL state was also relatively stable and prone to forming immortalized foci (Fig. 3m). T-BHTH and T-BLTH cells were poorly efficient at generating T-BLTL while T-BHTL cells efficiently reached this state (Fig. 3m and Extended Data Fig. 3k). On this basis, we generated the functional roadmaps presented in Fig. 3n.

Chromatin reconfigurations in early intermediates

We next dissected the reconfigurations of chromatin accessibility by conducting assay for transposase-accessible chromatin sequencing (ATAC-Seq) on cellular intermediates captured at day 5 of reprogramming or transformation. PCA analysis showed that R-BLTL and T-BLTL cells segregated together on the x axis (principal component 1) (Fig. 4a) and towards the direction of the iPS/transformed cells (Extended Data Fig. 4a), suggesting the existence of common changes of chromatin accessibility, as exemplified with Thy1 (Fig. 4b). To test this, we classified the peaks in clusters defining regions that were accessible in MEFs but not in both R-BLTL and T-BLTL cells (C1); became accessible in both R-BLTL and T-BLTL cells over MEFs (C2); specifically lost (C3) or gained (C4) accessibility in R-BLTL cells over MEFs and T-BLTL cells; or specifically lost (C5) or gained (C6) accessibility in T-BLTL cells over MEFs and R-BLTL cells (Fig. 4c). We also generated an unsupervised heatmap to visualize peak intensities for differential loci (Extended Data Fig. 4b). To uncover TFs possibly driving these changes, we performed a DNA motif enrichment on clusters (Fig. 4d and Extended Data Fig. 4c). A subset of regulatory elements changing accessibility in R-BLTL and T-BLTL cells were enriched in the FosL1 motif (C2, C3 and C5), suggesting relocation of this TF. We assessed whether Fosl1 functionally regulates both processes. FosL1 depletion (Extended Data Fig. 4d,e) led to a fourfold reduction in the number of immortalized foci (Extended Data Fig. 4f,g) and an average sixfold increase in reprogramming efficiency (Extended Data Fig. 4h,i). Hence, ATAC-Seq shed light on changes in chromatin accessibility that occur specifically or commonly during reprogramming and transformation, as well as identified FosL1 as a common but antagonistic regulator.

Fig. 4: Chromatin and transcriptome reconfigurations in cellular intermediates.
figure 4

a, PCA conducted on ATAC-Seq data. Untreated MEFs (black), BLTL and BHTH cells FACS sorted after 5 d of reprogramming (red) or transformation (blue) are represented. b, Example of ATAC chromatin sites at the Thy1 locus. c, Definition of the clusters described in the main text (n = 2 independent experiments). Central lines represents medians, box edges represent upper and lower quartiles and whiskers show the highest and lowest values, excluding outliers (at most 1.5× the interquartile range above or below the upper and lower quartile). d, Enrichment in TF motifs. Each point represents significant enrichment in the motif (x axis) for the cluster (y axis). The point size represents the proportion of sequences in the cluster featuring the motif and the colour gradient represents the enrichment significance. e, PCA conducted on RNA-Seq data. f, Venn diagram showing the numbers of differentially expressed genes in MEFs versus R-BLTL cells (red), MEFs versus T-BLTL cells (blue) and control MEFs versus Bcl11b KD MEFs (green) (log2[FC] < −0.5 or >0.5; base mean < 40; adjusted P value < 5 × 10−2). g, Visualization of R-BLTL and T-BLTL score activities on single-cell trajectories. h, Venn diagram showing the numbers of differentially expressed genes in T-BLTL versus T-BHTH cells (blue) and R-BLTL versus R-BHTH cells (red) (log2[FC] < −1 or >1; adjusted P value < 5 × 10−2). i, Statistical over-representation assays conducted with PantherDB. Statistical significance was determined by Fisher’s exact two-sided test. j, Top, patterns of the downregulated genes signature plotted on the diffusion map. Bottom, patterns of the upregulated genes signature plotted on the diffusion map. k, Heatmap clustering the 301 commonly deregulated genes. The MEF sample was excluded from the representation. l, Western blot depicting Atoh8, Id4, Twist2 and Gapdh levels in cellular intermediates. m, Western blot depicting Atoh8, Id4, Twist2 and Actin levels in MEFs, iPS and transformed cells.

Source data

Transcriptomic changes in early intermediates

PCA conducted on RNA-Seq data of day 5 cellular intermediates revealed that R-BLTL and T-BLTL cells segregated together on the x axis, suggesting common changes (Fig. 4e). In addition, a significant number of the genes deregulated by Bcl11b modulation in MEFs (Fig. 2o) were also impacted in both R-BLTL and T-BLTL cells (Fig. 4f). Next, we exploited published datasets to characterize R-BLTL and T-BLTL cells. MEF identity scores12,13 were not downregulated in R-BLTL and T-BLTL cells (Extended Data Fig. 4j), indicating that these cells constituted early intermediates that gained plasticity but did not yet downregulate identity, in contrast with previously isolated intermediates8. In line with this, R-BLTL and T-BLTL cells did not induce CD73 and CD49d, which delineate late intermediates (Extended Data Fig. 4k)45. R-BLTL and T-BLTL cells harboured some moderated reductions in stromal markers (Csf1, Prrx1 and Id3) but no concomitant inductions of mesenchymal-to-epithelial transition (MET) markers (Fut9 and Zic3), reinforcing the notion that they are not yet fully engaged on a MET trajectory (Extended Data Fig. 4l)13. These findings demonstrate that a gain of cellular plasticity is not correlated with, but rather precedes, a loss of cellular identity or engagement into MET. Next, we attempted to position R-BLTL and T-BLTL cells on the trajectories of reprogramming and transformation by defining activity scores. Cells with high R-BLTL and T-BLTL score activity emerged early along the respective trajectories and increased in number during progression towards pluripotency and malignancy (Fig. 4g).

To identify molecular regulators of cellular plasticity, we next interrogated the specific transcriptomic features of R-BLTL and T-BLTL cells. Some 410 genes were differentially expressed between R-BLTL and R-BHTH cells and 1,389 genes were differentially expressed between T-BLTL and T-BHTH cells, with a shared signature of 301 genes (adjusted P value < 0.05; log2[FC] > 1 or <−1) (Fig. 4h) enriched in stem cell differentiation but also immunity (Fig. 4i)8. Interestingly, the modulation of these genes also occurred in the single-cell dataset (Fig. 4j). Next, we conducted heatmap analysis to identify clusters of genes permanently or transiently deregulated in R-BLTL and T-BLTL cells (Fig. 4k). Cluster 1 corresponded to genes permanently repressed during reprogramming and transformation, including Thy1 and Bcl11b. Clusters 2 and 3 encompassed genes transiently repressed in R-BLTL and T-BLTL cells but reactivated respectively in transformed or iPS cells such as Icam1 (ref. 44). Of importance, we noticed that a network of basic helix–loop–helix (bHLH) TFs, encompassing the Atoh8, Id4 and Twist2 transcripts, was downregulated in R-BLTL and T-BLTL cells (Fig. 4k). We confirmed the downregulation of Atoh8, Id4 and Twist2 proteins in cellular intermediates (Fig. 4l) but Atoh8 silencing was solely maintained in iPS and transformed cells (Fig. 4m). Overall, these results identify shared transcriptional modifications occurring during reprogramming and transformation.

Atoh8 regulates the acquisition of malignant features

We addressed whether Atoh8 functionally controls cellular plasticity during transformation. Atoh8 KD, before the induction of transformation, significantly increased the ability of MEFs to grow independent of anchorage (Extended Data Fig. 5a–c) and form immortalized foci (Extended Data Fig. 5d–f). Comparable results were obtained by CRISPR–Cas9 Atoh8 KO (Extended Data Fig. 5g–j), demonstrating that Atoh8 constrains transformation. To assess whether Atoh8 controls the pace at which MEFs acquire malignant features, control and Atoh8 KD MEFs were subjected to soft agar assays as early as 6 d after transformation induction (Fig. 5a). Atoh8 KD cells succeeded in forming colonies, while control cells largely failed (Fig. 5a,b), demonstrating that Atoh8 constrains the temporal acquisition of anchorage-independent growth properties. In line with this, Atoh8 depletion was found to accelerate the emergence of T-BLTL cells (Fig. 5c,d).

Fig. 5: Atoh8 regulates the acquisition of transformed features.
figure 5

a, Top, experimental design. Bottom, picture of soft agar colonies, representative of three independent experiments. b, Colony quantification (n = 3 independent experiments). c, FACS analysis showing the emergence of Bcl11b-tdTomatolow/Thy1low (BLTL) cells during transformation in Control and Atoh8 KD backgrounds. d, Graph depicting the emergence of BLTL cells (n = 3 independent experiments). e, Histograms depicting Atoh8 transcript levels in patients. The data are presented as the log2[ratio of Atoh8 fragments per kilobase of transcript per million mapped reads (FPKMs)] between malignant and healthy tissues in paired samples. f, Schematic of the experimental design. Cells were split for at least ten passages (30 d) before subsequent analyses. g, Volcano plot showing differentially expressed genes in Control versus Atoh8 KD transformed line. Each dot corresponds to a transcript. Blue dots represent log2[FC] > 1 or <−1 and adjusted P value < 0.00001. Benjamini–Hochberg-adjusted P values of the comparisons were computed using the limma-voom workflow modified two-sided t-test. h, Western blot for E-cadherin, Snail, Vimentin, Twist1 and Gapdh in Control and Atoh8 KD transformed cell lines. i, PCA of the scRNA-Seq data. j, Single-cell pseudotime trajectories. k, Heatmap of the top temporally expressed genes based on the Atoh8 KD cell trajectory. l, Xenograft tumour volume over time (n = 6 independent mice per group). m, Survival plot of the mice from l (n = 6 independent mice). In b, d and l, the data are presented as means ± s.d. Statistical significance was determined by two-tailed Student’s t-test (b, d and l) or Kaplan–Meyer test (m).

Source data

In line with a putative function in tumorigenesis, data from The Cancer Genome Atlas (TCGA) revealed significant downregulation of Atoh8 expression in malignant tissues compared with paired peritumoral tissues in various cancers (Fig. 5e)50. While the role of Atoh8 in established cancer cells has been addressed, its function in cellular plasticity during transformation remains unknown50. We addressed this question by establishing polyclonal cell lines transformed in the presence or absence of Atoh8 (Fig. 5f). Bulk RNA-Seq led to the identification of 803 differentially expressed genes with a significant induction of epithelial markers (Cdh1 and Epcam) (log2[FC] > 1 and <1; adjusted P value < 10−5) (Fig. 5g and Extended Data Fig. 5k), indicating that Atoh8 impacts the establishment of the transformed transcriptome. We confirmed the significant increase in the E-cadherin protein but also the steady expression of Vimentin or Snail in Atoh8 KD-derived cells (Fig. 5h and Extended Data Fig. 5l). Next, we conducted scRNA-Seq to assess whether these differences corresponded to the emergence of alternative cellular states. Among five clusters of cells, PCA and t-SNE analyses revealed that C3 was found in both Control and Atoh8 KD populations (Fig. 5i,j). In contrast, C5 was composed nearly exclusively of Atoh8 KD-derived cells (Extended Data Fig. 5m). Pseudotime calculations inferred two main trajectories: C1–C2–C3–C4 and C1–C2–C3–C5 (Fig. 5j). These data demonstrate that Atoh8 depletion diverts cells during transformation towards a new cluster enriched in genes linked to cancer cell invasion such as Ube2c51 (Fig. 5k). Based on this, we assessed whether Atoh8 regulates tumorigenicity. Atoh8 KD-derived lines were found to be significantly more prone to growing under non-adherent conditions (Extended Data Fig. 5n–s). Injection into immunocompromised mice showed an increased growth of Atoh8 KD-derived tumours and a significant reduction in the overall survival of mice (Fig. 5l,m). Collectively, these data indicate that Atoh8 constrains cellular plasticity and the emergence of highly aggressive tumour cells.

Atoh8 hinders reactivation of the pluripotent network

Interrogation of reprogramming datasets6,9,12,13,52,53 confirmed Atoh8 expression in MEFs and downregulation in reprogramming intermediates (Extended Data Fig. 6a). In contrast, mouse adult ear fibroblasts barely expressed Atoh8, as reported (Extended Data Fig. 6b,c)54. Atoh8 KD (Fig. 6a and Extended Data Fig. 6d,e) led to a fourfold increase in the number of AP+ (Extended Data Fig. 6f,g) and Pou5f1-GFP+ iPS colonies (Fig. 6b,c). Similar improvements were observed with Atoh8 CRISPR–Cas9 KO (Extended Data Fig. 6h–j), demonstrating that Atoh8 constrains mouse reprogramming. In contrast, Atoh8 OE was not sufficient to constrain reprogramming or transformation (Extended Data Fig. 6k–n). Atoh8 KD established iPS lines expressed similar Oct4, Sox2 and Nanog levels to controls (Extended Data Fig. 6o,p) and differentiated into three germ layers in teratoma (Fig. 6d). During reprogramming, cells activate the endogenous pluripotency network and become independent of the OSKM transgenes15. We showed that the emergence of Pou5f1-GFP+ cells is significantly accelerated by Atoh8 depletion (Fig. 6e,f). To evaluate transgene independency, OSKM doxycycline-inducible MEFs were exposed to doxycycline for 6 d and iPS colony emergence was monitored (Fig. 6g). Atoh8 KD cells succeeded in forming AP+ iPS colonies while control cells failed (Fig. 6h,i). Of note, iPS cell lines derived from Pou5f1-GFP+ Atoh8 KD cells FACS sorted at day 6 of reprogramming expressed similar levels of pluripotency markers to bona fide iPS lines (Fig. 6j and Extended Data Fig. 6q,r). These findings demonstrate that Atoh8 hinders reactivation of the endogenous Pou5f1 gene and acquisition of transgene independency.

Fig. 6: Atoh8 constrains cellular plasticity during reprogramming and transdifferentiation.
figure 6

a, Experimental set-up. b, Immunofluorescence of Pou5f1-GFP+ colonies at day 15 of reprogramming. Scale bars, 100 µm. c, Pou5f1+ colony quantification (n = 3 independent experiments). d, Histological analysis of teratomas. Two independent teratomas were analysed per cell line. Scale bars, 1 mm. e, Graph depicting the percentage of Pou5f1-GFP+ cells (n = 3 independent experiments). f, FACS analysis showing the emergence of Pou5f1-GFP+ cells. g, Experimental design. h, Image representing AP+ colonies at day 15 of reprogramming, representative of three independent experiments. i, Colony quantification (n = 3 independent experiments). j, Western blot for Sox2, Ssea1, Nanog, Oct4 and Actin in Control and Atoh8 KD-derived iPS cell lines. k, Schematic of human induced pluripotent stem cell reprogramming. l, Picture representing SSEA4+ colonies at day 26 of reprogramming, representative of two independent experiments. Scale bars, 100 µm. m, Colony quantification (n = 2 independent experiments). n, Immunofluorescence for Oct4 and Nanog. hiPS, human induced pluripotent stem. Scale bars, 100 µm. o, Schematic depicting MEF-to-neuron transdifferentiation. p, MAP2 immunostaining. Scale bars, 100 µm. q, MAP2+ cell quantification per field (n = 2 independent experiments). In c, e and i, the data represent means ± s.d. and statistical significance was determined by two-tailed Student’s t-test.

Source data

Atoh8 constrains human reprogramming and mouse transdifferentiation

We assessed Atoh8 function in human dermal fibroblast (HDF) reprogramming, during which it is also rapidly downregulated (Fig. 6k and Extended Data Fig. 7a)11,55. CRISPR–Cas9 ATOH8 KO significantly increased the number of AP+ (Extended Data Fig. 7b,c) and SSEA4+ iPS colonies (Fig. 6l,m). Control and ATOH8 KO iPS cell lines expressed comparable levels of pluripotency markers (Fig. 6n and Extended Data Fig. 7d), as well as differentiation genes in embryoid bodies (Extended Data Fig. 7e). Next, we evaluated whether Atoh8 also hinders MEF-to-neuron transdifferentiation (Fig. 6o)40. Atoh8 depletion led to a threefold increase in the number of MAP2+ induced neuronal cells (Fig. 6p,q), demonstrating that Atoh8 broadly constrains TF-mediated cell conversions.

Atoh8 fine tunes WNT signalling via Sfrp1

Assuming that TF binding determines its function, we assessed Atoh8 genomic distribution in MEFs (Extended Data Fig. 7f). ChIP-Seq led to the identification of 1,826 peaks, principally distributed in upstream and downstream gene regions and introns (Fig. 7a,b). The specificity of reads was confirmed with a mock sample (Extended Data Fig. 7g)56. Motif analysis showed that Atoh8 bound preferentially to the CAGCTG motif (E-box), as expected for a bHLH TF, even if the AP-1 motif was also enriched (Fig. 7c). In contrast with the bHLH TFs Ascl1 and MyoD, which mainly bind to inaccessible regions as pioneer factors, Atoh8 binds preferentially in accessible (ATAC-Seq-positive) enhancer regions enriched in H3K27Ac/H3K4Me1, distant from the TSS9,57 (Fig. 7d and Extended Data Fig. 7h–j). We noticed a significant overlap in Atoh8 and Bcl11b binding, especially on the Atoh8 locus itself (Fig. 7e and Extended Data Fig. 7k).

Fig. 7: Atoh8 restrains cellular plasticity by binding specific enhancers to limit WNT signalling activity.
figure 7

a, Heatmap displaying Atoh8 read counts ± 1 kilobase around merged peak summits. b, Genomic distribution of Atoh8-specific peaks. c, The most enriched DNA-binding motifs associated with Atoh8, derived from a de novo motif analysis (MEME). d, Mean read count enrichment density associated with H3K27Ac and H3K4Me1. e, Venn diagram showing the numbers of Bcl11b versus Atoh8 peaks. f, Fraction of Atoh8 sites retrieved within open chromatin regions (ATAC-Seq) during reprogramming and transformation. g, Examples of open chromatin sites. Chr, chromosome. h, Atoh8-centred chromatin state analysis. ES, Embryonic stem cells. i, Volcano plot showing differentially expressed genes in Atoh8 KD versus control MEFs. Samples were collected 5 d after RNA interference induction. Each dot corresponds to a transcript. Benjamini–Hochberg-adjusted P values of the comparisons were computed using the limma-voom workflow modified two-sided t-test. j, Top, schematic depicting the c-Myc binding site (BS) on the Atoh8 promoter. Bottom, real-time qPCR showing the levels of DNA immunoprecipitated. The data are represented as a percentage of Atoh8 DNA levels in ChIP input and are from two independent experiments. k, Western blot for c-Myc, Atoh8 and Gapdh in MEFs in response to c-Myc exogenous expression. l, Graph depicting the fold-enrichment as determined by statistical over-representation analysis. RTK, Receptor tyrosine kinase. m, Top, pictures representing AP+ colonies. Bottom, pictures representing cresyl violet foci. n, Quantification (n = 3 independent experiments). o, Quantification of AP+ colonies and foci (n = 3 independent experiments). p, Top, pictures representing AP+ colonies. Bottom, pictures representing cresyl violet foci. q, Quantification (n = 6 independent experiments for the left panel and n = 3 for the right panel). r, Western blot of Atoh8 expression in MEFs treated with recombinant Wnt3a for 48 h. s, Schematic recapitulating the main findings of the study. In n, o and q, the data represent means ± s.d. and statistical significance was determined by two-tailed Student’s t-test.

Source data

Next, we conducted ATAC-Seq on MEFs after 5 d of reprogramming or transformation to assess the dynamic behaviour of Atoh8-bound regions. They remained overall largely accessible except in iPS cells (Fig. 7f,g). To focus on reprogramming, we conducted a chromatin combinatorial state analysis using data for chromatin accessibility, histone marks and TFs in MEFs, after 48 h of OSKM expression and in pluripotent cells9,58. Atoh8-bound regions progressively lost H3K27Ac/H3K4Me1 while gaining H3K9Ac in pluripotent cells (Fig. 7h). A significant proportion of Atoh8 peaks were co-occupied by c-Myc in MEFs, but this fraction dropped significantly during reprogramming, indicating a gradual relocation of c-Myc, in parallel with a transient binding of Oct4 and, to a lesser extent, Sox2. Altogether, these data indicate highly dynamic reconfigurations of the Atoh8-bound regions during reprogramming and transformation.

Next, we defined the GRN controlled by Atoh8. RNA-Seq conducted on MEFs (Control) and after 5 d of Atoh8 KD identified 503 deregulated genes (log2[FC] > 0.8 and <−0.8; adjusted P value < 10−5) (Fig. 7i). At this time point, MEF identity scores were not impacted, suggesting that the Atoh8 primary function is not to safeguard identity (Extended Data Fig. 7l). Interestingly, Atoh8 depletion induced c-Myc (Fig. 7i and Extended Data Fig. 7m). Conversely, c-Myc bound to the Atoh8 promoter and repressed its expression (Fig. 7j,k), revealing a negative feedback loop. In addition, over-representation assays linked Atoh8 with WNT signalling and Sfrp1 (Fig. 7i,l), in line with β-catenin upregulation in Atoh8 KD MEFs (Extended Data Fig. 7n). These results prompted us to evaluate whether Atoh8 constrains cellular plasticity by tuning WNT signalling via Sfrp1. We first found that Sfrp1 is downregulated in R-BLTL and T-BLTL cells (Extended Data Fig. 7o). Next, we assessed whether Sfrp1 depletion mimics the effects of Atoh8 KD. Sfrp1 KD (Extended Data Fig. 7p,q) led to a significant increase in the numbers of AP+ colonies and immortalized foci (Fig. 7m,n). Moreover, simultaneous suppression of Atoh8/Sfrp1 improved reprogramming/transformation efficiencies in a similar range to Atoh8, indicating that Atoh8 exerts its function via a regulatory axis involving Sfrp1 (Fig. 7o). In line with this, we demonstrated that the effect of Atoh8 on reprogramming/transformation is significantly reduced by recombinant Sfrp1 (Fig. 7p,q). WNT signalling activation also induces Atoh8 expression, highlighting a negative feedback loop (Fig. 7r). Finally, we evaluated whether Bcl11b, Atoh8, FosL1 and Sfrp1 regulate the expression of drivers of reprogramming/transformation and of each other. We showed by RNA interference that they do not regulate the expression of Oct4 and K-rasG12D (Extended Data Fig. 7r). However, putative interplays between FosL1 and Sfrp1 were identified (Extended Data Fig. 7s). Thus, we showed that Atoh8 is a target of WNT and c-Myc that constrains cellular plasticity by tuning WNT activity via Sfrp1.

Discussion

Coordinated changes of cellular plasticity and identity emerged as critical for reprogramming and transformation1,20,23,59. However, the cellular and molecular roadmaps that orchestrate these intermingled modifications, as well as their degree of analogy and coupling, have never been comparatively dissected despite constituting crucial topics for regenerative medicine60. Because genetic models enabling the conduction of such analyses were lacking, in this study we generated repro-transformable mice to rigorously compare the responses to reprogramming and transformation in a genetically matched manner. In complement to reports defining trajectories of reprogramming6,11,13,14,45, we provide a high-resolution analysis of the early cellular intermediates that emerge simultaneously during both processes. First, by inferring the single-cell trajectories of reprogramming and transformation, we identified the TF Bcl11b as a broad regulator of cell fate, enlarging initial findings in haematopoietic cells38, but also as a faithful indicator of the propensity of cells to change fate. This reconstruction also unveiled how cells intersect in a reprogramming–transformation area, in which the transcriptomes of reprogramming and transforming cells share analogies. The combined use of Bcl11b and Thy1 confirmed the existence of cellular intermediates sharing transcriptomic but also epigenomic traits, therefore providing a concrete demonstration of the previously proposed correlative analogies between cancer and reprogramming21,22,59. Of interest, we underlined cell-surface markers (CD14, CD53, CD72 and CD84) that are transiently upregulated in intermediates and might help to refine their isolation. We also unveiled that a gain of cellular plasticity and a loss of cellular identity are uncoupled in these intermediates. The acquisition of plasticity in R-BLTL and T-BLTL cells indeed precedes the significant loss of MEF identity. While both phenomena are coupled in the literature6,11,13,14,45, our results rather suggest the existence of partitioned regulatory networks controlling identity or plasticity.

Next, we exploited this finding to uncover molecular determinants of cellular plasticity. We identified the TF Atoh8—initially described in neurodevelopment and later on as a cellular context-dependent regulator of reprogramming50,54,61—as a MEF-specific barrier of cellular plasticity. Genome-wide binding analysis revealed a unique location to highly dynamic enhancer regions compared with other bHLH TFs (that is, c-Myc, Ascl1 and MyoD)9,57. Moreover, Atoh8 was found to control a specific GRN that tunes WNT signalling—a pathway well described to promote plasticity in regeneration and cancer62,63. Therefore, we revealed the existence of a c-Myc/Atoh8/Sfrp1 regulatory axis specifically constraining plasticity without impacting identity, reinforcing the concept that both processes are uncoupled and regulated by specific networks of genes. The control of cellular plasticity and identity has emerged as one of the most crucial research topics for regenerative medicine and cancer biology60. By providing a single-cell comparative deconvolution of reprogramming and transformation, and by identifying molecules uncoupling the gain of plasticity and the loss of identity in various biological processes (Fig. 7s), our work provides a conceptual framework that opens fascinating perspectives for regenerative medicine and cancer biology.

Methods

Mice and MEFs

The animals were maintained in a specific pathogen-free animal facility P-PAC (Plateforme du petit animal du CRCL) at the Cancer Research Center of Lyon and the Center Léon Bérard, Lyon, France. All of the experiments were performed in accordance with the animal care guidelines of the European Union and were validated by the local Animal Ethics Evaluation Committee (C2EA-15 agreed by the French Ministry of High School and Research). R26rtTA;Col1a14F2A (ref. 64), LSL-K-rasG12D (ref. 34), R26-CREERT2, Oct4-EGFP and Bcl11b-tdTomato and Bcl11bflox/flox38 mice were housed under standard conditions and bred in accordance with French national guidelines. Genotyping was carried out on genomic DNA derived from adult and embryonic tails using the DirectPCR Lysis Reagent (102-T; Viagen Biotech) and EconoTaq Plus Green 2X Master Mix (Lucigen). Supplementary Table 2 lists the primers that were used.

MEFs were isolated from E13.5 embryos after removal of the head and internal organs. The remaining tissues were physically dissociated and incubated in trypsin at 37 °C for 10 min, after which the cells were resuspended in MEF medium. When indicated, MEFs were treated with 1 µg ml−1 recombinant Sfrp1 (9019-SF-025; R&D Systems) or 100 ng ml−1 mouse Wnt3a (315-20 Peprotech).

Teratoma

Teratoma formation assays were performed by injecting 1 × 106 iPS cells into the testes of 7-week-old severe combined immunodeficient (SCID) mice (CB17/SCID; Charles River Laboratories). After 3–4 weeks, the mice were euthanized and lesions were surgically removed and fixed in 4% paraformaldehyde for sectioning and haematoxylin and eosin staining.

Plasmids and constructs

pMXS-Oct4, pMXS-Sox2, pMXS-Klf4, pMXS-Myc, pLKO.1 and pWPXLd plasmids were purchased from Addgene. Short hairpin RNAs (shRNAs) against Trp53, FosL1, Atoh8 and Sfrp1 were designed using the MISSION shRNA library from Sigma–Aldrich and ligated using the Rapid DNA ligation kit (Sigma–Aldrich) into the pLKO.1 vector digested with AgeI and EcoRI. Supplementary Table 2 lists the shRNA sequences. Atoh8 and Bcl11b complementary DNA was amplified from MEFs and cloned into the pWPXLd expression vector at the BamH1 restriction site. For Atoh8 ChIP-Seq, AM-Tag was added at the carboxy terminal. Single guide RNA targeting Atoh8 (designed using the CRISPOR program) was cloned into the lentiCRISPRv2 plasmid at a BsmBI restriction site. The single guide RNA sequences are listed in Supplementary Table 2. The pWPIR H-ras G12V and cyclin E plasmids were kindly supplied by the laboratory of A. Puisieux. Tet-O-FUW-Brn2, Tet-O-FUW-Ascl1, Tet-O-FUW-Myt1l, FUW-TetO-Sall4, FUW-TetO-Nanog, FUW-TetO-Esrrb, FUW-TetO-Lin28 and FUdeltaGW-rtTA plasmids were purchased from Addgene.

Cell culture and viral production

MEF, mouse adult ear fibroblast and HDF medium consisted of DMEM supplemented with 10% foetal bovine serum (FBS), 100 U ml−1 penicillin–streptomycin, 1 mM sodium pyruvate, 2 mM l-glutamine, 0.1 mM non-essential amino acids (NEAAs) and 0.1 mM β-mercaptoethanol. T lymphocytes from the spleen of Bcl11b conditional KO mice were isolated using the Pan T Cell Isolation Kit (Miltenyi Biotec) according to the manufacturer’s instructions after removal of red blood cells by NH4Cl treatment. T cells were grown in RPMI medium supplemented with 10% FBS, 100 U ml−1 penicillin–streptomycin, 1 mM sodium pyruvate, 2 mM l-glutamine, 0.1 mM NEAAs, 10 mM HEPES, 0.1 mM β-mercaptoethanol, 10 ng ml−1 interleukin-2 and anti-CD3/CD28.

pMXs-based retroviral vectors were generated with Plat-E cells (a retroviral packaging cell line constitutively expressing gag, pol and env genes). Briefly, calcium phosphate transfection of the vectors was performed with the CalPhos Mammalian Transfection kit (Ozyme) in 10-cm dishes. The medium was replaced with 10 ml MEF medium after 7 h of incubation. The lentivirus-containing supernatants were collected 48 h later and stored at −80 °C. 293FT cells, grown in MEF medium, were used to produce lentiviral particles. The vectors were transfected along with plasmids encoding the envelope G glycoprotein of the vesicular stomatitis virus and Gag-Pol.

Pluripotent reprogramming

For doxycycline-induced reprogramming, reprogrammable R26rtTA;Col1a14F2A;Pou5f1-EGFP MEFs within three passages were plated in six-well plates at 80,000–100,000 cells per well in MEF medium. The following day, cells were infected overnight with shRNA- or single guide RNA (sgRNA)-carrying lentiviral stocks in the presence of 8 μg ml−1 polybrene. The medium was then replaced with fresh medium with 2 μg ml−1 doxycycline. MEFs were reseeded 72 h after infection on 0.1% gelatin-coated plates in iPSC medium (DMEM containing 15% KnockOut Serum Replacement, 1,000 U ml−1 leukaemia inhibitory factor, 100 U ml−1 penicillin–streptomycin, 1 mM sodium pyruvate, 2 mM l-glutamine, 0.1 mM NEAAs and 0.1 mM β-mercaptoethanol) at equal densities for each condition to normalize the potential effect of differential MEF proliferation on reprogramming efficiency. Several densities were tested (15,000–68,000 cells per cm2). Every day, the medium was either replaced or supplemented with doxycycline-containing fresh medium. Once the iPS colonies were macroscopically visible, Pou5f1-EGFP+ colonies were counted under an Axiovert 200M microscope and AP staining was performed using the Leukocyte Alkaline Phosphatase Kit (Sigma–Aldrich). Alternatively, MEFs were co-infected with OSKM retroviral vectors 48 h after lentiviral infections and cultured identically thereafter. SNEL infection (Sall4, Nanog, Esrrb and Lin28) or OSK + IWP2 (2 µM for the first 3 d of reprogramming) were also alternatively used to induce pluripotent reprogramming.

For human pluripotent reprogramming, HDFs (Sigma–Aldrich) were cultivated in MEF medium and infected with lentiviral sgRNA particles in the presence of 8 μg ml−1 polybrene. The following day, the medium was replaced with fresh medium. Two days after sgRNA infection, HDFs were infected with OSKM Sendai particles (CytoTune-iPS Sendai Reprogramming Kit, Life technologies) and the medium was replaced with fresh medium the following day and every other day until day 9. After 9 d, the cells were split onto vitronectin and the medium was changed to mTeSR medium (STEMCELL Technologies). After approximately 26 d, the colonies were SSEA4 live stained (GloLIVE Human Pluripotent Stem Cell Live Cell Imaging Kit, R&D Systems) and counted under an Axiovert 200M microscope. Alternatively, AP staining was performed with the Leukocyte Alkaline Phosphatase Kit (Sigma–Aldrich).

Oncogenic transformation

For oncogenic transformation, the LSL-K-rasG12D;R26-CREERT2 MEFs were similarly infected overnight with shRNA- or sgRNA-carrying lentiviral stocks in the presence of 8 μg ml−1 polybrene. After 48 h, the cells were co-infected overnight with shTrp53- and Myc-carrying viruses concomitantly with 4-hydroxitamoxifen treatment (1 µM) to induce K-rasG12D expression. Alternatively, the co-infection of shTrp53-, Myc- and H-rasG12V-carrying viruses was used in wild-type MEFs to initiate transformation. MEFs were reseeded 48 h post-infection in six-well plates at low density (500, 1,000 or 2,000 cells per well) in focus medium (MEF medium with 5% FBS) for the foci formation assay. The medium was then changed twice a week. After several passages of the cells derived from oncogenic transformation, soft agar assays were performed. Transformed cells were plated on an agarose-containing MEF medium layer at a density of 25,000–50,000 cells per six-well plate. Foci and soft agar colonies were stained 25–30 d later with a 0.5% cresyl violet solution in 20% methanol. shTrp53, H-rasG12V and cyclin E-expressing plasmids were also alternatively used in different combinations to induce oncogenic transformation.

Xenografts

Some 3 × 106 transformed cells were prepared in 100 µl PBS supplemented with 100 µl Matrigel and injected subcutaneously into immunocompromised SCID mice (n = 6 for each group). The volume of the tumour was then measured every 3 d until day 16.

Chick chorioallantoic membrane assay

Some 2.5 × 106 transformed cells were inoculated on the chorioallantoic membrane in the eggs of chick embryos at E11, where they formed a primary tumour. The size of the tumour was evaluated after 7 d. Numbers of replicates are indicated in the figure captions.

MEF-to-neuron transdifferentiation

Wild-type MEFs were co-infected with FUdeltaGW-rtTA and shRNA or sgRNA (control or targeting Atoh8 or Bcl11b) lentiviral plasmids at day −2 in the presence of 8 μg ml−1 polybrene. At day 0, the cells were co-infected with Tet-O-FUW-Brn2, -Ascl1 and -Myt1l lentiviral plasmids. The day after, the medium was replaced with fresh MEF medium supplemented with 2 μg ml−1 doxycycline. At day 3, the medium was replaced with fresh N3 medium consisting of DMEM/F12, 100 U ml−1 penicillin–streptomycin, 2.5 µg ml−1 insulin, 50 µg ml−1 apo-transferrin, 86.5 µg ml−1 sodium selenite, 6.4 ng ml−1 progesterone and 16 µg ml−1 putrescine supplemented with 2 μg ml−1 doxycycline. The medium was changed daily until day 7–8.

T lymphocyte pluripotent reprogramming

T cells were infected with OSKM retroviral vectors in the presence of 8 μg ml−1 polybrene the day after isolation for two consecutive days. At 4 h after infection, the medium was replaced with fresh T cell medium. At 3 d after the second infection, the cells were plated onto irradiated MEFs. The day after, the medium was replaced with iPSC medium supplemented with 10 ng ml−1 interleukin-2 and Dynabeads Human T-Activator CD3/CD28 (Life Technologies). The medium was changed every other day.

Immunofluorescence

Cells were fixed with 4% paraformaldehyde for 10 min at room temperature, washed three times with PBS, permeabilized with 0.1% Triton X-100 for 30 min at room temperature and blocked with 1% bovine serum albumin for 1 h. After incubation with primary antibodies overnight at 4 °C, the cells were washed three times with PBS and incubated with fluorophore-labelled appropriate secondary antibodies (Life Technologies). Live SSEA4 immunostaining was carried out with the GloLIVE Human Pluripotent Stem Cell Live Cell Imaging Kit (SC023B; R&D Systems). Acquisition was done with Axiovision 4.8.2 software. Supplementary Table 2 lists the antibody dilutions and secondary antibodies used.

RNA extraction and real-time quantitative PCR

Total RNAs were extracted using TRIzol reagent and 1 μg RNA was reverse transcribed with the RevertAid H Minus First Strand cDNA Synthesis Kit (Life Technologies). The quantitative PCR (qPCR) was performed with the LightCycler 480 SYBR Green I Master mix (Roche) on the LightCycler 96 machine (Roche) and LightCycler 4.1 software. Gapdh and Rplp0 were used as housekeeping genes. The qPCR primers are listed in Supplementary Table 2.

Chromatin immunoprecipitation

MEFs were infected with lentiviral particles carrying AM-tagged Atoh8. After 3 d, DNA was extracted, precipitated and purified using the Tag-ChIP-IT kit (53022; Active Motif). qPCRs were performed as described above. Bcl11b ChIP-Seq was performed using a combination of two antibodies on MEFs exogenously expressing Bcl11b cDNA.

Protein extraction and western blot

Cells were harvested in RIPA buffer (150 mM NaCl, 1% Triton, 0.5% deoxycholate, 0.1% SDS and 50 mM Tris (pH 8.0)) supplemented with protease inhibitors and phosphatase inhibitors. After 30 min on ice, lysis by sonication and then centrifugation for 10 min at 15,000g, supernatants were collected and proteins were denatured for 10 min at 95 °C in Laemmli sample buffer, separated on 4–15% polyacrylamide gel and transferred onto a nitrocellulose membrane. The membrane was blocked with 5% milk in Tris-buffered saline with 0.1% Tween 20 for 1 h then incubated with primary antibody at 4 °C overnight and secondary antibodies for 1 h at room temperature. Antigens were detected using ECL reagents. Data were acquired using Bio-Rad Image Lab Software. Band intensities were quantified using ImageJ and normalized to actin levels. The western blot antibodies are listed in Supplementary Table 2.

FACS

The following antibody was used: anti-mouse CD90.2 (Thy-1.2) APC (17-0902; eBioscience). Analysis was performed on a BD LSRFortessa with the FACSDiva version 8.0 and FlowJo version 10 software. Sorting was performed on a BD FACSAria. Apoptosis was measured using the FITC Annexin V/Dead Cell Apoptosis Kit (V13242; Invitrogen). For cell cycle analysis, the cells were fixed in 70% ethanol and stained with 40 µg ml−1 propidium iodide supplemented with 2 mg ml−1 RNase. The antibodies used in the study are listed in Supplementary Table 2.

Next-generation sequencing analyses

For bulk RNA-Seq, RNA quality was analysed using a 2100 Bioanalyzer (Agilent). RNA-Seq libraries were constructed and sequenced on an Illumina HiSeq 2000 by the cancer genomics platform on site. Fastq files were quality control checked with FASTQC (version 0.11.5). Reads from fastq files were mapped to a reference genome (GRCm38; Gencode) with STAR. The aligned reads were then converted to counts with STAR (version 2.5.2b). RNA-Seq analyses were done with the DESeq2 (version 1.30.1) package in R (version 4.0.3). For the MEF identity score calculation, we computed the fragments per kilobase of transcript per million mapped reads gene values with the DESeq2 package. Then, we used ssGSEA (GSVA R package version 1.44.0) analysis to generated the MEF identity score. Bulk ATAC-Seq and ChIP-Seq data were generated by the Active Motif company. For native Bcl11b ChIP-Seq, a combination of two antibodies was used: ab18465 (Abcam) and A300-385A (Bethyl Laboratories). For scRNA-Seq, cells were resuspended in PBS with 0.04% bovine serum albumin and the number of live cells was determined with a NucleoCounter NC-3000 (ChemoMetec) to obtain an expected cell recovery population of 5,000 cells per channel, loaded on a 10X chip and run on the Chromium Controller system (10X Genomics) according to the manufacturer’s instructions. scRNA-Seq libraries were generated with the Chromium Single Cell 3′ v3.1 assay (PN-1000121; 10X Genomics) and sequenced on the NovaSeq 6000 platform (S2 flowcell; Illumina) to obtain around 60,000 reads per cell. The Cell-Ranger Single-Cell Software Suite (version 3.0.2) was used to perform sample demultiplexing, alignment to the mouse genome, barcode assignment for each cell and gene counting by unique molecular identifier counts. Standard procedures for filtering, normalization, variable gene selection, dimensionality reduction and clustering were performed using R software version 4.0.3 (R packages SingleCellExperiment version 1.12.0 (ref. 65), scater version 1.18.6 (ref. 66) and scran version 1.18.5 (ref. 67)). Cells having <2,500 genes or a mitochondrial content >15% were excluded from the analysis. Counts were log normalized. The 1,500 most variable genes were used to reduce the dimensionality of the dataset by PCA. Based on the plot of variance explained, we kept the first six principal components for further analyses and summarized them using t-SNE. Clustering was conducted using a shared nearest neighbours graph (the buildSNNGraph function of the R package scran). Cluster-specific markers were computed with a t-test from the finderMarkers() function of the scran package67, where the combined P value of a gene is the maximum of all P values from all pairwise comparisons (pval = ”all”). Single-cell pseudotime trajectories were constructed with Slingshot36 and temporally expressed genes were identified using a general additive model (R package gam version 1.20). To create diffusion maps of the data, we used the function runDiffusionMap of the R package scater. Single-sample gene set enrichment activity scores (ssGSEA68) of the pathways were computed with the GSVA37 R package version 1.38.2. Calculation of the MEF identity score was conducted using two independent signatures from refs. 12,13. Some figures have been created using the Cerebro visualization tool version 1.3 (ref. 69).

For ATAC-Seq unsupervised hierarchical clustering, we used the pheatmap package to create a heatmap from the differentially accessible regions used to form the six clusters described above. Peak intensity labels are scaled row-wise and hierarchical clustering was performed using complete clustering. We have annotated each peak with the assigned cluster label for reference.

Atoh8 ChIP-Seq and ATAC-Seq datasets were aligned to the mouse reference genome assembly mm10 using Bowtie 2.1.0 using default parameters. Peak calling was performed using MACS 2.1.1. For experiments with replicates, BED files obtained after alignment were concatenated before MACS peak calling processing. Atoh8-specific sites were obtained by subtracting binding sites observed within the AM-tag processed control dataset (BEDTools 2.29.2). Enrichment heatmaps and mean density plots were obtained with seqMINER version 1.3.4 (ref. 70). De novo motif analysis was performed with MEME-ChIP (MEME Suite version 5.4.1)71. Read count enrichment signals were visualized with the IGV genome browser (version 2.4.15). Atoh8-centred chromatin state analysis was performed by intersecting Atoh8-specific binding sites with those associated with public data9,72 then inferring co-occuring events with ChromHMM version 1.14 (ref. 73).

Statistics and reproducibility

No statistical methods were used to predetermine sample sizes but our sample sizes are similar to those reported in previous publications74. Data distribution was assumed to be normal but this was not formally tested. No randomization was used. Data collection and analysis were not performed blind to the conditions of the experiments. For the single-cell experiments presented in Figs. 1 and 2, two biological replicates were conducted and run together in a single sequencing experiment. Western blot quantifications were performed with ImageJ. Statistical analyses of mean and variance were performed with Prism 8 (GraphPad Software) and the statistical tests are indicated. No data points were excluded. For western blots, three independent experiments gave similar results.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.