A molecular roadmap for the emergence of early-embryonic-like cells in culture

  • Nature Genetics volume 50, pages 106–119 (2018)
  • doi:10.1038/s41588-017-0016-5


Unlike pluripotent cells, which generate only embryonic tissues, totipotent cells can generate a full organism, including extra-embryonic tissues. A rare population of cells resembling 2-cell-stage embryos arises in pluripotent embryonic stem (ES) cell cultures. These 2-cell-like cells display molecular features of totipotency and broader developmental plasticity. However, their specific nature and the process through which they arise remain outstanding questions. Here we identified intermediate cellular states and molecular determinants during the emergence of 2-cell-like cells. By deploying a quantitative single-cell expression approach, we identified an intermediate population characterized by expression of the transcription factor ZSCAN4 as a precursor of 2-cell-like cells. By using a small interfering RNA (siRNA) screen, we identified epigenetic regulators of 2-cell-like cell emergence, including the non-canonical PRC1 complex PRC1.6 and the EP400–TIP60 complex. Our data shed light on the mechanisms that underlie exit from the ES cell state toward the formation of early-embryonic-like cells in culture and identify key epigenetic pathways that promote this transition.


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


  1. Ishiuchi, T. & Torres-Padilla, M. E. Towards an understanding of the regulatory mechanisms of totipotency. Curr. Opin. Genet. Dev. 23, 512–518 (2013).
  2. Surani, M. A., Hayashi, K. & Hajkova, P. Genetic and epigenetic regulators of pluripotency. Cell 128, 747–762 (2007).
  3. Wu, G. & Schöler, H. R. Lineage segregation in the totipotent embryo. Curr. Top. Dev. Biol. 117, 301–317 (2016).
  4. Nichols, J. & Smith, A. The origin and identity of embryonic stem cells. Development 138, 3–8 (2011).
  5. Tarkowski, A. K. Experiments on the development of isolated blastomeres of mouse eggs. Nature 184, 1286–1287 (1959).
  6. Tarkowski, A. K. & Wróblewska, J. Development of blastomeres of mouse eggs isolated at the 4- and 8-cell stage. J. Embryol. Exp. Morphol. 18, 155–180 (1967).
  7. Tsunoda, Y. & McLaren, A. Effect of various procedures on the viability of mouse embryos containing half the normal number of blastomeres. J. Reprod. Fertil. 69, 315–322 (1983).
  8. Evans, M. J. & Kaufman, M. H. Establishment in culture of pluripotential cells from mouse embryos. Nature 292, 154–156 (1981).
  9. Smith, A. G. et al. Inhibition of pluripotential embryonic stem cell differentiation by purified polypeptides. Nature 336, 688–690 (1988).
  10. Mitsui, K. et al. The homeoprotein Nanog is required for maintenance of pluripotency in mouse epiblast and ES cells. Cell 113, 631–642 (2003).
  11. Chambers, I. et al. Functional expression cloning of Nanog, a pluripotency sustaining factor in embryonic stem cells. Cell 113, 643–655 (2003).
  12. Schöler, H. R., Hatzopoulos, A. K., Balling, R., Suzuki, N. & Gruss, P. A family of octamer-specific proteins present during mouse embryogenesis: evidence for germline-specific expression of an Oct factor. EMBO J. 8, 2543–2550 (1989).
  13. Canham, M. A., Sharov, A. A., Ko, M. S. & Brickman, J. M. Functional heterogeneity of embryonic stem cells revealed through translational amplification of an early endodermal transcript. PLoS Biol. 8, e1000379 (2010).
  14. Chambers, I. et al. Nanog safeguards pluripotency and mediates germline development. Nature 450, 1230–1234 (2007).
  15. Hayashi, K., de Sousa Lopes, S. M. C., Tang, F., Lao, K. & Surani, M. A. Dynamic equilibrium and heterogeneity of mouse pluripotent stem cells with distinct functional and epigenetic states. Cell Stem Cell 3, 391–401 (2008).
  16. Kalmar, T. et al. Regulated fluctuations in Nanog expression mediate cell fate decisions in embryonic stem cells. PLoS Biol. 7, e1000149 (2009).
  17. Toyooka, Y., Shimosato, D., Murakami, K., Takahashi, K. & Niwa, H. Identification and characterization of subpopulations in undifferentiated ES cell culture. Development 135, 909–918 (2008).
  18. Torres-Padilla, M. E. & Chambers, I. Transcription factor heterogeneity in pluripotent stem cells: a stochastic advantage. Development 141, 2173–2181 (2014).
  19. Martinez Arias, A. & Brickman, J. M. Gene expression heterogeneities in embryonic stem cell populations: origin and function. Curr. Opin. Cell Biol. 23, 650–656 (2011).
  20. Morgani, S. M. et al. Totipotent embryonic stem cells arise in ground-state culture conditions. Cell Rep. 3, 1945–1957 (2013).
  21. Marks, H. et al. The transcriptional and epigenomic foundations of ground-state pluripotency. Cell 149, 590–604 (2012).
  22. Alexandrova, S. et al. Selection and dynamics of embryonic stem cell integration into early mouse embryos. Development 143, 24–34 (2016).
  23. Martin Gonzalez, J. et al. Embryonic stem cell culture conditions support distinct states associated with different developmental stages and potency. Stem Cell Rep. 7, 177–191 (2016).
  24. Macfarlan, T. S. et al. Embryonic stem cell potency fluctuates with endogenous retrovirus activity. Nature 487, 57–63 (2012).
  25. Falco, G. et al. Zscan4: a novel gene expressed exclusively in late 2-cell embryos and embryonic stem cells. Dev. Biol. 307, 539–550 (2007).
  26. Bošković, A. et al. Higher chromatin mobility supports totipotency and precedes pluripotency in vivo. Genes Dev. 28, 1042–1047 (2014).
  27. Ishiuchi, T. et al. Early-embryonic-like cells are induced by downregulating replication-dependent chromatin assembly. Nat. Struct. Mol. Biol. 22, 662–671 (2015).
  28. Grün, D. & van Oudenaarden, A. Design and analysis of single-cell sequencing experiments. Cell 163, 799–810 (2015).
  29. Etzrodt, M., Endele, M. & Schroeder, T. Quantitative single-cell approaches to stem cell research. Cell Stem Cell 15, 546–558 (2014).
  30. Buganim, Y. et al. Single-cell expression analyses during cellular reprogramming reveal an early stochastic and a late hierarchic phase. Cell 150, 1209–1222 (2012).
  31. Guo, G. et al. Mapping cellular hierarchy by single-cell analysis of the cell surface repertoire. Cell Stem Cell 13, 492–505 (2013).
  32. Leitch, H. G. et al. Naive pluripotency is associated with global DNA hypomethylation. Nat. Struct. Mol. Biol. 20, 311–316 (2013).
  33. Ficz, G. et al. FGF signaling inhibition in ES cells drives rapid genome-wide demethylation to the epigenetic ground state of pluripotency. Cell Stem Cell 13, 351–359 (2013).
  34. Habibi, E. et al. Whole-genome bisulfite sequencing of two distinct interconvertible DNA methylomes of mouse embryonic stem cells. Cell Stem Cell 13, 360–369 (2013).
  35. Zalzman, M. et al. Zscan4 regulates telomere elongation and genomic stability in ES cells. Nature 464, 858–863 (2010).
  36. Amano, T. et al. Zscan4 restores the developmental potency of embryonic stem cells. Nat. Commun. 4, 1966 (2013).
  37. Hirata, T. et al. Zscan4 transiently reactivates early embryonic genes during the generation of induced pluripotent stem cells. Sci. Rep. 2, 208 (2012).
  38. Eckersley-Maslin, M. A. et al. MERVL–Zscan4 network activation results in transient genome-wide DNA demethylation of mESCs. Cell Rep. 17, 179–192 (2016).
  39. Cahan, P. & Daley, G. Q. Origins and implications of pluripotent stem cell variability and heterogeneity. Nat. Rev. Mol. Cell Biol. 14, 357–368 (2013).
  40. Wray, J. et al. Inhibition of glycogen synthase kinase-3 alleviates Tcf3 repression of the pluripotency network and increases embryonic stem cell resistance to differentiation. Nat. Cell Biol. 13, 838–845 (2011).
  41. Fazzio, T. G., Huff, J. T. & Panning, B. An RNAi screen of chromatin proteins identifies Tip60–p400 as a regulator of embryonic stem cell identity. Cell 134, 162–174 (2008).
  42. Hisada, K. et al. RYBP represses endogenous retroviruses and preimplantation- and germ-line-specific genes in mouse embryonic stem cells. Mol. Cell. Biol. 32, 1139–1149 (2012).
  43. Suzuki, A. et al. Loss of MAX results in meiotic entry in mouse embryonic and germline stem cells. Nat. Commun. 7, 11056 (2016).
  44. Aloia, L., Di Stefano, B. & Di Croce, L. Polycomb complexes in stem cells and embryonic development. Development 140, 2525–2534 (2013).
  45. Schwartz, Y. B. & Pirrotta, V. A new world of Polycombs: unexpected partnerships and emerging functions. Nat. Rev. Genet. 14, 853–864 (2013).
  46. Gao, Z. et al. PCGF homologs, CBX proteins and RYBP define functionally distinct PRC1 family complexes. Mol. Cell 45, 344–356 (2012).
  47. Levine, S. S. et al. The core of the Polycomb repressive complex is compositionally and functionally conserved in flies and humans. Mol. Cell. Biol. 22, 6070–6078 (2002).
  48. Ogawa, H., Ishiguro, K., Gaubatz, S., Livingston, D. M. & Nakatani, Y. A complex with chromatin modifiers that occupies E2F- and Myc-responsive genes in G0 cells. Science 296, 1132–1136 (2002).
  49. Zhao, W. et al. Essential role for Polycomb group protein Pcgf6 in embryonic stem cell maintenance and a noncanonical Polycomb repressive complex 1 (PRC1) integrity. J. Biol. Chem. 292, 2773–2784 (2017).
  50. Sánchez, C. et al. Proteomics analysis of Ring1B–Rnf2 interactors identifies a novel complex with the Fbxl10 (Jhdm1B) histone demethylase and the Bcl6-interacting co-repressor. Mol. Cell. Proteomics 6, 820–834 (2007).
  51. Macfarlan, T. S. et al. Endogenous retroviruses and neighboring genes are coordinately repressed by LSD1 (KDM1A). Genes Dev. 25, 594–607 (2011).
  52. Peaston, A. E. et al. Retrotransposons regulate host genes in mouse oocytes and preimplantation embryos. Dev. Cell 7, 597–606 (2004).
  53. De Iaco, A. et al. DUX-family transcription factors regulate zygotic genome activation in placental mammals. Nat. Genet. 49, 941–945 (2017).
  54. Hendrickson, P. G. et al. Conserved roles of mouse DUX and human DUX4 in activating cleavage-stage genes and MERVL/HERVL retrotransposons. Nat. Genet. 49, 925–934 (2017).
  55. Wu, J. et al. The landscape of accessible chromatin in mammalian preimplantation embryos. Nature 534, 652–657 (2016).
  56. Xu, Y. et al. The p400 ATPase regulates nucleosome stability and chromatin ubiquitination during DNA repair. J. Cell Biol. 191, 31–43 (2010).
  57. Pradhan, S. K. et al. EP400 deposits H3.3 into promoters and enhancers during gene activation. Mol. Cell 61, 27–38 (2016).
  58. Eid, A. & Torres-Padilla, M. E. Characterization of noncanonical Polycomb repressive complex 1 subunits during early mouse embryogenesis. Epigenetics 11, 389–397 (2016).
  59. Miyanari, Y. & Torres-Padilla, M. E. Control of ground-state pluripotency by allelic regulation of Nanog. Nature 483, 470–473 (2012).
  60. Guo, G. et al. Resolution of cell fate decisions revealed by single-cell gene expression analysis from zygote to blastocyst. Dev. Cell 18, 675–685 (2010).
  61. Stacklies, W., Redestig, H., Scholz, M., Walther, D. & Selbig, J. pcaMethods—a Bioconductor package providing PCA methods for incomplete data. Bioinformatics 23, 1164–1167 (2007).
  62. Burton, A. et al. Single-cell profiling of epigenetic modifiers identifies PRDM14 as an inducer of cell fate in the mammalian embryo. Cell Rep. 5, 687–701 (2013).
  63. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
  64. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
  65. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
  66. Liu, Z. & Kraus, W. L. Catalytic-independent functions of PARP-1 determine Sox2 pioneer activity at intractable genomic loci. Mol. Cell 65, 589–603 (2017).
  67. de Dieuleveult, M. et al. Genome-wide nucleosome specificity and function of chromatin remodellers in ES cells. Nature 530, 113–116 (2016).
  68. Kundu, S. et al. Polycomb repressive complex 1 generates discrete compacted domains that change during differentiation. Mol. Cell 65, 432–446 (2017).
  69. Farcas, A. M. et al. KDM2B links the Polycomb repressive complex 1 (PRC1) to recognition of CpG islands. eLife 1, e00205 (2012).
  70. Morey, L., Aloia, L., Cozzuto, L., Benitah, S. A. & Di Croce, L. RYBP and Cbx7 define specific biological functions of Polycomb complexes in mouse embryonic stem cells. Cell Rep. 3, 60–69 (2013).
  71. Krepelova, A., Neri, F., Maldotti, M., Rapelli, S. & Oliviero, S. Myc and Max genome-wide binding sites analysis links the Myc regulatory network with the Polycomb and the core pluripotency networks in mouse embryonic stem cells. PLoS One 9, e88933 (2014).
  72. Ramachandran, P., Palidwor, G. A., Porter, C. J. & Perkins, T. J. MaSC: mappability-sensitive cross-correlation for estimating mean fragment length of single-end short-read sequencing data. Bioinformatics 29, 444–450 (2013).
  73. Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
  74. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
  75. Chung, D. et al. Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data. PLoS Comput. Biol. 7, e1002111 (2011).



We thank A. Smith (Wellcome Trust/MRC Stem Cell Institute) for providing the knock-in REX1 reporter cell line, M. Ko (Keio University) for the Zscan4c promoter plasmid, R. Enriquez-Gasca for providing a classification of MERVLs before publication, D. Reinberg (New York University Langone School of Medicine) for the rabbit antibody to PRDM14, A. Ettinger for time-lapse analysis, C. Ebel, D. Pich, T. Hofer and W. Hammerschmidt for help and access to FACS, the INGESTEM infrastructure for access to the IGBMC high-throughput high-content screening workstation, C. Thibault, F. Recillas-Targa and M. Zurita-Ortega for helpful discussions and A. Burton for critical reading of the manuscript. M.-E.T.-P. acknowledges funding from EpiGeneSys NoE, ERC-Stg ‘NuclearPotency’ (280840), the EMBO Young Investigator Programme, the Fondation Schlumberger pour l’Education et la Recherche (2016-Torres-Padilla) and the Helmholtz Association. J.M.V. acknowledges funding from the Max Planck Society and EpiGeneSys NoE. T.I. was a recipient of postdoctoral fellowships from the Uehara Memorial Foundation and the Human Frontier Science Programme (LT000015/2012-l). D.R.-T. was partially supported by a DGECI fellowship (2890/2014) from the National University of Mexico.

Author information

Author notes

    • Diego Rodriguez-Terrones & Xavier Gaume

    These authors contributed equally to this work.

    • Takashi Ishiuchi

    Present address: Division of Epigenetics and Development, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Japan



  1. Institute of Epigenetics and Stem Cells (IES), Helmholtz Zentrum München, Munich, Germany

    • Diego Rodriguez-Terrones, Xavier Gaume, Takashi Ishiuchi, Audrey Penning & Maria-Elena Torres-Padilla

  2. Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS–INSERM, U964, Strasbourg, France

    • Diego Rodriguez-Terrones, Xavier Gaume, Amélie Weiss, Arnaud Kopp & Laurent Brino

  3. Max Planck Institute for Molecular Biomedicine, Münster, Germany

    • Kai Kruse & Juan M. Vaquerizas

  4. Faculty of Biology, Ludwig Maximilians Universität, Munich, Germany

    • Maria-Elena Torres-Padilla




D.R.-T., X.G. and T.I. designed, performed and analyzed experiments. D.R.-T. performed most of the computational analyses. A.W. performed the screen together with X.G., under the supervision of L.B. A.K. implemented the screening analysis pipeline with L.B. K.K. performed bioinformatics analysis under the supervision of J.M.V. A.P. performed experiments for screen validation. M.-E.T.-P. designed and supervised the study. All authors contributed to manuscript preparation and read, commented on and approved the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Maria-Elena Torres-Padilla.

Integrated supplementary information

  1. Supplementary Figure 1 Controls for the single-cell expression profiling experiments of ES and 2-cell-like cells

    a, List of genes selected for the single-cell analysis classified according to their pathway or function. b, Immunofluorescence analysis using antibodies to turboGFP and OCT4 in the 2C::turboGFP cell line before and after sorting out 2-cell-like cells as indicated in Fig. 1b. Scale bar, 100 μm. c, Scatterplots of turboGFP fluorescence versus tdTomato fluorescence for feeder cells only (bottom), WT ES cells and feeder cells (middle), and the 2C::turboGFP/CAG-tdTomato reporter line with feeders (top) assayed by FACS. The presence of constitutively expressed NLS-tdTomato in the reporter line allows efficient discrimination from feeder cells. d, Normalized Ct values for the ERCC-943 spike-in comparing turboGFP− and turboGFP+ cells. Note that the turboGFP− and turboGFP+ cells analyzed in these plots come from independent sample preparation experiments but were processed on the same Biomark chip. Because the two groups both exhibit constant expression that is highly similar for ERCC-943, we conclude that they were normalized properly and that their expression levels are therefore comparable. Boxes indicate 25% and 75% quartiles, and the whiskers extend to 1.5 times the interquartile range. e, Graphic interpretation of the features contrasted across the first three principal components of the principal-component analyses shown in Figs. 1–3. f,g, Different viewpoints of the principal-component analysis of the ES and 2-cell-like single-cell dataset. This PCA was computed without the expression data from turboGFP and Zscan4. Each point corresponds to a single cell and is color-coded based on the original expression level of turboGFP (f) or Zscan4c/d/f (g) as indicated on the right. Black dots indicate no expression.

  2. Supplementary Figure 2 Zscan4+ cells are an intermediate cellular state between the ES and 2-cell-like states

    a, Accuracy of the Zscan4c::tdTomato reporter cell line used for the single-cell profiling described in Fig. 2. The graph shows the number of tdTomato+ cells that scored positive as assessed by FACS in relation to whether they belong to ES cells (no Zscan4c/d/f transcripts detected), Zscan4+ cells (Zscan4c/d/f transcripts detected) or 2-cell-like cells (Zscan4c/d/f and turboGFP transcripts detected). b,c, Principal-component projection of all datasets combined. Principal components were calculated for the aggregate of the ES, Zscan4+ and 2-cell-like datasets (Figs. 1 and 2), unlike the analyses in Figs. 2 and 3, where the Zscan4+ dataset (from Fig. 2) was projected onto the principal components of the ES and 2-cell-like datasets. In b, turboGFP and Zscan4c/d/f were omitted from the calculation of the principal components. d,e, Validation of the Zscan4c::tdTomato and 2C::turboGFP cell line used for the time-lapse analysis in Fig. 2e. d, Representative immunostaining for mCherry and Zscan4 (top) and mCherry and turboGFP (bottom) from three independent cell cultures. e, Quantification of the percentage of (endogenous) ZSCAN4+ cells that also express mCherry. The reporter recapitulates endogenous expression of ZSCAN4 protein with ~92% accuracy. Error bars, s.d. Scale bar, 10 μm. f,g, Zscan4− and Zscan4+ cells were FACS sorted based on the Zscan4::mCherry reporter and cultured for 24 h, after which the percentage of turboGFP+ cells was quantified by FACS. Shown are the means ± s.d. of four independent experiments. During the 24-h window, 4% of the Zscan4+ cell population became 2-cell-like cells, 63% remained Zscan4+ cells and 33% lost Zscan4 reporter expression. h, Heat maps showing ATAC–seq signal intensity over 1,911 genomic regions with different accessibility in ES and 2-cell-like cells.

  3. Supplementary Figure 3 Gradual transcriptional changes accompany Zscan4 upregulation and precede entry to the 2-cell-like state

    a, The graph combines two parameters: the line (left y axis) depicts probability density and the histogram under it (right y axis) refers to absolute frequency of occurrence. The probability density function of Zscan4c/d/f expression in ES (blue), Zscan4+ (orange) and 2-cell-like (green) cells is plotted against the normalized expression of Zscan4c/d/f (x axis) in each individual cell. These three distinct levels were classified as low, mid and high based on the histogram data, which derive from the Biomark analysis. b,c, Projection of the expression profiles of Zscan4+ cells onto the principal components of the ES and 2-cell-like cell dataset (Fig. 1d). Each dot represents a single cell and is color-coded according to whether it corresponds to an ES cell, a Zscan4 low, Zscan4 mid or Zscan4 high cell, or a 2-cell-like cell according to the legend on the right. In c, cells are colored based on their expression levels of Zscan4c/d/f as indicated on the right. Black indicates no expression. d, Density plots for Zscan4c/d/f and MT2_Mm based on single-cell RNA-seq data (ref. 39). Dotted lines represent the thresholds used to classify individual cells into ES cells, Zscan4 low, Zscan4 mid or Zscan4 high cells, and 2-cell-like cells. e, Violin plots for the MT2_Mm LTR, Zscan4c/d/f and two MERVL-driven chimeric genes in the single-cell RNA-seq dataset. f, MA plots showing significantly differentially expressed genes (red) for each transitional state analyzed from single-cell RNA-seq data. The list of differentially expressed genes for each transition is shown in Supplementary Table 9. g, Heat map showing a gradual transition in the expression profiles of cells transitioning between ES cells and 2-cell-like cells based on single-cell RNA-seq data.
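
The threshold-based classification in panel d can be sketched as follows. This is a minimal illustration only: the function name, the cutoff values and the rule that MT2_Mm expression above its threshold defines a 2-cell-like cell are assumptions made for the example, not values or rules stated in the legend.

```python
# Hypothetical cutoffs on normalized expression; the legend shows thresholds
# as dotted lines but does not give their numerical values.
ZSCAN4_CUTS = (1.0, 4.0, 8.0)   # assumed low / mid / high boundaries
MT2_MM_CUT = 5.0                # assumed MT2_Mm boundary for 2-cell-like cells


def classify_cell(zscan4, mt2_mm, zscan4_cuts=ZSCAN4_CUTS, mt2_cut=MT2_MM_CUT):
    """Assign one cell to a state from its Zscan4c/d/f and MT2_Mm expression.

    Assumption: a cell at or above the MT2_Mm cutoff is called 2-cell-like
    regardless of its Zscan4 level; all other cells are binned by Zscan4.
    """
    low, mid, high = zscan4_cuts
    if mt2_mm >= mt2_cut:
        return "2-cell-like"
    if zscan4 < low:
        return "ES"
    if zscan4 < mid:
        return "Zscan4 low"
    if zscan4 < high:
        return "Zscan4 mid"
    return "Zscan4 high"


# Made-up (Zscan4c/d/f, MT2_Mm) pairs, for illustration only.
example_cells = [(0.0, 0.0), (2.5, 0.3), (6.0, 1.0), (9.5, 2.0), (10.0, 7.5)]
example_states = [classify_cell(z, m) for z, m in example_cells]
```

With real data, the cutoffs would be read off the density plots rather than fixed in advance.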

  4. Supplementary Figure 4 Pluripotency transcription factors and the 2-cell-like state

    a, Scatterplot showing the fluorescence intensity measurements for OCT4 and ZSCAN4 in individual cells as judged by immunostaining. r depicts the Pearson correlation coefficient between OCT4 and ZSCAN4 expression for each group of cells, as indicated. b,c, Validation of the Rex1::EGFP and Zscan4::tdTomato cell line by immunofluorescence. A representative single confocal section from three independent cell cultures is shown. The Rex1 knock-in construct was validated previously (ref. 44). d, Density plot showing the gating parameters used for sorting the Rex1 high and Rex1 low cells in Fig. 4a. e,f, Violin and density plots showing the distribution of single-cell expression for Rex1 and Nanog. Note that in these plots ES cells were further classified into two groups according to whether they express high or low levels of Rex1, which highlights naive versus primed pluripotent states, as confirmed also by the abundance of Nanog transcripts in the same cells. g, Percentage of OCT4+, EGFP+ and ZSCAN4+ cells 48 h after transfection with siRNA for Oct4 or the scrambled control. Data shown are the means ± s.d. for three independent cell cultures. h, Percentage of EGFP+, ZSCAN4+ and 2-cell-like cells after transfection with Oct4, Nanog, Sox2 or Rex1 siRNA as compared to p150 siRNA and to the negative controls (NT and Neg). Transfection and analysis were performed as described in the Methods for the RNAi screen. Shown are the means ± s.d. from triplicate cell cultures. i, RT–qPCR analysis of MERVL and Zscan4 in the 2C::EGFP reporter cell line after transfection with the indicated siRNAs. Shown are the mean values ± s.d. of two independent cell cultures.

  5. Supplementary Figure 5 Sequential gene expression changes during the transition to the 2-cell-like state

    a, Violin plots showing the distribution of expression levels of individual cells for the indicated genes. Higher values correspond to higher expression levels, and a Ct value of 0 indicates that no amplification was detected. The median is indicated by a square. b, Schematic of significantly differentially expressed genes related to germline development between individual stages of the transition from the ES to the 2-cell-like state. Changes were considered significant if they exhibited at least a 2-fold change across cells between individual states and P < 0.05 (Mann–Whitney U test). The arrow indicates the direction (up or down) of the changes in gene expression.
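
The significance criterion in panel b (at least a 2-fold change and P < 0.05 by Mann–Whitney U test) can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' pipeline: expression values are assumed to be on a log2 scale, so a 2-fold change corresponds to a difference of 1 between group means, and the P value uses the normal approximation with tie correction and no continuity correction. All function names are hypothetical.

```python
import math
from statistics import mean


def _average_ranks(values):
    """Return 1-based ranks (ties get the average rank) and tie-group sizes."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U test P value (normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_sizes = _average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sigma == 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


def is_significant(state_a, state_b, min_log2_fc=1.0, alpha=0.05):
    """Apply both criteria, assuming log2-scale expression values."""
    log2_fc = mean(state_b) - mean(state_a)
    return abs(log2_fc) >= min_log2_fc and mann_whitney_p(state_a, state_b) < alpha


# Made-up per-cell expression values for one gene in two states.
state_a = [0.0, 0.5, 1.0, 1.2, 0.8, 0.3, 0.9, 1.1]
state_b = [3.0, 3.5, 4.0, 4.2, 3.8, 3.3, 3.9, 4.1]
changed = is_significant(state_a, state_b)
```

For small groups an exact test would be preferable to the normal approximation used here.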

  6. Supplementary Figure 6 Pipeline and controls for analysis of the primary siRNA screen

    a, Screening was based on nuclear segmentation, following DAPI staining, for which a representative image is shown. Nuclei were segmented based on DAPI intensity, and only nuclei that met the quality control were used for further analysis (blue outlines). Scale bars, 100 μm (left) and 5 μm (right). b, Box-and-whisker plots for the negative (non-transfected cells (NT; n = 45 wells) and negative-control-siRNA-transfected cells (neg; n = 270 wells)) and positive (p150-siRNA-transfected cells (p150; n = 270 wells)) controls from the primary screen. The percentage of EGFP+, ZSCAN4+ and OCT4+ cells was determined for each cell culture well. Two-cell-like cells were defined as cells fitting all three criteria, namely: positive for EGFP and ZSCAN4 but negative for OCT4. On the graphs, boxes indicate 25% and 75% quartiles, and the whiskers extend to 1.5 times the interquartile range. Outlier wells are not shown. c, Complete results from the primary screen depicting the z scores of the 1,167 targets (mean z score of triplicate wells for each target) relative to the negative controls. The positive control p150 is depicted in red. d, Analysis of cell toxicity, as inferred by cell number, elicited upon treatment with siRNA for the top 50 hits. The heat map displays the top 50 hits ranked by their ability to induce 2-cell-like cells (left) and the cell number per well upon siRNA transfection (right). Note that, because all siRNAs were transfected using the same number of cells, changes in cell number indicate cell death and/or cell growth defects resulting from RNAi.
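
The scoring described in panels b and c can be sketched as follows: a cell counts as 2-cell-like only if it is EGFP+ and ZSCAN4+ but OCT4−, and each target is summarized by the mean z score of its replicate wells relative to the negative-control wells. The sketch assumes a plain z score based on the sample standard deviation of the negative controls; the legend does not state whether plate-wise or robust normalization was used. Function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def two_cell_like_percentage(cells):
    """Percentage of nuclei scored EGFP+ and ZSCAN4+ but OCT4-.

    `cells` is a list of dicts with boolean keys 'egfp', 'zscan4', 'oct4'.
    """
    if not cells:
        return 0.0
    hits = sum(1 for c in cells if c["egfp"] and c["zscan4"] and not c["oct4"])
    return 100.0 * hits / len(cells)


def mean_z_score(target_wells, negative_wells):
    """Mean z score of a target's replicate wells versus negative controls."""
    mu = mean(negative_wells)
    sigma = stdev(negative_wells)  # sample standard deviation (assumption)
    return mean((w - mu) / sigma for w in target_wells)


# Made-up percentages of 2-cell-like cells per well, for illustration only.
negative_wells = [0.4, 0.5, 0.6, 0.5]
target_wells = [2.0, 2.2, 1.8]   # triplicate wells for one siRNA target
target_z = mean_z_score(target_wells, negative_wells)
```

A target whose wells contain more 2-cell-like cells than the negative controls receives a positive z score.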

  7. Supplementary Figure 7 Validation of the hits obtained in the primary screen by a secondary screen and identification of new hits

    a, Box-and-whisker plots representing the results of the secondary screen for the non-transfected cells (NT), scrambled-siRNA-transfected cells (Neg) and cells transfected with siRNA for p150 (positive control). n indicates the number of cell culture wells analyzed. Two-cell-like cells are defined as cells positive for EGFP and ZSCAN4 but negative for OCT4. Boxes indicate 25% and 75% quartiles, and the whiskers extend to 1.5 times the interquartile range. Outlier wells are not shown. The mean ± s.d. of two technical replicates is shown. b, Comparison of the primary and secondary screen results for three selected hits (Ep400, Dmap1 and Ring1b). Fold changes relative to the negative control are indicated. c, Validation of individual siRNAs from the siRNA pool for the top 50 hits. The top 50 hits from the primary screen were selected for validation in the secondary screen by transfecting four different individual siRNAs, and the effect of each individual siRNA on 2-cell-like cell emergence was assessed. The number of validated hits (z score > 2 as compared to the negative control) by 4, 3, 2 or 1 siRNA is depicted. Only one hit (Dnmt3b) from the primary screen was not validated by any of the four individual siRNAs. d, Representative random, inverted dynamics merged fields of view from the secondary screen for the indicated siRNAs as compared to the negative and positive controls. Scale bar, 500 μm. e, Percentage of EGFP+, OCT4+ and ZSCAN4+ cells and of 2-cell-like cells obtained in the secondary screen for Mga, Max, Rybp and Daxx as compared to the negative (NT and Neg) and positive (p150) controls. Mean values ± s.d. derived from triplicate cell cultures are shown.
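
The validation tally in panel c can be sketched as below: each hit is retested with four individual siRNAs, an siRNA validates the hit if its z score relative to the negative control exceeds 2, and hits are then grouped by how many of their siRNAs validated. The gene labels and z scores are placeholders invented for the example; only the z > 2 criterion comes from the legend.

```python
from collections import Counter

Z_THRESHOLD = 2.0  # from the legend: validated if z score > 2


def validated_count(sirna_z_scores, threshold=Z_THRESHOLD):
    """Number of individual siRNAs whose z score exceeds the threshold."""
    return sum(1 for z in sirna_z_scores if z > threshold)


def tally_validation(hits):
    """Map 'number of validating siRNAs' -> 'number of hits'.

    `hits` maps a hit name to the z scores of its individual siRNAs.
    """
    return Counter(validated_count(zs) for zs in hits.values())


# Placeholder hits with made-up z scores for four individual siRNAs each.
example_hits = {
    "hit_A": [5.1, 3.4, 2.6, 8.0],   # validated by 4 siRNAs
    "hit_B": [2.5, 0.7, 3.1, 1.9],   # validated by 2 siRNAs
    "hit_C": [0.2, -0.5, 1.1, 0.9],  # validated by none
}
example_tally = tally_validation(example_hits)
```

A hit with a count of zero corresponds to a primary-screen hit that no individual siRNA reproduced.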

  8. Supplementary Figure 8 Gene expression dynamics of the novel regulators of the 2-cell-like state in the preimplantation mouse embryo and 2-cell-like cells

    a, Heat map showing the changes in mRNA levels for the top 50 candidates in endogenous, p60-knockdown-induced and p150-knockdown-induced 2-cell-like cells. Log fold changes were calculated based on bulk RNA-seq data (ref. 27) and are color-coded relative to ES cells. Genes are ranked according to differential expression in endogenous 2-cell-like cells. b, Heat map showing unsupervised clustering of the relative expression levels of the top 50 candidates during early mouse development (zygote, early, mid and late 2-cell, 4-cell, 8-cell, 16-cell stages, and early, mid and late blastocyst stages). Protein names are color-coded according to the complex to which they belong. Expression data are derived from ref. 66. Notably, while most spliceosome proteins were enriched upon development to the morula stage, PRC1 subunits peaked in expression at different time points during the 2-cell stage, further suggesting that parallel pathways act in concert to restrict totipotent/2-cell/2-cell-like identity.

  9. Supplementary Figure 9 Characterization of 2-cell-like cell protein markers in 2-cell-like cells induced upon siRNA of the hits identified in the siRNA screen

    Characterization of 2-cell-like cell markers in 2-cell-like cells induced upon siRNA targeting of the hits identified in the siRNA screens. a,b, Immunostaining with an antibody against the protein from MERVL reveals expression of endogenous MERVL loci in cells expressing the 2C::EGFP reporter in controls (a) as well as in 2-cell-like cells induced upon siRNA targeting of the indicated chromatin modifiers (b). c, Immunostaining for ZSCAN4 and EGFP in the 2C::EGFP reporter ES cell line. Representative images from at least three independent cell cultures performed on different days are shown. Scale bars, 10 μm.

  10. Supplementary Figure 10 Characterization of 2-cell-like cell transcriptional markers in 2-cell-like cells induced upon siRNA of the hits identified in the siRNA screen

    a, Expression of 2-cell-like genes upon siRNA targeting of the identified hits. RT–qPCR analysis was performed for repetitive elements (top) and chimeric LTR transcripts (bottom) upon transfection with the indicated siRNAs. Shown are the mean values ± s.d. from four independent cell cultures performed on two different days. b, FACS analysis of the 2C::turboGFP and Zscan4::mCherry cell line after transfection with the indicated siRNAs individually or in pairs. Fold changes in turboGFP+, mCherry+ and double-positive (2-cell-like) cells are shown. The mean ± s.d. of the indicated number of cell cultures is shown.

  11. Supplementary Figure 11 PRC1.6 subunits negatively regulate the 2-cell-like state

    a, Schematic of PRC1 complexes identified in mammals. PRC1 complexes are divided into cPRC1 (canonical PRC1) (left) and ncPRC1 (non-canonical PRC1) (right). RING1a and RING1b interact with distinct PCGF proteins. PCGF2 and PCGF4 are present only in canonical PRC1 complexes (PRC1.2 and PRC1.4, respectively). PCGF1, PCGF3, PCGF5 and PCGF6 proteins associate with RYBP or YAF2 to form the non-canonical PRC1 complexes (PRC1.1, PRC1.3, PRC1.5 and PRC1.6, respectively). b, Two-cell-like cell induction after transfection with siRNAs for all PRC1 components. Results for the siRNA pools identified in the primary or secondary screen are shown as a z score. c,g, RT–qPCR analysis was performed to measure siRNA efficiency for Yaf2 and Ring1a (c) or Eed and Ezh2 (g) after transfection with the corresponding siRNAs as compared to scrambled siRNA in the 2C::EGFP reporter cell line. Mean values ± s.d. from four independent cell cultures performed on two different days are shown. d,h, Quantification of EGFP+ cells (%) by FACS after transfection with the indicated siRNAs. Shown are the means ± s.d. of the indicated number of cell cultures. e,f,i,j, Expression of 2-cell-like genes upon treatment with the indicated siRNAs. RT–qPCR analysis of MERVL (e,i) and Zscan4 (f,j) expression was performed in the 2C::EGFP reporter cell line after transfection with the indicated siRNAs. The mean values ± s.d. from four independent cell cultures performed on two different days are shown.

  12. Supplementary Figure 12 Two-cell-like cells are characterized by low levels of H2AK119Ub

    a, Immunostaining for OCT4, EGFP and H2AK119Ub in the 2C::EGFP reporter ES cell line depicting endogenous (Neg siRNA) as well as p60- and p150-knockdown-induced 2-cell-like cells. Representative single-section confocal images of at least three independent cell cultures are shown. Dashed white lines demarcate EGFP+ cells. Scale bar, 20 μm. b, H2AK119Ub levels in endogenous 2-cell-like cells and in 2-cell-like cells induced upon transfection with the indicated siRNAs. EGFP (top), OCT4 (middle) and H2AK119Ub (bottom) fluorescence was quantified in ES cells (blue, EGFP negative) and in 2-cell-like cells (red, EGFP positive). Each dot represents a single cell. Shown are raw values obtained in one representative experiment of three independent biological replicates performed on different days. c, Quantification of EGFP+ cells (fold change as compared to negative control Neg) by FACS after transfection with the indicated siRNAs in combination with Rex1 (left) or Nanog (right) siRNA. The mean ± s.d. of the indicated number of cell cultures is shown. d, RT–qPCR analysis of MERVL, Zscan4 and Gm6763 in the 2C::EGFP reporter cell line after transfection with the indicated siRNAs and/or overexpression of Nanog (OE Nanog). Expression of Pcgf6 (lower left), Dmap1 (lower middle) and Nanog (lower right) is shown as controls for siRNA and overexpression efficiency. The mean ± s.d. of the indicated number of cell cultures performed on different days is shown.

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–12, Supplementary Table 10 and Supplementary Note

  2. Life Sciences Reporting Summary

  3. Supplementary Table 1

    List of TaqMan assays.

  4. Supplementary Table 2

    Raw data from the Biomark expression analysis.

  5. Supplementary Table 3

    Significantly differentially expressed genes between transitional states, based on the Biomark expression data.

  6. Supplementary Table 4

    List of siRNA targets used in the library.

  7. Supplementary Table 5

    List of siRNAs used for validation and subsequent experiments.

  8. Supplementary Table 6

    List of all primers used in this study.

  9. Supplementary Table 7

    Results from primary screening.

  10. Supplementary Table 8

    Results from secondary screening.

  11. Supplementary Table 9

    Differentially expressed genes across each transitional state.

  12. Supplementary Video 1

    Embryonic stem cells transitioning to the 2-cell-like state, through an intermediate Zscan4+ state (example 1). Example video for the time-lapse experiments shown in Fig. 2. The destabilized 2C::tbGFP reporter is shown in green, the destabilized ZSCAN4::mCherry reporter is shown in red and the constitutively expressed H2B-iRFP marking all nuclei is shown in cyan.

  13. Supplementary Video 2

    Embryonic stem cells transitioning to the 2-cell-like state, through an intermediate Zscan4+ state (example 2). Example video for the time-lapse experiments shown in Fig. 2. The destabilized 2C::tbGFP reporter is shown in green, the destabilized ZSCAN4::mCherry reporter is shown in red and the constitutively expressed H2B-iRFP marking all nuclei is shown in cyan.

  14. Supplementary Video 3

    Embryonic stem cells transitioning to the 2-cell-like state, through an intermediate Zscan4+ state (example 3). Example video for the time-lapse experiments shown in Fig. 2. The destabilized 2C::tbGFP reporter is shown in green, the destabilized ZSCAN4::mCherry reporter is shown in red and the constitutively expressed H2B-iRFP marking all nuclei is shown in cyan.