Genome-wide characterization of the routes to pluripotency

A Corrigendum to this article was published on 17 June 2015

This article has been updated

Abstract

Somatic cell reprogramming to a pluripotent state continues to challenge many of our assumptions about cellular specification, and despite major efforts, we lack a complete molecular characterization of the reprogramming process. To address this gap in knowledge, we generated extensive transcriptomic, epigenomic and proteomic data sets describing the reprogramming routes leading from mouse embryonic fibroblasts to induced pluripotency. Through integrative analysis, we reveal that cells transition through distinct gene expression and epigenetic signatures and bifurcate towards reprogramming transgene-dependent and -independent stable pluripotent states. Early transcriptional events, driven by high levels of reprogramming transcription factor expression, are associated with widespread loss of histone H3 lysine 27 trimethylation (H3K27me3), representing a general opening of the chromatin state. Maintenance of high transgene levels leads to re-acquisition of H3K27me3 and a stable pluripotent state that is alternative to the embryonic stem cell (ESC)-like fate. Lowering transgene levels at an intermediate phase, however, guides the process to the acquisition of ESC-like chromatin and DNA methylation signatures. Our data provide a comprehensive molecular description of the reprogramming routes and are accessible through the Project Grandiose portal at http://www.stemformatics.org.

Figure 1: Multi-omics analysis of secondary reprogramming.
Figure 2: Molecular characterization of cell states during secondary reprogramming.
Figure 3: Dynamic features of chromatin remodelling in reprogramming cell states.
Figure 4: H3K4me3 dynamics define cell states.
Figure 5: Expression and regulation of lncRNAs during reprogramming.
Figure 6: Paths to F-class and ESC-like pluripotency.

Accession codes

Primary accessions

European Nucleotide Archive

Sequence Read Archive

Data deposits

Sequencing data have been deposited in the NCBI Sequence Read Archive (SRA) under accession number SRP046744 for all RNA-seq and ChIP-seq experiments, and in the European Bioinformatics Institute under the European Nucleotide Archive (ENA) accession number ERP004116 for MethylC-sequencing. The global and cell surface mass spectrometry proteomics raw data have been deposited in the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository under data set identifiers PXD000413 and PXD001456, respectively.

Change history

  • 10 December 2014

    A minor addition was made to the Acknowledgements in the HTML and PDF versions.

References

  1. Takahashi, K. & Yamanaka, S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676 (2006)

  2. Mikkelsen, T. S. et al. Dissecting direct reprogramming through integrative genomic analysis. Nature 454, 49–55 (2008)

  3. Graf, T. & Enver, T. Forcing cells to change lineages. Nature 462, 587–594 (2009)

  4. Tonge, P. D. et al. Divergent reprogramming routes lead to alternative stem-cell states. Nature http://dx.doi.org/10.1038/nature14047 (this issue)

  5. Samavarchi-Tehrani, P. et al. Functional genomics reveals a BMP-driven mesenchymal-to-epithelial transition in the initiation of somatic cell reprogramming. Cell Stem Cell 7, 64–77 (2010)

  6. Polo, J. M. et al. A molecular roadmap of reprogramming somatic cells into iPS cells. Cell 151, 1617–1632 (2012)

  7. Golipour, A. et al. A late transition in somatic cell reprogramming requires regulators distinct from the pluripotency network. Cell Stem Cell 11, 769–782 (2012)

  8. O’Malley, J. et al. High-resolution analysis with novel cell-surface markers identifies routes to iPS cells. Nature 499, 88–91 (2013)

  9. Nagy, A. Secondary cell reprogramming systems: as years go by. Curr. Opin. Genet. Dev. 23, 534–539 (2013)

  10. Woltjen, K. et al. piggyBac transposition reprograms fibroblasts to induced pluripotent stem cells. Nature 458, 766–770 (2009)

  11. Buganim, Y. et al. Single-cell expression analyses during cellular reprogramming reveal an early stochastic and a late hierarchic phase. Cell 150, 1209–1222 (2012)

  12. Belteki, G. et al. Conditional and inducible transgene expression in mice through the combinatorial use of Cre-mediated recombination and tetracycline induction. Nucleic Acids Res. 33, e51 (2005)

  13. Wells, C. A. et al. Stemformatics: visualisation and sharing of stem cell gene expression. Stem Cell Res. 10, 387–395 (2013)

  14. Clancy, J. L. et al. Small RNA changes en route to distinct cellular states of induced pluripotency. Nature Commun. http://dx.doi.org/10.1038/ncomms6522 (2014)

  15. Benevento, M. et al. Proteome adaptation in cell reprogramming proceeds via distinct transcriptional networks. Nature Commun. http://dx.doi.org/10.1038/ncomms6613 (2014)

  16. Polo, J. M. et al. Cell type of origin influences the molecular and functional properties of mouse induced pluripotent stem cells. Nature Biotechnol. 28, 848–855 (2010)

  17. Ohi, Y. et al. Incomplete DNA methylation underlies a transcriptional memory of somatic cells in human iPS cells. Nature Cell Biol. 13, 541–549 (2011)

  18. Schug, J. et al. Promoter features related to tissue specificity as measured by Shannon entropy. Genome Biol. 6, R33 (2005)

  19. Li, R. et al. A mesenchymal-to-epithelial transition initiates and is required for the nuclear reprogramming of mouse fibroblasts. Cell Stem Cell 7, 51–63 (2010)

  20. Kojima, Y. et al. The transcriptional and functional properties of mouse epiblast stem cells resemble the anterior primitive streak. Cell Stem Cell 14, 107–120 (2014)

  21. Li, B., Carey, M. & Workman, J. L. The role of chromatin during transcription. Cell 128, 707–719 (2007)

  22. Simon, J. A. & Kingston, R. E. Occupying chromatin: polycomb mechanisms for getting to genomic targets, stopping transcriptional traffic, and staying put. Mol. Cell 49, 808–824 (2013)

  23. Mansour, A. A. et al. The H3K27 demethylase Utx regulates somatic and germ cell epigenetic reprogramming. Nature 488, 409–413 (2012)

  24. Pereira, C. F. et al. ESCs require PRC2 to direct the successful reprogramming of differentiated cells toward pluripotency. Cell Stem Cell 6, 547–556 (2010)

  25. Wong, J. J.-L. et al. Orchestrated intron retention regulates normal granulocyte differentiation. Cell 154, 583–595 (2013)

  26. Fadloun, A. et al. Chromatin signatures and retrotransposon profiling in mouse embryos reveal regulation of LINE-1 by RNA. Nature Struct. Mol. Biol. 20, 332–338 (2013)

  27. Tang, S.-J. Chromatin organization by repetitive elements (CORE): a genomic principle for the higher-order structure of chromosomes. Genes 2, 502–515 (2011)

  28. Lunyak, V. V. et al. Developmentally regulated activation of a SINE B2 repeat as a domain boundary in organogenesis. Science 317, 248–251 (2007)

  29. Rebollo, R., Romanish, M. T. & Mager, D. L. Transposable elements: an abundant and natural source of regulatory sequences for host genes. Annu. Rev. Genet. 46, 21–42 (2012)

  30. Bernstein, B. E. et al. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell 125, 315–326 (2006)

  31. Mikkelsen, T. S. et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553–560 (2007)

  32. Jørgensen, H. F. et al. Stem cells primed for action: polycomb repressive complexes restrain the expression of lineage-specific regulators in embryonic stem cells. Cell Cycle 5, 1411–1414 (2006)

  33. Voigt, P. et al. Asymmetrically modified nucleosomes. Cell 151, 181–193 (2012)

  34. Schmitges, F. W. et al. Histone methylation by PRC2 is inhibited by active chromatin marks. Mol. Cell 42, 330–341 (2011)

  35. Yuan, W. et al. H3K36 methylation antagonizes PRC2-mediated H3K27 methylation. J. Biol. Chem. 286, 7983–7989 (2011)

  36. Voigt, P., Tee, W. W. & Reinberg, D. A double take on bivalent promoters. Genes Dev. 27, 1318–1338 (2013)

  37. Lee, D.-S. et al. DNA methylation as a reprogramming modulator: an epigenomic roadmap to induced pluripotency. Nature Commun. http://dx.doi.org/10.1038/ncomms6619 (2014)

  38. Guttman, M. et al. Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. Nature Biotechnol. 28, 503–510 (2010)

  39. Cabili, M. N. et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 25, 1915–1927 (2011)

  40. Khalil, A. M. et al. Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc. Natl Acad. Sci. USA 106, 11667–11672 (2009)

  41. Guttman, M. et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature 458, 223–227 (2009)

  42. Guttman, M. et al. lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature 477, 295–300 (2011)

  43. Behringer, R. R., Gertsenstein, M., Nagy-Vintersten, K. & Nagy, A. Manipulating the Mouse Embryo: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 2013)

  44. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010)

  45. Kong, L. et al. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 35, W345–W349 (2007)

  46. Anders, S., Reyes, A. & Huber, W. Detecting differential usage of exons from RNA-seq data. Genome Res. 22, 2008–2017 (2012)

  47. Xie, W. et al. Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell 153, 1134–1148 (2013)

  48. Robinson, M. D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11, R25 (2010)

  49. O’Geen, H., Echipare, L. & Farnham, P. J. in Epigenetics Protocols 791, 265–286 (Humana, 2011)

  50. Gaspar-Maia, A. et al. Chd1 regulates open chromatin and pluripotency of embryonic stem cells. Nature 460, 863–868 (2009)

  51. Wang, T. et al. The histone demethylases Jhdm1a/1b enhance somatic cell reprogramming in a vitamin-C-dependent manner. Cell Stem Cell 9, 575–587 (2011)

  52. Roberts, A., Trapnell, C., Donaghey, J., Rinn, J. L. & Pachter, L. Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol. 12, R22 (2011)

  53. Feng, J., Liu, T., Qin, B., Zhang, Y. & Liu, X. S. Identifying ChIP-seq enrichment using MACS. Nature Protocols 7, 1728–1740 (2012)

  54. Hawkins, R. D. et al. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479–491 (2010)

  55. Shen, L., Shao, N., Liu, X. & Nestler, E. ngs.plot: Quick mining and visualization of next-generation sequencing data by integrating genomic databases. BMC Genomics 15, 284 (2014)

  56. Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011)

  57. Gauci, S. et al. Lys-N and trypsin cover complementary parts of the phosphoproteome in a refined SCX-based approach. Anal. Chem. 81, 4493–4501 (2009)

  58. Wollscheid, B. et al. Mass-spectrometric identification and relative quantification of N-linked cell surface glycoproteins. Nature Biotechnol. 27, 378–386 (2009)

  59. Kislinger, T. et al. PRISM, a generic large scale proteomic investigation strategy for mammals. Mol. Cell. Proteomics 2, 96–106 (2003)

Acknowledgements

We thank M. Gertsenstein and M. Pereira for chimaera production, C. Monetti for cell culture, R. Cowling for DNA purification, and K. Harpal for chimaera embryo sectioning and staining. We acknowledge the intellectual contributions of P. P. L. Tam and R. P. Harvey. A.N. is Tier 1 Canada Research Chair in Stem Cells and Regeneration. This work was supported by grants awarded to A.N., I.M.R. and P.W.Z. from the Ontario Research Fund Global Leadership Round in Genomics and Life Sciences grants (GL2-01-028), to A.N. from the Canadian stem cell network (9/5254 (TR3)) and from the Canadian Institutes of Health Research (CIHR MOP102575). This work received support from the Korean Ministry of Knowledge Economy (grant 10037410 to J.-S.S.), from the SNUCM Research Fund (grant 0411-20100074 to J.-S.S.), and from Macrogen Inc. (grant MGR03-11 and MGR03-12). The Stemformatics resource is supported by an Australian Research Council special research grant to Stem Cells Australia (C.A.W. and S.M.G.). The analysis of the miRNA was supported by grants from the National Health and Medical Research Council of Australia (1024852 to J.L.C. and T.P.) and the Australian Research Council (DP1300101928 to T.P.). W.R. is a Cancer Institute of NSW Fellow and with J.E.J.R. receives support from the Cancer Council of NSW and National Health & Medical Research Council (571156 and 1061906). J.E.J.R. receives funding from Cure the Future & Tour de Cure. K.-A.L.C. is supported, in part, by the Wound Management Innovation CRC (established and supported under the Australian Government’s Cooperative Research Centres Program). S.M.G. received support from the Australian Research Council (SR110001002). C.A.W. is a QLD Smart Futures Fellow. M.B., J.M. and A.J.R.H. are supported by the Netherlands Proteomics Centre, and by the European Community’s Seventh Framework Programme (FP7/2007-2013) by the PRIME-XS project grant agreement number 262067. P.W.Z. is the Canada Research Chair in Stem Cell Bioengineering. S.M.I.H. received a fellowship from the McEwen Centre of Regenerative Medicine.

Author information

Contributions

S.M.I.H., M.C.P., P.D.T. and A.N. conceived, designed and carried out most of the experiments, interpreted results and wrote the manuscript. P.W.Z. contributed to study design. T.P., C. A. Wells, I.M.R., P.W.Z., C. A. White, N.S., A.J.C. and J.C.M. assisted with data interpretation and manuscript writing. M.L., S.M.I.H. and M.C.P. performed ChIP. M.C.P., S.M.I.H., N.C., O.K., D.L.A.W., M.E.G. and S.M.G. produced and analysed RNA-seq data. S.M.I.H., D.-S.L., M.C.P., J.-Y.S., J.-I.K. and J.-S.S. produced and analysed MethylC-seq and ChIP-seq data. J.E.J.R., W.R. and R.Mi. performed the IR analysis, interpretation and contributed to the manuscript writing. C. A. Wells, R.Mo., O.K., K.-A.L.C. and J.C.M. provided support for bioinformatics analyses and data visualization. M.B., J.M. and A.J.R.H. performed the LC-MS analysis and proteomic data analysis. H.R.P. mapped the miRNA Next Generation Sequencing (NGS) data and provided support for bioinformatics analyses and data visualization. J.L.C. and T.P. analysed and interpreted the miRNA NGS data. C.A.W. performed the CSC proteomics. C.A.W., N.S. and P.W.Z. analysed CSC proteome data.

Corresponding author

Correspondence to Andras Nagy.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Effects of lowering doxycycline on reprogramming cells.

a, Frequency of doxycycline-independent pluripotent cells obtained when 1B secondary MEFs were reprogrammed in 1,500 ng ml−1 doxycycline until the indicated day. b, Morphology of cells at day 15 after lowering the doxycycline concentration from 1,500 ng ml−1 to levels as indicated on day 8 of reprogramming. c, Clonal efficiency measurement at day 15 of reprogramming after lowering the doxycycline concentration on day 8 to the level indicated. d, e, 1B secondary iPSCs show widespread contribution to all germ layers of chimaeric embryos. Whole-mount view (d) and transverse section of E10.5 diploid chimaera (e). Embryo is representative of n = 6 chimaeric embryos with strong (>75%) iPSC donor cell contribution. h, heart; hg, hindgut; nt, neural tube. Scale bars, 750 μm (d) and 400 µm (e). f, RNA-seq analysis of transgene and endogenous expression levels during reprogramming. CPM, counts per million.
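Panel f reports expression as counts per million (CPM). As a minimal sketch of that unit, assuming a plain library-size scaling with no further normalization (the gene names and counts are hypothetical, and tools such as edgeR may additionally apply normalization factors to the library size):

```python
def counts_per_million(counts):
    """Scale raw read counts so that each sample's counts sum to one million.

    `counts` maps a gene name to its raw read count in one sample.
    This is plain library-size scaling; it is an illustrative assumption,
    not necessarily the exact normalization used for the figure.
    """
    library_size = sum(counts.values())
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return {gene: count * 1_000_000 / library_size for gene, count in counts.items()}


# Hypothetical toy sample with 1,000 mapped reads in total.
toy_counts = {"transgene_Oct4": 250, "endogenous_Pou5f1": 750}
toy_cpm = counts_per_million(toy_counts)
```

Because CPM divides by the total number of counted reads, values are comparable between samples sequenced to different depths, which is what allows transgene and endogenous levels to be plotted on one axis across time points.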

Extended Data Figure 2 Locus-specific sequencing data.

Read coverage histograms representing gene expression and epigenetic status at the genomic loci of selected ESC-associated genes.

Extended Data Figure 3 Hierarchical clustering and principal component analysis (PCA) for multi-omics analyses.

a, Pearson correlation complete linkage hierarchical clustering of long RNA-seq data set. Colour coding indicates the grouping of samples based on clustering. b–d, PCA performed on each platform (10 neighbours for k-nearest neighbour (KNN) imputation). Short RNA-seq platform PCA was performed on miRNAs (b). Long RNA-seq platform PCA was performed on protein-coding transcripts (b). Cell surface proteome PCA represents proteins detected by cell surface focused mass spectrometry analysis (b). c, PCA of global CpG methylation analysis. Red arrow follows the high-doxycycline sample trajectory; black dashed arrow follows D8H through low-doxycycline trajectory. Low-doxycycline samples D21L and D21 are highlighted in blue to indicate that compared to other platforms they do not project with ESC/iPSC (see text for further details). d, H3K4me3, H3K36me3 and H3K27me3 PCAs represent genome-wide enriched regions at annotated genes.
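The clustering named in panel a can be illustrated with a small pure-Python sketch. It assumes that the distance between two samples is 1 minus their Pearson correlation and that clusters are merged by complete linkage (the largest pairwise distance between members); the sample names and expression values are hypothetical, and the legend does not state which software or distance transform was actually used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("profiles must not be constant")
    return cov / (sd_x * sd_y)


def complete_linkage(profiles):
    """Agglomerative clustering with distance 1 - r and complete linkage.

    Returns the merges in order as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of sample names.
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: the farthest pair defines the cluster distance.
                d = max(dist[frozenset((a, b))]
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical expression profiles (four genes per sample).
profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [8.0, 6.0, 4.5, 2.0],
}
merges = complete_linkage(profiles)
```

In this toy input, samples A and B rise together and C and D fall together, so the two pairs merge first and are joined only at a distance close to 2, the maximum possible for anticorrelated profiles.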

Extended Data Figure 4 Integration of gene expression data from 1B reprogramming and other transcriptome data sets.

a, Distribution of the entropy score of protein-coding gene expression for individual samples (blue) and sample groups (red) indicated as probability density curve. b, Pearson correlation analysis of 1B secondary reprogramming sample protein-coding gene expression with transcriptomes of early embryonic stages and epiblast stem cells (EpiSCs) derived from a range of developmental stages (ref. 20). c, Pearson correlation analysis of 1B secondary reprogramming sample protein-coding gene expression with transcriptome of sorted secondary reprogramming intermediates (ref. 8). d, Expression of CD44 and Icam1 markers during 1B reprogramming. Error bars represent standard error of the mean. e, Pearson correlation analysis of 1B reprogramming sample protein-coding gene expression with sorted reprogramming and pluripotent cells from the Col1a1 primary reprogramming system (ref. 6).
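The entropy score in panel a can be sketched as follows. The legend does not give the formula, so this assumes one common formulation in the spirit of the Shannon-entropy measure of ref. 18: a gene's expression across samples is converted to relative proportions, and the entropy of those proportions is computed in bits. The input values are hypothetical.

```python
import math


def shannon_entropy(expression):
    """Shannon entropy (in bits) of one gene's expression across samples.

    High entropy means the gene is expressed evenly across samples;
    low entropy means expression is concentrated in a few samples.
    This formulation is an assumption made for illustration.
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("expression profile must have a positive sum")
    entropy = 0.0
    for value in expression:
        if value > 0:  # terms with zero expression contribute nothing
            p = value / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical profiles across four samples.
uniform = shannon_entropy([5.0, 5.0, 5.0, 5.0])    # evenly expressed gene
specific = shannon_entropy([20.0, 0.0, 0.0, 0.0])  # stage-specific gene
```

With four samples the score ranges from 0 bits (expression confined to one sample) to 2 bits (identical expression everywhere), so a density curve of per-gene scores summarizes how stage-specific the transcriptome is.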

Extended Data Figure 5 Effect of Oct4, Sox2, Klf4 and Myc expression level on reprogramming outcomes.

a, Pearson correlation analysis of RNA-seq data from 1B reprogramming samples and reprogramming clones from ref. 7 that are competent or incompetent to become factor-independent secondary iPSC (SC and SI clones, respectively). b, Transgene and endogenous gene expression determined by RNA-seq for Myc, Pou5f1 (Oct4), Sox2 and Klf4 in SC and SI clones (ref. 7). Bar graphs represent average expression of doxycycline-treated samples or SC iPSCs. Error bars represent standard error of the mean. Student’s t-test was used for statistics. c, PCA of protein-coding stage-specific genes from Fig. 2c, comparing 1B reprogramming samples and secondary reprogramming clones from ref. 7. F-class cells cluster separately from SI and SC clones. Moreover, 1B reprogramming follows a different trajectory than SI and SC clones towards iPSCs. Colour coding indicates the grouping of samples. d, Pearson correlation complete linkage hierarchical clustering of 1B reprogramming samples and SI and SC secondary reprogramming clones. Clustering was performed on protein-coding stage-specific genes and based on FPKM values normalized to the averaged ESC/iPSCs values from the respective study. Heat maps show stage-specific protein-coding gene expression belonging to iPSC/ESC (top heat map) and F-class (bottom heat map) genes. Clusters and genes on the right of each heat map highlight genes that show a different expression pattern between F-class and doxycycline-treated SI clones. For gene lists associated with d, refer to Supplementary Table 1.

Extended Data Figure 6 Global analysis of histone mark and intron retention changes during reprogramming.

a, Intensity plots of genes associated with H3K4me3 (green) and H3K27me3 (red) ±10 kb of annotated TSSs. b, Heat map representation of PRC2 components and histone demethylase expression at the RNA (RNA-seq) and protein level. c, Correlation of gene transcription with protein and intron retention for genes that exhibit intron retention from Fig. 2c. d, Correlation of intron retention, RNA expression and protein level for Kdm6a. e, Violin plots comparing observed and random Pearson correlations of intron retention versus gene FPKM at reprogramming stages. Bars represent average Pearson correlation coefficients. Error bars represent standard error of the mean. Student’s t-test was used for statistics. f, Number of expressed transposable elements during reprogramming.

Extended Data Figure 7 Tracking secondary MEF histone mark changes during reprogramming from one sample to another.

a, Pie-chart diagram tracking the histone mark changes using secondary MEF and secondary iPSCs as reference points. Each histone mark is colour coded: H3K4me3, green; H3K4me3/H3K27me3, orange; H3K27me3, red; no mark, grey. Loci were tracked from their start (2°MEF) and end (2°iPSCs) histone signatures. b–g, Tracking bar graphs of histone mark changes. The histone mark change is shown at the top of each set of 12 histograms. Bars represent number of genes whose mark changed for the time point indicated at the top of the individual histogram, and which of these genes carry the same mark at the other time points (x axis). For example, in b ‘2°MEF (H3K4me3/H3K27me3→H3K4me3)’ the histogram shows the number of genes that were bivalent in secondary MEFs but changed to H3K4me3 monovalent at another time point. In the case of the small histogram labelled D2H, the black-framed green bar represents the number of loci that showed this change from secondary MEFs at D2H and the bars for all the other samples indicate how many of these D2H loci were also H3K4me3+ in the other samples.

Extended Data Figure 8 Determining expression threshold for defining bivalent loci and bivalency in other reprogramming systems.

a, RNA-seq expression value (log2 of FPKM) distribution (as represented by density curves) of four categories of genes: (1) genes marked by H3K4me3 and H3K36me3 (blue line); (2) genes marked by H3K4me3 alone (green line); (3) genes marked by H3K27me3 alone (red line); and (4) genes marked by H3K4me3 and H3K27me3, but not H3K36me3 (orange line). Genes were combined from all the samples to identify each category. Expression threshold was defined as the 10th percentile expression boundary of genes marked by H3K4me3 and H3K36me3. Genes that were expressed at lower levels than this threshold were considered not expressed in subsequent analyses. b, Assessment of cellular heterogeneity in 1B reprogramming by chromatin mark and expression association of two cell surface markers: CD24 and CD73. Upper scatter plots show H3K27me3 versus H3K36me3 enrichment in individual samples. Lower plot shows percentage of cells expressing each marker for same samples as determined by FACS analysis. Active locus: H3K4me3+H3K36me3+H3K27me3−. Heterogeneous locus: H3K4me3+H3K36me3+H3K27me3+. c, Absolute number (primary y axis) and proportion (secondary y axis) of false (heterogeneous) bivalent loci during secondary reprogramming. The presence of H3K36me3 distinguishes false bivalent loci (H3K4me3+H3K27me3+H3K36me3+) that represent heterogeneity from true bivalent loci that are transcriptionally repressed (H3K36me3−). d, Tracking of histone mark status of secondary MEF heterogeneous loci. Heterogeneous loci resolve into silent and active loci during reprogramming. e, Total number of detected bivalent loci as defined by lack of H3K36me3 mark and expression levels below the threshold as shown in panel a. Dark and light green bar graphs highlight proportion shared among all samples and with secondary MEFs, respectively. f, Sequential addition of novel bivalent marks with respect to stages of reprogramming, as indicated by colours. g, h, Corresponding bivalent loci identified in 1B samples and two independent data sets (refs 6, 31). i, Tracking of bivalent loci for Polo et al. reprogramming system (ref. 6). For gene lists related to e, refer to Supplementary Table 2.
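The threshold in panel a and the locus categories in panels b, c and e can be sketched together. This is an illustration under stated assumptions: the percentile uses linear interpolation between ranks, marks are given as booleans (peak called or not), the label "other" and all function names are hypothetical, and the expression values are toy data rather than results from the study.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def classify_locus(k4me3, k27me3, k36me3, log2_fpkm, threshold):
    """Assign a locus to one of the categories described in the legend."""
    if k4me3 and k27me3 and k36me3:
        # All three marks together: interpreted as a mixed cell population.
        return "heterogeneous"
    if k4me3 and k27me3 and not k36me3 and log2_fpkm < threshold:
        # True bivalent: both marks, no H3K36me3, not expressed.
        return "bivalent"
    if k4me3 and k36me3 and not k27me3:
        return "active"
    return "other"  # hypothetical catch-all label, not from the legend


# Toy log2(FPKM) values for genes carrying H3K4me3 and H3K36me3.
active_gene_expression = [float(v) for v in range(11)]  # 0.0, 1.0, ..., 10.0
threshold = percentile(active_gene_expression, 10)

labels = [
    classify_locus(True, True, False, -2.0, threshold),
    classify_locus(True, True, True, 3.0, threshold),
    classify_locus(True, False, True, 4.0, threshold),
    classify_locus(True, True, False, 2.0, threshold),
]
```

Anchoring the threshold to genes that carry both active marks ties the definition of "not expressed" to the data themselves, and requiring the absence of H3K36me3 keeps loci that are active in only part of the population from being counted as bivalent.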

Extended Data Figure 9 Long non-coding RNA expression analysis.

a, Determination of expression threshold for lncRNA genes using H3K4me3 and H3K36me3 chromatin mark. b, Distribution of the entropy of non-coding gene expression for individual samples (blue) and sample groups (red) indicated as probability density curve. c, Percentage of unannotated transcripts with listed genomic features. d, Analysis of unannotated lncRNA transcripts for coding potential using coding potential calculator (CPC). (See Supplementary Information for details.) e, RNA and protein expression profiles of three novel coding transcripts.

Extended Data Figure 10 Comparison of lncRNA expression in 1B secondary reprogramming and other reprogramming systems.

a, Pearson correlation analysis of differentially expressed unannotated RNA transcripts for 1B reprogramming samples and secondary reprogramming clones that are competent or incompetent to become factor-independent secondary iPSCs (SC and SI clones, respectively; ref. 7). b, Pearson correlation analysis of differentially expressed unannotated RNA transcripts for 1B reprogramming samples and sorted reprogramming intermediates from ref. 8. c, Heat map of differentially expressed novel RNAs from 1B reprogramming samples with secondary reprogramming clones that are competent or incompetent to become factor-independent secondary iPSCs (SC and SI clones, respectively; ref. 7). For gene lists related to c, refer to Supplementary Table 4. d, Read coverage histograms representing gene expression and epigenetic status of unannotated lncRNAs observed in F-class (D16H) and ESC-like state (secondary iPSCs). e, GO analysis results for genes downregulated in F-class state (FDR <1%), but unchanged in ESC-like state, from D8H (combined groups 3, 6 and 9). f, GO analysis results for genes upregulated in ESC-like state (FDR <1%), but unchanged in F-class state, from D8H (combined groups 1b, 4b and 7b). For gene lists, full GO term analyses and P values associated with e, f refer to Supplementary Table 5.

Supplementary information

Supplementary Data

This zipped file contains Supplementary Tables 1-6. (ZIP 3101 kb)

About this article

Cite this article

Hussein, S., Puri, M., Tonge, P. et al. Genome-wide characterization of the routes to pluripotency. Nature 516, 198–206 (2014). https://doi.org/10.1038/nature14046

  • DOI: https://doi.org/10.1038/nature14046
