Abstract
Germline–soma segregation is a fundamental event during mammalian embryonic development. Here we establish the epigenetic principles of human primordial germ cell (hPGC) development using in vivo hPGCs and stem cell models recapitulating gastrulation. We show that morphogen-induced remodelling of mesendoderm enhancers transiently confers the competence for hPGC fate, but further activation favours mesoderm and endoderm fates. Consistently, reducing the expression of the mesendodermal transcription factor OTX2 promotes the PGC fate. In hPGCs, SOX17 and TFAP2C initiate activation of enhancers to establish a core germline programme, including the transcriptional repressor PRDM1 and pluripotency factors POU5F1 and NANOG. We demonstrate that SOX17 enhancers are the critical components in the regulatory circuitry of germline competence. Furthermore, activation of upstream cis-regulatory elements by an optimized CRISPR activation system is sufficient for hPGC specification. We reveal an enhancer-linked germline transcription factor network that provides the basis for the evolutionary divergence of mammalian germlines.
Data availability
ChIP–seq and RNA-seq datasets are available on NCBI GEO (GSE159654). Single-cell sequencing datasets are available on ArrayExpress (E-MTAB-11135). Previously published data that were re-analysed here are: hPGC RNA-seq (GSE60138), TF KO RNA-seq (GSE99350), TFAP2C ChIP–seq (GSE140021) and OTX2 ChIP–seq (GSE61475). Genome databases used are: UCSC GRCh38/hg38, Ensembl GRCh38 v90 and Gencode Human Release 30. All other data supporting the findings of this study are available from the corresponding authors on reasonable request. Source data are provided with this paper.
References
Seydoux, G. & Braun, R. E. Pathway to totipotency: lessons from germ cells. Cell 127, 891–904 (2006).
Tang, W. W., Kobayashi, T., Irie, N., Dietmann, S. & Surani, M. A. Specification and epigenetic programming of the human germ line. Nat. Rev. Genet. 17, 585–600 (2016).
Saitou, M. & Miyauchi, H. Gametogenesis from pluripotent stem cells. Cell Stem Cell 18, 721–735 (2016).
Kobayashi, T. & Surani, M. A. On the origin of the human germline. Development 145, dev150433 (2018).
Senft, A. D., Bikoff, E. K., Robertson, E. J. & Costello, I. Genetic dissection of Nodal and Bmp signalling requirements during primordial germ cell development in mouse. Nat. Commun. 10, 1089 (2019).
Ohinata, Y. et al. A signaling principle for the specification of the germ cell lineage in mice. Cell 137, 571–584 (2009).
Kurimoto, K. et al. Quantitative dynamics of chromatin remodeling during germ cell specification from mouse embryonic stem cells. Cell Stem Cell 16, 517–532 (2015).
Magnusdottir, E. et al. A tripartite transcription factor network regulates primordial germ cell specification in mice. Nat. Cell Biol. 15, 905–915 (2013).
Kobayashi, T. et al. Principles of early human development and germ cell program from conserved model systems. Nature 546, 416–420 (2017).
Sasaki, K. et al. The germ cell fate of cynomolgus monkeys is specified in the nascent amnion. Dev. Cell 39, 169–185 (2016).
Lawson, K. A. et al. Bmp4 is required for the generation of primordial germ cells in the mouse embryo. Genes Dev. 13, 424–436 (1999).
Ohinata, Y. et al. Blimp1 is a critical determinant of the germ cell lineage in mice. Nature 436, 207–213 (2005).
Tyser, R. C. V. et al. Single-cell transcriptomic characterization of a gastrulating human embryo. Nature 600, 285–289 (2021).
D’Amour, K. A. et al. Efficient differentiation of human embryonic stem cells to definitive endoderm. Nat. Biotechnol. 23, 1534–1541 (2005).
Loh, K. M. et al. Efficient endoderm induction from human pluripotent stem cells by logically directing signals controlling lineage bifurcations. Cell Stem Cell 14, 237–252 (2014).
Irie, N. et al. SOX17 is a critical specifier of human primordial germ cell fate. Cell 160, 253–268 (2015).
Nakaki, F. et al. Induction of mouse germ-cell fate by transcription factors in vitro. Nature 501, 222–226 (2013).
Hara, K. et al. Evidence for crucial role of hindgut expansion in directing proper migration of primordial germ cells in mouse early embryogenesis. Dev. Biol. 330, 427–439 (2009).
Chen, D. et al. The TFAP2C-regulated OCT4 naive enhancer is involved in human germline formation. Cell Rep. 25, 3591–3602 (2018).
Kojima, Y. et al. Evolutionarily distinctive transcriptional and signaling programs drive human germ cell lineage specification from pluripotent stem cells. Cell Stem Cell 21, 517–532 (2017).
Sasaki, K. et al. Robust in vitro induction of human germ cell fate from pluripotent stem cells. Cell Stem Cell 17, 178–194 (2015).
de Jong, J. et al. Differential expression of SOX17 and SOX2 in germ cells and stem cells has biological and clinical implications. J. Pathol. 215, 21–30 (2008).
Hoei-Hansen, C. E. et al. Transcription factor AP-2γ is a developmentally regulated marker of testicular carcinoma in situ and germ cell tumors. Clin. Cancer Res. 10, 8521–8530 (2004).
Eckert, D. et al. Expression of BLIMP1/PRMT5 and concurrent histone H2A/H4 arginine 3 dimethylation in fetal germ cells, CIS/IGCNU and germ cell tumors. BMC Dev. Biol. 8, 106 (2008).
Tang, W. W. et al. A unique gene regulatory network resets the human germline epigenome for development. Cell 161, 1453–1467 (2015).
Li, L. et al. Single-cell RNA-seq analysis maps development of human germline cells and gonadal niche interactions. Cell Stem Cell 20, 858–873 (2017).
Corces, M. R. et al. An improved ATAC–seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).
Brind’Amour, J. et al. An ultra-low-input native ChIP–seq protocol for genome-wide profiling of rare cell populations. Nat. Commun. 6, 6033 (2015).
Takashima, Y. et al. Resetting transcription factor control circuitry toward ground-state pluripotency in human. Cell 158, 1254–1269 (2014).
Pontis, J. et al. Hominoid-specific transposable elements and KZFPs facilitate human embryonic genome activation and control transcription in naive human ESCs. Cell Stem Cell 24, 724–735 (2019).
Kim, T. K. & Shiekhattar, R. Architectural and functional commonalities between enhancers and promoters. Cell 162, 948–959 (2015).
Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).
Rada-Iglesias, A. et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011).
Calo, E. & Wysocka, J. Modification of enhancer chromatin: what, how, and why? Mol. Cell 49, 825–837 (2013).
Zentner, G. E., Tesar, P. J. & Scacheri, P. C. Epigenetic signatures distinguish multiple classes of enhancers with distinct cellular functions. Genome Res. 21, 1273–1283 (2011).
Creyghton, M. P. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl Acad. Sci. USA 107, 21931–21936 (2010).
Tu, S. et al. Co-repressor CBFA2T2 regulates pluripotency and germline development. Nature 534, 387–390 (2016).
Nady, N. et al. ETO family protein Mtgr1 mediates Prdm14 functions in stem cell maintenance and primordial germ cell formation. eLife 4, e10150 (2015).
Gomes Fernandes, M., Bialecka, M., Salvatori, D. C. F. & Chuva de Sousa Lopes, S. M. Characterization of migratory primordial germ cells in the aorta-gonad-mesonephros of a 4.5-week-old human embryo: a toolbox to evaluate in vitro early gametogenesis. Mol. Hum. Reprod. 24, 233–243 (2018).
Sybirna, A. et al. A critical role of PRDM14 in human primordial germ cell fate revealed by inducible degrons. Nat. Commun. 11, 1282 (2020).
Seguin, C. A., Draper, J. S., Nagy, A. & Rossant, J. Establishment of endoderm progenitors by SOX transcription factor expression in human embryonic stem cells. Cell Stem Cell 3, 182–195 (2008).
Cheneby, J. et al. ReMap 2020: a database of regulatory regions from an integrative analysis of human and Arabidopsis DNA-binding sequencing experiments. Nucleic Acids Res. 48, D180–D188 (2020).
Aksoy, I. et al. Oct4 switches partnering from Sox2 to Sox17 to reinterpret the enhancer code and specify endoderm. EMBO J. 32, 938–953 (2013).
Rao, J. et al. Stepwise clearance of repressive roadblocks drives cardiac induction in human ESCs. Cell Stem Cell 18, 341–353 (2016).
Jostes, S. V. et al. Unique and redundant roles of SOX2 and SOX17 in regulating the germ cell tumor fate. Int. J. Cancer 146, 1592–1605 (2020).
Chen, D. et al. Human primordial germ cells are specified from lineage-primed progenitors. Cell Rep. 29, 4568–4582 (2019).
Rothstein, M. & Simoes-Costa, M. Heterodimerization of TFAP2 pioneer factors drives epigenomic remodeling during neural crest specification. Genome Res. 30, 35–48 (2020).
Pastor, W. A. et al. TFAP2C regulates transcription in human naive pluripotency by opening enhancers. Nat. Cell Biol. 20, 553–564 (2018).
Eguizabal, C. et al. Characterization of the epigenetic changes during human gonadal primordial germ cells reprogramming. Stem Cells 34, 2418–2428 (2016).
Alver, B. H. et al. The SWI/SNF chromatin remodelling complex is required for maintenance of lineage specific enhancers. Nat. Commun. 8, 14648 (2017).
Nishioka, N. et al. The Hippo signaling pathway components Lats and Yap pattern Tead4 activity to distinguish mouse trophectoderm from inner cell mass. Dev. Cell 16, 398–410 (2009).
Yagi, R. et al. Transcription factor TEAD4 specifies the trophectoderm lineage at the beginning of mammalian development. Development 134, 3827–3836 (2007).
Beyer, T. A. et al. Switch enhancers interpret TGF-β and Hippo signaling to control cell fate in human embryonic stem cells. Cell Rep. 5, 1611–1624 (2013).
Tanenbaum, M. E., Gilbert, L. A., Qi, L. S., Weissman, J. S. & Vale, R. D. A protein-tagging system for signal amplification in gene expression and fluorescence imaging. Cell 159, 635–646 (2014).
Morita, S. et al. Targeted DNA demethylation in vivo using dCas9-peptide repeat and scFv-TET1 catalytic domain fusions. Nat. Biotechnol. 34, 1060–1065 (2016).
Chen, B. et al. Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell 155, 1479–1491 (2013).
Sun, D. et al. A functional genetic toolbox for human tissue-derived organoids. eLife 10, e67886 (2021).
Chen, D. et al. Germline competency of human embryonic stem cells depends on eomesodermin. Biol. Reprod. 97, 850–861 (2017).
Bahrami, S. & Drablos, F. Gene regulation in the immediate-early response process. Adv. Biol. Regul. 62, 37–49 (2016).
Funa, N. S. et al. β-Catenin regulates primitive streak induction through collaborative interactions with SMAD2/SMAD3 and OCT4. Cell Stem Cell 16, 639–652 (2015).
Yu, P., Pan, G., Yu, J. & Thomson, J. A. FGF2 sustains NANOG and switches the outcome of BMP4-induced human embryonic stem cell differentiation. Cell Stem Cell 8, 326–334 (2011).
Su, Z. et al. Antagonism between the transcription factors NANOG and OTX2 specifies rostral or caudal cell fate during neural patterning transition. J. Biol. Chem. 293, 4445–4455 (2018).
Boroviak, T. et al. Single cell transcriptome analysis of human, marmoset and mouse embryos reveals common and divergent features of preimplantation development. Development 145, dev167833 (2018).
Tsankov, A. M. et al. Transcription factor binding dynamics during human ES cell differentiation. Nature 518, 344–349 (2015).
Zhang, J. et al. OTX2 restricts entry to the mouse germline. Nature 562, 595–599 (2018).
Hayashi, K., Ohta, H., Kurimoto, K., Aramaki, S. & Saitou, M. Reconstitution of the mouse germ cell specification pathway in culture by pluripotent stem cells. Cell 146, 519–532 (2011).
Faial, T. et al. Brachyury and SMAD signalling collaboratively orchestrate distinct mesoderm and endoderm gene regulatory networks in differentiating human embryonic stem cells. Development 142, 2121–2135 (2015).
Teo, A. K. et al. Pluripotency factors regulate definitive endoderm specification through eomesodermin. Genes Dev. 25, 238–250 (2011).
Pauklin, S. & Vallier, L. The cell-cycle state of stem cells determines cell fate propensity. Cell 155, 135–147 (2013).
Nettersheim, D. et al. The cancer/testis-antigen PRAME supports the pluripotency network and represses somatic and germ cell differentiation programs in seminomas. Br. J. Cancer 115, 454–464 (2016).
Gunne-Braden, A. et al. GATA3 mediates a fast, irreversible commitment to BMP4-driven differentiation in human embryonic stem cells. Cell Stem Cell 26, 693–706 (2020).
Wang, S., Sengel, C., Emerson, M. M. & Cepko, C. L. A gene regulatory network controls the binary fate decision of rod and bipolar cells in the vertebrate retina. Dev. Cell 30, 513–527 (2014).
Kanai-Azuma, M. et al. Depletion of definitive gut endoderm in Sox17-null mutant mice. Development 129, 2367–2379 (2002).
Vincent, S. D. et al. The zinc finger transcriptional repressor Blimp1/Prdm1 is dispensable for early axis formation but is required for specification of primordial germ cells in the mouse. Development 132, 1315–1325 (2005).
Yamaji, M. et al. Critical function of Prdm14 for the establishment of the germ cell lineage in mice. Nat. Genet. 40, 1016–1022 (2008).
Yamashiro, C. et al. Generation of human oogonia from induced pluripotent stem cells in vitro. Science 362, 356–360 (2018).
Yamashiro, C., Sasaki, K., Yokobayashi, S., Kojima, Y. & Saitou, M. Generation of human oogonia from induced pluripotent stem cells in culture. Nat. Protoc. 15, 1560–1583 (2020).
Alberio, R., Kobayashi, T. & Surani, M. A. Conserved features of non-primate bilaminar disc embryos and the germline. Stem Cell Rep. 16, 1078–1092 (2021).
Kobayashi, T. et al. Tracing the emergence of primordial germ cells from bilaminar disc rabbit embryos and pluripotent stem cells. Cell Rep. 37, 109812 (2021).
Zheng, Y. et al. Controlled modelling of human epiblast and amnion development using stem cells. Nature 573, 421–425 (2019).
Huang da, W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
Bryja, J. & Konecny, A. Fast sex identification in wild mammals using PCR amplification of the Sry gene. Folia Zool. 52, 269–274 (2003).
FastQC (Babraham Bioinformatics, 2010).
Trim Galore (Babraham Bioinformatics, 2012).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Ramirez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
Gorkin, D. U. et al. An atlas of dynamic chromatin landscapes in mouse fetal development. Nature 583, 744–751 (2020).
Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Neph, S. et al. BEDOPS: high-performance genomic feature operations. Bioinformatics 28, 1919–1920 (2012).
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
Wang, S. et al. Target analysis by integration of transcriptome and ChIP–seq data with BETA. Nat. Protoc. 8, 2502–2515 (2013).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Zerbino, D. R., Johnson, N., Juettemann, T., Wilder, S. P. & Flicek, P. WiggleTools: parallel processing of large collections of genome-wide datasets for visualization and statistical analysis. Bioinformatics 30, 1008–1009 (2014).
Horlbeck, M. A. et al. Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation. eLife 5, e19760 (2016).
Tischler, J. et al. Metabolic regulation of pluripotency and germ cell fate through α-ketoglutarate. EMBO J. 38, e99518 (2019).
Landt, S. G. et al. ChIP–seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 22, 1813–1831 (2012).
Acknowledgements
M.A.S. was supported by Wellcome Investigator Awards in Science (209475/Z/17/Z and 096738/Z/11/Z), an MRC research grant (RG85305) and a BBSRC research grant (G103986). W.W.C.T. received a Croucher Postdoctoral Research Fellowship and was supported by the Isaac Newton Trust. A.C.-V. was supported by the Wellcome 4-Year PhD Programme in Stem Cell Biology and Medicine and the Cambridge Commonwealth European and International Trust (203831/Z/16/Z). W.H.G. was supported by a BBSRC research grant (G103986). T.K. and M.A.S. were supported by Butterfield Awards of Great Britain Sasakawa Foundation. T.K. was supported by the Astellas Foundation for Research on Metabolic Disorders. D.S. was supported by a Wellcome Trust PhD studentship (109146/Z/15/Z) and the Department of Pathology, University of Cambridge. N.I. was supported by an MRC research grant (RG85305). We thank R. Barker and X. He for providing human embryonic tissues, and C. Bradshaw for bioinformatic support. We also thank The Weizmann Institute of Science for the WIS2 hESC line and the Genomics Core Facility of CRUK Cambridge Institute for sequencing services, and R. Alberio and members of the Surani lab for insightful comments and critical reading of the manuscript.
Author information
Contributions
M.A.S. and W.W.C.T. conceived the study. W.W.C.T. designed experiments, collected human embryonic tissues and performed bioinformatic analysis. A.C.-V. designed experiments, performed cell culture, cloning and luciferase assay and collected in vitro samples. W.W.C.T. and W.H.G. optimized and generated ATAC–seq and ULI-NChIP–seq libraries. T.K. and N.I. generated TF ChIP–seq libraries. T.K. and A.C.-V. generated RNA-seq libraries. A.C.-V. generated the scRNA-seq libraries. A.C.-V., M.D.M. and C.A.P. analysed the scRNA-seq data. W.W.C.T., A.C.-V., T.K. and D.S. designed and performed CRISPRa and interference assay. W.W.C.T., M.A.S., A.C.-V. and W.H.G. wrote the manuscript with inputs from all authors.
Ethics declarations
Competing interests
W.W.C.T. is currently employed by Adrestia Therapeutics Ltd. The other authors declare no competing interests.
Peer review
Peer review information
Nature Cell Biology thanks Hiromitsu Nakauchi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Sample collection and overview of transcriptomic and epigenomic data.
a, Fluorescence-activated cell sorting (FACS) pseudocolor plots showing the cell populations collected for transcriptomic and epigenomic analysis (red gates). b, Table showing the hESC-derived cell types and the number of cells used to generate RNA-seq, ATAC-seq, and ChIP-seq libraries. c, Table showing details of the human embryos used for hPGC isolation and the number of hPGCs used to generate RNA-seq, ATAC-seq, and ChIP-seq libraries. Asterisks indicate RNA-seq samples published in a previous study (ref. 16). d, Heatmaps showing Spearman’s correlation coefficient of gene expression, ATAC-seq, H3K4me1, H3K4me3, H3K27ac, and H3K27me3 ChIP-seq signals in biological replicates. For RNA expression, correlation was based on the log2(normalized counts) of the top 25% most variable protein coding genes and lincRNA. For ATAC-seq, signals (log2(normalized counts)) at combined peaks across the 6 cell types were used. For H3K4me1, H3K4me3, H3K27ac, and H3K27me3 ChIP-seq, signals (log2(normalized counts)) at 1 kb bins of combined peaks were used. The samples were clustered using (1 - Spearman’s correlation coefficient) as the distance measure (Ward’s method with optimal tree ordering). See Methods. e, Distance distribution between the summit of ATAC peaks or the centre of histone modification peaks and the closest TSS. Shown are all peaks and dynamic peaks with differential chromatin signals in the sample pairs shown in Extended Data Fig. 2d (log2(signal fold change) >1 and adjusted p-value <0.05). Note that most of the dynamic ATAC, H3K4me1, H3K4me3, H3K27ac and H3K27me3 peaks were >2 kb away (dotted line) from the TSS.
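The distance measure described in panel d (1 - Spearman’s correlation coefficient between samples) can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: the sample names and signal values are made-up toy numbers, the helper names are hypothetical, and the Ward linkage with optimal tree ordering that would follow is left to a dedicated clustering library.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_distance_matrix(samples):
    """Pairwise (1 - Spearman) distances between named signal vectors."""
    names = list(samples)
    return {a: {b: 1 - spearman(samples[a], samples[b]) for b in names}
            for a in names}


# Toy log2(normalized counts) for five features; values are illustrative only.
toy = {
    "hESC_rep1": [5.1, 2.0, 7.3, 0.4, 3.3],
    "hESC_rep2": [5.0, 2.2, 7.0, 0.5, 3.1],
    "DE_rep1": [0.3, 6.8, 1.1, 4.9, 3.0],
}
dist = spearman_distance_matrix(toy)
```

In this toy input the two hESC replicates rank the features identically, so their distance is zero, while the DE sample ranks them nearly in reverse and sits far from both.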
Extended Data Fig. 2 Characterisation of dynamically active enhancers.
a, Chromatin profile heatmaps of ATAC, H3K4me3, H3K4me1, H3K27ac, H3K27me3 and input signals in hPGCLCs at ATAC-seq summit ± 3 kb. Segregation of ATAC-seq summits by K-means clustering using normalised chromatin mark signals. b, Distribution of chromatin mark signals at active, mixed, primed, poised, repressed, and neutral enhancers in hPGCLCs (see Fig. 2a). Enhancers per violin/box plot: 23255 active, 1288 mixed, 36999 primed, 5648 poised, 3984 repressed, 79290 neutral. Box plots depict the median, lower and upper hinges correspond to the 25th and 75th percentiles. c, Enhancer state transitions of DE-active enhancers. Distal OCRs not overlapping any histone modification peak in the analysed cell types were referred to as ‘neutral’ enhancers. d, Putative enhancers with differential H3K27ac levels (absolute(log2(fold change)) >1 and adjusted p-value < 0.05) in the indicated sample pairs. e, High confidence enhancer-gene associations. Putative enhancers were assigned to the nearest TSS. The relevance of the enhancer-gene pair was assessed by the Kendall’s rank correlation analysis between the enhancer H3K27ac signals and the expression levels of the associated genes across the 6 cell types and 2 replicates (see Methods). In the simplified model shown, the Enh1-geneA and Enh3-geneB pairs (green text and arrows) were identified as high confidence associations based on positive correlation between H3K27ac and gene expression levels. f, Expression levels of genes associated with active enhancers in different cell types. Compared to simply associating genes to the nearest active enhancer, high confidence active enhancer associated genes (Kendall rank correlation coefficient >0.3; empirical p-value < 0.05) exhibited significantly higher expression in all cell types. Two-sided Wilcoxon rank sum test with FDR correction. 
Gene number per violin/box plot (all nearest, high confidence): hESC (8686, 1279), PreME (9782, 1413), ME (11494, 1853), DE (12279, 2224), hPGCLC (9954, 1726), hPGC (6612, 1024). Box plot organisation as in Extended Data Fig. 2b. g, Distribution of ATAC, H3K27ac and H3K27me3 signals in dynamically active enhancers and high confidence target gene expression. Enhancers were segregated into nine clusters (Fig. 2c). Enhancers per cluster as in Fig. 2c. Box plot organisation as in Extended Data Fig. 2b.
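The enhancer-gene association test in panels e and f (Kendall rank correlation between enhancer H3K27ac and gene expression across the 6 cell types and 2 replicates, with coefficient >0.3 and empirical p-value <0.05) can be illustrated with the sketch below. The legend defers the details to Methods, so two points here are assumptions: the tau-b variant of Kendall's coefficient, and a label-permutation scheme for the empirical p-value. The signal values are made-up toy numbers and the function names are hypothetical.

```python
import math
import random
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects the denominator for ties."""
    conc = disc = only_x_tied = only_y_tied = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both variables: contributes to neither term
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            conc += 1
        else:
            disc += 1
    # Pairs untied in x times pairs untied in y.
    denom = math.sqrt((conc + disc + only_y_tied) * (conc + disc + only_x_tied))
    return (conc - disc) / denom if denom else 0.0


def empirical_p(x, y, n_perm=999, seed=0):
    """One-sided permutation p-value (assumed scheme, not from the source)."""
    rng = random.Random(seed)
    observed = kendall_tau_b(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if kendall_tau_b(x, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def is_high_confidence(h3k27ac, expression, tau_min=0.3, alpha=0.05):
    """Apply the two thresholds quoted in the legend."""
    tau = kendall_tau_b(h3k27ac, expression)
    return tau > tau_min and empirical_p(h3k27ac, expression) < alpha


# Twelve toy samples (6 cell types x 2 replicates); values are illustrative.
h3k27ac = [0.2, 0.3, 0.9, 1.1, 2.0, 2.2, 3.5, 3.1, 1.0, 1.2, 0.4, 0.5]
expression = [1.0, 1.2, 2.5, 2.7, 4.0, 4.4, 6.0, 5.8, 2.6, 2.9, 1.3, 1.5]
tau = kendall_tau_b(h3k27ac, expression)
high_conf = is_high_confidence(h3k27ac, expression)
```

A rank-based coefficient suits this comparison because H3K27ac signal and expression are on different scales and need only move together monotonically across samples.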
Extended Data Fig. 3 Characterisation of dynamically active and repressed promoters during hPGC development.
a, Promoter classification in hPGCLCs based on the intersection of histone modification peaks at promoter regions (TSS ± 1 kb). b, Receiver operating characteristic (ROC) curves of ATAC, H3K4me3, H3K4me1, H3K27ac, and H3K27me3 signals at promoter (TSS ± 1 kb) as predictors of gene activity in hPGCLCs. The top 1000 (top panel) or the bottom 1000 (bottom panel) expressed genes were used as positives. Promoters not overlapping any chromatin peak were excluded. Note that H3K27ac (area under the curve (AUC) of top genes = 0.770) and H3K27me3 (AUC of bottom genes = 0.681) were the best predictors of expressed and repressed genes, respectively. c, Distribution of chromatin mark signals at active, mixed, poised, repressed, and neutral promoters and the expression of their associated genes in hPGCLCs. Promoter number: 30261 active, 2833 mixed, 7089 poised, 1579 repressed, 19832 neutral. Associated gene number: 12629 active, 1526 mixed, 3662 poised, 1144 repressed, 13038 neutral. Box plot organisation as in Extended Data Fig. 2b. d, K-means clustering of dynamically repressed promoters into 7 clusters (C) by H3K27me3 signals. Dynamically repressed promoters were promoters that exhibited ‘mixed’, ‘poised’ or ‘repressed’ state (see Methods) in any cell type with differential H3K27me3 signals. e, Box plots showing expression levels of genes associated with the C6 dynamically repressed promoters in d. **** p-value < 0.00001 (Wilcoxon rank sum test adjusted by the Holm method) in the marked cell type against each unmarked cell type. Box plot organisation as in Extended Data Fig. 2b. f, Heatmaps showing the expression levels of representative genes associated with C6 in d. The right panel shows the representative enriched gene ontology terms. g, Dot plots showing the enrichment of transcription regulator (TR) binding sites in the dynamically repressed promoters in C6 in d. The top 20 enriched TRs (out of 1,135 in the ReMap2020 database; ref. 42) are shown. 
TRs were annotated against 5 gene ontology terms associated with repressive functions. The dot size represents promoter fraction overlapping with the TR binding sites. The dot colour indicates the expression levels of the enriched transcription regulators in hPGCs.
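The AUC values quoted in panel b summarise how well a promoter chromatin signal separates "positive" genes (for example, the top 1000 expressed) from the rest. A minimal sketch of that statistic is given below, using the rank interpretation of the AUC (the probability that a randomly chosen positive scores above a randomly chosen negative). The signal values and labels are made-up toy numbers, and `roc_auc` is a hypothetical helper rather than the authors' code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all positive/negative pairs, how often the positive has the
    higher score; ties count as half a win.
    """
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy promoter H3K27ac signals and whether each gene is in the "top" set.
h3k27ac_signal = [5.0, 4.2, 3.9, 1.0, 0.8, 2.5]
is_top_expressed = [True, True, False, False, False, True]
auc = roc_auc(h3k27ac_signal, is_top_expressed)
```

For a repressive mark such as H3K27me3, the same function applies with the bottom-expressed genes labelled as positives, as in the lower panel.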
Extended Data Fig. 4 Direct targets of SOX17 in hPGCLCs and DE.
a, Cumulative distribution function plot showing the functional prediction of SOX17 and PRDM1. The ChIP peaks of SOX17/PRDM1 in hPGCLCs were assigned to genes with TSS within 100 kb of the peak summits. A regulatory potential score was calculated for each gene based on the distance between the peak summit and the TSS (ref. 96). The genes were then divided into upregulated, downregulated and unchanged according to their expression patterns upon SOX17 or PRDM1 overexpression. Cumulative distribution function plot was generated for each group with genes ranked by decreasing regulatory potential. A one-tailed Kolmogorov-Smirnov test was used to determine the statistical significance between the differentially expressed groups and the unchanged group. Note that upregulated genes (but not downregulated genes) upon SOX17 induction have a significantly higher tendency to be bound by SOX17. In contrast, genes downregulated upon PRDM1 overexpression tend to be bound by PRDM1. b, GO biological process terms that were enriched in SOX17 direct up targets (red dots in Fig. 3d). c, Expression heatmap of SOX17 direct up target genes. Shown are the genes which were (1) upregulated by SOX17 alone (log2(fold change) >1 and adjusted p-value <0.05 between Dex-treated and non-treated 12h PreME aggregates); (2) upregulated by cytokines (log2(fold change) >2 and adjusted p-value <0.05 between day 2 hPGCLCs and PreME); and (3) downregulated in SOX17 KO (log2(fold change) >1 and adjusted p-value <0.05 between SOX17 KO day 2 aggregate and WT day 2 aggregate) (ref. 20). d, Top eight TFs with binding sites (ReMap2020) enriched in hPGCLC-specific and DE-specific SOX17 peaks. e, Genome browser snapshots showing the epigenetic landscape of DE-specific (CER1 and LEFTY2) and hPGCLC-specific (NANOS3 and PDPN) SOX17-bound gene targets. f, Heatmap showing expression of genes associated with the top enriched motifs (Fig. 3h) and the top enriched TF binding sites (in d) in hPGCLC-specific and DE-specific SOX17 peaks. g, UniProtKB Keywords that were enriched in PRDM1 direct down targets (blue dots in Fig. 3i).
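The regulatory potential score in panel a weights each ChIP peak by its distance to the TSS, so that nearby peaks contribute more. The legend does not give the formula, so the sketch below assumes the exponential distance decay commonly described for BETA, exp(-(0.5 + 4d)) with d the summit-to-TSS distance as a fraction of 100 kb; the constants should be checked against the Methods. Coordinates are made-up toy values and the function name is hypothetical.

```python
import math


def regulatory_potential(tss, peak_summits, window=100_000):
    """Sum distance-weighted contributions of peaks within `window` of a TSS.

    Assumed weighting (BETA-style, not confirmed by the source):
    exp(-(0.5 + 4 * d)), where d = |summit - TSS| / window.
    """
    score = 0.0
    for summit in peak_summits:
        distance = abs(summit - tss)
        if distance <= window:
            score += math.exp(-(0.5 + 4 * distance / window))
    return score


# Toy gene with a TSS at 1 Mb and three peaks; the last lies outside 100 kb.
score = regulatory_potential(1_000_000, [1_000_000, 1_050_000, 1_200_000])

# Genes can then be ranked by decreasing score before comparing the
# cumulative distributions of up-, down- and unchanged gene groups.
toy_scores = {"geneA": score, "geneB": regulatory_potential(500_000, [590_000])}
ranked = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

Under this weighting a peak at the TSS contributes about 0.61 and a peak 50 kb away about 0.08, which is why promoter-proximal binding dominates the ranking.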
Extended Data Fig. 5 Direct target genes of TFAP2C, SOX17 and PRDM1.
a, Genomic distribution of the TFAP2C peaks in hPGCLC aggregates (ref. 46). b, Cumulative distribution function plot showing the functional prediction of TFAP2C. The TFAP2C peaks were assigned to genes with TSS within 100 kb of the peak summits. A regulatory potential score was calculated for each gene based on the distance between the peak summit and the TSS (ref. 96). The genes were then divided into three groups (upregulated, downregulated and unchanged) according to their expression patterns in TFAP2C KO day 2 hPGCLCs versus the wild-type control (ref. 20). Cumulative distribution function plot was generated for each group with genes ranked by decreasing regulatory potential. A one-tailed Kolmogorov-Smirnov test was used to determine the statistical significance between the differentially expressed groups and the unchanged group. c, The enrichment of TFAP2C, SOX17 and PRDM1 peaks in genomic loci that gained ATAC, H3K4me1, H3K4me3, H3K27ac or H3K27me3 signals during the PreME to hPGCLC transition. The TF peaks were categorized into seven cooperativity classes as in Fig. 4a. The dot size represents the fraction of chromatin peaks that overlapped with the TF peaks. Dot color indicates enrichment significance. d, Alluvial plots showing the enhancer (upper panel) and promoter (lower panel) state transition from PreME to hPGCLC. The enhancers/promoters that became active/inactive in hPGCLCs were used for TF binding enrichment analysis in Fig. 4b. e, Venn diagram showing the intersection of upregulated and downregulated genes in SOX17, TFAP2C and PRDM1 KO hPGCLCs/aggregate (ref. 20). Upregulated and downregulated genes were defined as (log2(fold change versus wild-type control) >1 and adjusted p-value < 0.05) and (log2(fold change versus wild-type control) <(-1) and adjusted p-value < 0.05), respectively. f, The number of direct up and down target genes of TFAP2C, SOX17 and PRDM1 based on their cooperative binding. 
g, Enrichment of chromatin remodelling factor binding sites in TFAP2C, SOX17 and PRDM1 peaks in hPGCLCs. The y-axis shows the chromatin remodelling factors that were amongst the top 10 enriched transcriptional regulators (ReMap2020) in any of the five peak sets (x-axis).
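The thresholds in panel e (log2(fold change) >1 or <(-1), with adjusted p-value <0.05) and the Venn-style intersection across knockouts amount to a simple classification followed by set operations. The sketch below illustrates this with placeholder gene names and made-up statistics; none of the values are results from the study, and the function and variable names are hypothetical.

```python
def classify(log2_fc, padj, lfc_cut=1.0, alpha=0.05):
    """Label a gene as 'up', 'down' or 'unchanged' versus wild-type."""
    if padj < alpha and log2_fc > lfc_cut:
        return "up"
    if padj < alpha and log2_fc < -lfc_cut:
        return "down"
    return "unchanged"


def genes_with_label(results, label):
    """Return the set of genes carrying `label` in one KO comparison."""
    return {gene for gene, (lfc, padj) in results.items()
            if classify(lfc, padj) == label}


# Placeholder (log2 fold change, adjusted p-value) pairs per knockout.
ko_results = {
    "SOX17_KO": {"geneA": (-2.1, 0.001), "geneB": (1.5, 0.01), "geneC": (-1.4, 0.20)},
    "TFAP2C_KO": {"geneA": (-1.8, 0.004), "geneB": (0.4, 0.03), "geneC": (-3.0, 0.001)},
    "PRDM1_KO": {"geneA": (-1.2, 0.040), "geneB": (2.2, 0.001), "geneC": (-0.2, 0.90)},
}

down_sets = {ko: genes_with_label(res, "down") for ko, res in ko_results.items()}
# Genes downregulated in all three knockouts (the centre of the Venn diagram).
shared_down = set.intersection(*down_sets.values())
```

Requiring both the fold-change and the significance criterion keeps large but noisy changes (geneC in the first comparison) out of the differentially expressed sets.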
Extended Data Fig. 6 Inducible CRISPR activation and interference systems for activation and repression of hPGC TFs.
a, The piggyBac plasmids encoding an optimized doxycycline-inducible dCas9-SunTag-VP64 CRISPR activation system. Upon integration of both the CRISPRa plasmid and the sgRNA plasmid into the genome and Dox treatment, the Tet-On 3G doxycycline-binding transactivator protein encoded in the sgRNA plasmid will drive the transcription of dCas9-GCN4x5-P2A-scFV-sfGFP-VP64 through the TRE3G promoter. After translation of the mRNA, the recombinant protein will be split into dCas9-GCN4x5 and scFV-sfGFP-VP64 through the P2A self-cleaving peptide. Subsequently, the dCas9-GCN4x5 will be guided to the enhancer/promoter by the constitutively expressed sgRNA and recruit up to 5 copies of the scFV-sfGFP-VP64 recombinant transactivator. To improve epigenome editing efficiency, the GCN4 epitopes were separated by optimized 22-amino-acid linkers (ref. 55). To increase sgRNA expression and to enhance sgRNA-dCas9 affinity, a sgRNA scaffold with an A-U flip and extended hairpin was used (ref. 56). b, RT-qPCR showing CRISPR activation of SOX17 (2 days Dox treatment in hESCs) induced PRDM1 and TFAP2C mRNA expression. Activation of PRDM1 also upregulated TFAP2C. Average of 4 biological replicates, with individual replicates shown as data points. c, The epigenetic landscape of the SOX17 and PRDM1 loci in hESC, PreME, hPGCLCs and HEK293 cells. Note that the SOX17 locus in HEK293 cells does not bear H3K4me1, H3K4me3 and H3K27ac marks. For CRISPR activation (CRISPRa) assay, 3–5 sgRNAs were used to activate each putative enhancer (highlighted) and promoter. d-e, RT-qPCR of SOX17 and PRDM1 following CRISPR activation of enhancers and promoters in HEK293 cells. HEK293 cells were transiently transfected with CRISPRa (dCas9-SunTag-VP64) and sgRNA plasmids and treated with Dox for 2 days. GFP-positive cells (expressing dCas9-SunTag and scFV-sfGFP-VP64) were isolated for RT-qPCR. Average of 3 technical replicates, with individual replicates shown as data points. The assay was performed twice independently with similar results. f, The piggyBac plasmids encoding a re-engineered doxycycline-inducible KRAB-dCas9-DHFR CRISPR interference system (also see Fig. 6a).
Extended Data Fig. 7 Characterisation of CRISPRa-induced hPGCLCs.
a,b, FACS analysis of day 4 EBs generated from PreME of hESC lines harbouring the Dox-inducible CRISPRa transgene with the indicated sgRNA combinations. c, Immunofluorescence showing the co-expression of hPGCLC markers NANOS3-tdTomato, POU5F1 and SOX17 in hPGCLCs (yellow arrowheads) induced by CRISPRa in the absence of BMP4. White arrowheads indicate SOX17 single-positive cells (presumably DE). Representative results of 3 biological replicates. d, Induction of hPGCLCs from hESCs, PreME and ME with or without CRISPR-mediated activation of SOX17 and PRDM1 enhancers and promoters. FACS analysis of day 4 EBs shows that the co-activation of SOX17 and PRDM1 acts synergistically with BMP4 to increase the efficiency of hPGCLC induction from hESCs and PreME, but not from ME. The appearance of the EBs under brightfield and the tdTomato fluorescence filter is shown next to the corresponding FACS plots. Representative results of 3 independent experiments. e, Alluvial plots showing enhancer state transitions of hPGCLC-active enhancers in hESCs, PreME and ME.
Extended Data Fig. 8 Sequence conservation of the human PRDM1 regulatory element.
a, PCA of scRNA-seq profiles of cells in the hESC, PreME and ME states. b, Violin plots summarizing expression levels of the indicated genes in individual cells in the hESC, PreME and ME states, analysed by scRNA-seq. c, Schematics of the inducible CRISPR interference (CRISPRi) system used to repress the two OTX2 promoters (upper panel). Western blots depicting OTX2 and H3 levels in transgenic hESCs treated with vehicle or doxycycline and TMP for the indicated time periods (lower panel). Molecular weights of marker proteins are depicted in kilodaltons (kDa). Representative experiment; knockdown efficiency was tested twice independently at the timepoints shown. d, FACS analysis of PGCLCs expressing non-targeting sgRNA or sgRNA targeting the OTX2 promoters in the presence or absence of KRAB-dCas9-ecDHFR (GFP). Representative experiment out of the 3 independent technical replicates shown in Fig. 8e. e, Genome browser snapshots showing OTX2 ChIP–seq signals and peaks in hESCs (GSE61475; ref. 64). Enhancers identified in this work are indicated in yellow. f, Upper panels: BLAT alignment of the core murine PRDM1 enhancer (B108; ref. 72) to the human genome, and conservation of the SOX motifs in the putative enhancer and promoter of human PRDM1 across seven mammalian species. MULTIZ whole-genome alignment showed that 4 out of 5 core SOX motifs (‘ATTGT’, underlined) in the human PRDM1 enhancer and promoter are conserved in mice. A grey dot indicates an exact match. A blank space represents absence of the corresponding sequence in the indicated species. Lower panels: BLAT alignment of the human PRDM1 enhancer to the murine genome showing the conservation of the OTX2 motif in the murine PRDM1 enhancer (ref. 72).
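The motif-conservation count described for panel f can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which the legend attributes to BLAT and MULTIZ alignments. The function names and the two toy sequences are hypothetical, the inputs are assumed to be pre-aligned and of equal length, and motifs interrupted by alignment gaps are not detected.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq: str, motif: str = "ATTGT") -> list[tuple[int, str]]:
    """List (0-based start, strand) for each motif occurrence on either strand."""
    seq, motif = seq.upper(), motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        if window == motif:
            hits.append((start, "+"))
        elif window == rc:
            hits.append((start, "-"))
    return hits


def count_conserved(ref_aln: str, other_aln: str, motif: str = "ATTGT") -> tuple[int, int]:
    """Return (conserved, total) motif hits in ref_aln.

    A hit counts as conserved when the other species has exactly the same
    bases in the same alignment columns.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    ref, other = ref_aln.upper(), other_aln.upper()
    k = len(motif)
    hits = find_motif(ref, motif)
    conserved = sum(1 for start, _ in hits if other[start:start + k] == ref[start:start + k])
    return conserved, len(hits)


# Toy, made-up aligned sequences (not real PRDM1 sequence).
human_toy = "GGATTGTCCACAATGG"
mouse_toy = "GGATTGTCCACGATGG"

print(find_motif(human_toy))                  # [(2, '+'), (9, '-')]
print(count_conserved(human_toy, mouse_toy))  # (1, 2)
```

Scanning both strands matters because a SOX core motif can be written as ATTGT or as its reverse complement ACAAT, depending on which strand is displayed.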
Supplementary information
Supplementary Tables
Supplementary Table 1. Sequencing and mapping summary of RNA-seq and ChIP–seq libraries. Supplementary Table 2. Gene Ontology enrichment analysis of dynamically active enhancers. Supplementary Table 3. Direct target genes of SOX17 by overexpression. Supplementary Table 4. Direct target genes of PRDM1 by overexpression. Supplementary Table 5. Direct up-target genes of TFAP2C, SOX17 and PRDM1. Supplementary Table 6. Direct down-target genes of TFAP2C, SOX17 and PRDM1. Supplementary Table 7. Antibodies, oligos and sgRNAs.
Source data
Source Data Fig. 3
Luciferase assay source data.
Source Data Fig. 5
Fig. 5c RT–qPCR source data.
Source Data Fig. 6
Fig. 6b Relative hPGCLC induction source data.
Source Data Fig. 7
Fig. 7b RT–qPCR source data.
Source Data Fig. 8
Fig. 8e Relative hPGCLC induction source data.
Source Data Extended Data Fig. 6
Extended Data Fig. 6b,d,e RT–qPCR source data.
Source Data Extended Data Fig. 8
Extended Data Fig. 8c Western blot raw images.
Cite this article
Tang, W.W.C., Castillo-Venzor, A., Gruhn, W.H. et al. Sequential enhancer state remodelling defines human germline competence and specification. Nat Cell Biol 24, 448–460 (2022). https://doi.org/10.1038/s41556-022-00878-z
DOI: https://doi.org/10.1038/s41556-022-00878-z