Human reproduction is regulated by retrotransposons derived from ancient Hominidae-specific viral infections

Germ cells are essential to pass DNA from one generation to the next. In human reproduction, germ cell development begins with the specification of primordial germ cells (PGCs), and a failure to specify PGCs leads to human infertility. Recent studies have revealed that the transcription factor network required for PGC specification has diverged in mammals, and this has a significant impact on our understanding of human reproduction. Here, we reveal that the Hominidae-specific transposable elements (TEs) LTR5Hs may serve as TEENhancers (TE Embedded eNhancers) to facilitate PGC specification. LTR5Hs TEENhancers become transcriptionally active during PGC specification both in vivo and in vitro, with epigenetic reprogramming leading to increased chromatin accessibility, localized DNA demethylation, enrichment of H3K27ac, and occupancy by key hPGC transcription factors. Inactivation of LTR5Hs TEENhancers with KRAB-mediated CRISPRi has a significant impact on germ cell specification. In summary, our data reveal the essential role of Hominidae-specific LTR5Hs TEENhancers in human germ cell development.


nature research | reporting summary
April 2020

For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability

The accession number for the RNA-seq reported in this paper is GEO: GSE182218.

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

No effect sizes were pre-specified.
No data were excluded from this work.
For CRISPRi experiments in UCLA2, hPGCLC differentiation was performed n=3 independent times for each group. For UCLA1, hPGCLC differentiation was performed n=6 independent times for each group.
For RNA-seq of CRISPRi experiments in UCLA2, we performed RNA-seq in n=3 biological replicates.
For ChIP-seq of NANOG and SOX17, we performed NANOG ChIP-seq in n=2 biological replicates in both UCLA2 hESCs and induced hPGCLCs, and SOX17 ChIP-seq in n=2 biological replicates in both UCLA2 hESCs and induced hPGCLCs.
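The replicate counts stated above can be collected into a small lookup so that each condition is easy to check against a minimum replicate number. The Python sketch below is illustrative only: the dictionary layout, the label strings, and the `min_replicates` threshold are assumptions made for this example and are not part of the authors' analysis.

```python
# Replicate counts (n) as stated in the reporting summary.
# The (assay, sample) labels are hypothetical names chosen for this sketch.
REPLICATES = {
    ("CRISPRi hPGCLC differentiation", "UCLA2"): 3,
    ("CRISPRi hPGCLC differentiation", "UCLA1"): 6,
    ("CRISPRi RNA-seq", "UCLA2"): 3,
    ("NANOG ChIP-seq", "UCLA2 hESCs"): 2,
    ("NANOG ChIP-seq", "UCLA2 induced hPGCLCs"): 2,
    ("SOX17 ChIP-seq", "UCLA2 hESCs"): 2,
    ("SOX17 ChIP-seq", "UCLA2 induced hPGCLCs"): 2,
}


def under_replicated(counts, min_replicates=2):
    """Return the sorted conditions with fewer than `min_replicates` replicates."""
    return sorted(key for key, n in counts.items() if n < min_replicates)


if __name__ == "__main__":
    # With the assumed threshold of 2, no condition is flagged.
    print(under_replicated(REPLICATES))
```

Raising the assumed threshold to 3 would flag only the four ChIP-seq conditions, which were each performed in duplicate.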

Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.

ChIP-seq
Data deposition
Confirm that both raw and final processed data have been deposited in a public database such as GEO.
Confirm that you have deposited or provided access to graph files (e.g. BED files) for the called peaks.

Data access links
May remain private before publication.

Files in database submission
For CRISPRi experiments, no selective process was used to assign experimental groups; instead, cells (UCLA1 or UCLA2) were randomly split into treatment or control groups at the beginning of the experiment. For all ChIP-seq experiments, randomization does not apply because there were no treatment groups.
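As a computational analogy to the random split described for the CRISPRi experiments, the Python sketch below assigns a set of units to control and treatment groups without any selection step. It is a minimal illustration under stated assumptions, not the authors' laboratory procedure: the function name `split_into_groups`, the notion of numbered "units" (for example, culture aliquots), and the optional `seed` are all hypothetical.

```python
import random


def split_into_groups(units, seed=None):
    """Randomly split units into 'control' and 'treatment' groups.

    The units are shuffled, then divided at the midpoint, so no
    property of a unit can influence which group it lands in.
    With an odd number of units, 'treatment' receives the extra one.
    """
    rng = random.Random(seed)  # local RNG; a fixed seed makes the split reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}


if __name__ == "__main__":
    # Hypothetical example: six aliquots labeled 0..5.
    groups = split_into_groups(range(6), seed=0)
    print(groups)
```

Recording the seed alongside the assignment is one way to make such a split auditable after the fact.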
NA. Blinding was not necessary as quantitative analysis methods were performed in batch.
UCLA1 and UCLA2 are human embryonic stem cell (hESC) lines. The lines were generated at UCLA in 2009 by the UCLA Pluripotent Stem Cell Core Facility. Each cell line was generated from a single human embryo consented for research. Following derivation, the hESC lines are de-identified and provided to investigators with all links and identifiers removed. SNP/CNV analysis using the Affymetrix Genome-wide Human SNP Array 6.0 was used for authentication of each hESC line before distribution. The research performed in this study is not considered human subjects research because the cells are provided de-identified from the UCLA core facility to UCLA investigators. HEK293T cells were purchased directly from ATCC (Cat # CRL-3216). Authentication involved certification by ATCC (Manassas, Virginia) prior to purchase.

SNP/CNV analysis was used for authentication.
Mycoplasma testing is performed on a routine basis using ELISA and the cell line is negative.