Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent cells

Journal name: Nature
Volume: 522
Pages: 221–225
DOI: 10.1038/nature14308

Endogenous retroviruses (ERVs) are remnants of ancient retroviral infections, and comprise nearly 8% of the human genome1. The most recently acquired human ERV is HERVK(HML-2), which repeatedly infected the primate lineage both before and after the divergence of the human and chimpanzee common ancestor2, 3. Unlike most other human ERVs, HERVK retained multiple copies of intact open reading frames encoding retroviral proteins4. However, HERVK is transcriptionally silenced by the host, except in certain pathological contexts such as germ-cell tumours, melanoma or human immunodeficiency virus (HIV) infection5, 6, 7. Here we demonstrate that DNA hypomethylation at long terminal repeat elements representing the most recent genomic integrations, together with transactivation by OCT4 (also known as POU5F1), synergistically facilitate HERVK expression. Consequently, HERVK is transcribed during normal human embryogenesis, beginning with embryonic genome activation at the eight-cell stage, continuing through the emergence of epiblast cells in preimplantation blastocysts, and ceasing during human embryonic stem cell derivation from blastocyst outgrowths. Remarkably, we detected HERVK viral-like particles and Gag proteins in human blastocysts, indicating that early human development proceeds in the presence of retroviral products. We further show that overexpression of one such product, the HERVK accessory protein Rec, in a pluripotent cell line is sufficient to increase IFITM1 levels on the cell surface and inhibit viral infection, suggesting at least one mechanism through which HERVK can induce viral restriction pathways in early embryonic cells. Moreover, Rec directly binds a subset of cellular RNAs and modulates their ribosome occupancy, indicating that complex interactions between retroviral proteins and host factors can fine-tune pathways of early human development.

At a glance

Figures

  1. Transcriptional reactivation of HERVK in human preimplantation embryos and naive human ES cells.
    Figure 1: Transcriptional reactivation of HERVK in human preimplantation embryos and naive human ES cells.

    a, Schematic of human preimplantation development. b, HERVK expression in single cells of human embryos at indicated stages. Solid line indicates mean. Oocyte (n = 3), zygote (n = 3), 2-cell (n = 6), 4-cell (n = 11), 8-cell (n = 19), morula (n = 16). b–d, Data are taken from ref. 10. *P value < 0.05, non-paired Wilcoxon test. RPKM, reads per kilobase per million. c, HERVK expression in single cells of human blastocysts, grouped by lineage. Solid line indicates mean. Trophectoderm (TE; n = 18), primitive endoderm (PE; n = 7), epiblast (EPI; n = 5). d, HERVK expression in single cells of blastocyst outgrowths (passage (p)0) or human ES cells at passage (p)10. Solid line indicates mean. p0 (n = 8), p10 (n = 26). e, Analysis of the repetitive transcriptomes of three genetically matched naive/primed human ES cell pairs. Left, naive/primed ELF1 human ES cells (data from this study) (n = 3 biological replicates for both conditions). Middle, 3iL/primed H1 human ES cells (data are taken from ref. 12) (n = 3 biological replicates for both conditions). Right, naive/primed H9 human ES cells (data are taken from ref. 15) (n = 3 biological replicates for both conditions). Significant repeats indicated in red at false discovery rate (FDR) < 0.05, DESeq. hESC, human ES cells.

  2. Transactivation by OCT4 and DNA hypomethylation of LTR5HS synergistically regulate HERVK transcription.
    Figure 2: Transactivation by OCT4 and DNA hypomethylation of LTR5HS synergistically regulate HERVK transcription.

    a, Expression of different HERVK proviral sequences, grouped according to the oldest common ancestor, as defined previously4. *P value < 0.05, non-paired Wilcoxon test. Solid line indicates mean. RNA-seq data set used for the analysis was from 3iL naive H1 cells12; n = 3 biological replicates. b, Conserved OCT4 site in LTR5HS with position weight matrix of the corresponding motif shown for comparison (top). Presence/absence of OCT4 motif in distinct LTR5 sequences is indicated (bottom); more detailed sequence information is in Extended Data Fig. 2a. c, ChIP-qPCR analyses from human EC cells (NCCIT) using antibodies indicated on top of each graph. Signals were quantified using primer sets specific to LTR5HS (5HS), LTR5a (5a) and LTR5b (5b) consensus sequences or two ‘negative’ intergenic, non-repetitive regions (neg1, neg2). *P value < 0.05 compared to negative control, one-sided t-test. n = 4 biological replicates, error bars are ±1 standard deviation (s.d.). d, Flow cytometry analysis of human EC cells with integrated LTR5HS fluorescent reporters, either wild type (middle) or with OCT4 motif mutation (bottom). Red fluorescent protein (RFP)-positive population was gated using side-scatter area (SSC-A) and cells with integrated negative control reporter (top) containing minimal thymidine kinase (miniTK) promoter. Shown is a representative result of two independent experiments. e, Bisulfite conversion quantification of LTR5HS 5-methyl-cytosine levels measured using LTR5HS-specific primer pairs anchored in the LTR5HS consensus sequence (left) or provirus-specific 5′ LTR5HS (right) for human EC cells (hECC; NCCIT), human ES cells (hESC; H9) or naive human ES cells (ELF1). Filled circles depict modified cytosines, open circles depict unmodified cytosines. Human EC cells (NCCIT) and naive human ES cells (ELF1) are less methylated than primed human ES cells (H9). P < 0.05, non-paired Wilcoxon test. f, qPCR with reverse transcription (RT–qPCR) analysis of human ES cells (H9) treated with indicated concentrations of 5-aza-2′-deoxycytidine for 24 hours. *P value < 0.05, one-sided t-test. n = 3 biological replicates, error bars ±1 s.d. g, RT–qPCR analysis of HERVK rec RNA levels in HEK293 cells treated with indicated concentrations of 5-aza-2′-deoxycytidine, followed by transfection with OCT4/SOX2 expression constructs. *P value < 0.05, one-sided t-test; NS, not significant. n = 4 biological replicates, error bars ±1 s.d.

  3. Human blastocysts contain HERVK proteins and viral-like particles.
    Figure 3: Human blastocysts contain HERVK proteins and viral-like particles.

    a, Immunofluorescence of human blastocysts (days post-fertilization (DPF) 5–6) stained with 4′,6-diamidino-2-phenylindole (DAPI; blue), OCT4 antibody (green), and HERVK Gag/Capsid antibody (red). Images show a representative example (n = 19 embryos). Scale bar = 50 µm. White arrow indicates an OCT4+ nucleus, surrounded by cytoplasmic Gag/Capsid (Cap), which is shown with higher magnification in an inset. b, Heavy metal staining TEM of a human blastocyst. Arrow indicates putative VLP (found in n = 2/3 blastocysts, DPF 5–6). Higher magnification of indicated region is shown in inset. Scale bar = 200 nm. c, Heavy metal staining TEM of human blastocyst. Arrow indicates putative immature VLP, bracket indicates vesicle filled with putative VLP (found in n = 2/3 blastocysts, DPF 5–6). Scale bar = 100 nm. d, e, Immuno-TEM of human blastocysts with Gag/Capsid staining; region of higher magnification is boxed. Representative examples of budding (d) and cell-internal (e) particles are shown; n = 3 blastocysts (DPF 5–6), n = 3 labelled particles in two embryos.

  4. HERVK accessory protein Rec upregulates viral restriction pathway and engages cellular mRNAs.
    Figure 4: HERVK accessory protein Rec upregulates viral restriction pathway and engages cellular mRNAs.

    a, Flow cytometry histograms of IFITM1 surface staining in control human EC cells or Rec-hECC cells; histogram of negative control cells stained with isotype IgG + Alexa-647 secondary antibody (A-647) is shown for comparison. Shown is a representative result of two independent experiments. b, H1N1(PR8) influenza infection of control GFP-hECC cells or two clonal lines of Rec-hECCs. Control cells were set as 100%; shown are aggregate results from two independent experiments, n = 8 total biological replicates for each condition. Error bars are ±1 s.d. **P value < 0.005, one-sided t-test. c, Rec iCLIP reads mapped to the LTR5HS sequence, n = 2 biological replicates. d, Distribution of Rec binding sites on endogenous mRNAs (top) and aggregate Rec iCLIP-seq signal on a metagene (bottom), n = 2 biological replicates. CDS, coding DNA sequence. e, Distribution of Rec iCLIP reads at representative target mRNAs KLRG2 (top), RPL22 (bottom); y-axis, iCLIP score, at cut-off = 3 (see Methods for details). f, Ribosome profiling signal for all significant genes (FDR < 0.05 Cuffdiff) in wild-type human EC cells versus Rec-hECCs, n = 4 biological replicates. Rec iCLIP targets are coloured in red.

  5. Additional single-cell RNA-seq data analyses from preimplantation human embryos (supporting Fig. 1).
    Extended Data Fig. 1: Additional single-cell RNA-seq data analyses from preimplantation human embryos (supporting Fig. 1).

    a, Heat map and hierarchical K-means clustering of highly expressed (average RPKM > 6 across 89 embryo libraries) repetitive elements in single cells of human preimplantation embryos at indicated developmental stages (top) and HERVK expression (bottom) using indicated data sets. b, HERVH expression (RPKM) in single cells of human embryos at indicated preimplantation stages. Solid line indicates mean. RNA-seq data are taken from ref. 10. c, HERVH expression (RPKM) in single cells of human blastocysts, grouped by lineage. Solid line indicates mean. Oocyte (n = 3), zygote (n = 3), 2-cell (n = 6), 4-cell (n = 11), 8-cell (n = 19), morula (n = 16), TE (n = 18), PE (n = 7), EPI (n = 5), p0 (n = 8), p10 (n = 26). RNA-seq data set was from ref. 10. d, Genome browser snapshot showing 100 bp PE-RNA-seq reads from ELF1 naive human ES cells aligning at the HERVK 108 provirus on chromosome 7.

  6. LTR5 alignments, HERVK expression data in cell lines, and control ChIP-qPCR analyses in primed human ES cells (supporting Fig. 2).
    Extended Data Fig. 2: LTR5 alignments, HERVK expression data in cell lines, and control ChIP-qPCR analyses in primed human ES cells (supporting Fig. 2).

    a, Top, presence of HERVK(HML-2) sequences in Old World primates, but absence in New World primates. Middle, schematic of HERVK proviral genome; all human-specific insertions contain LTR5HS. Bottom, phylogenetic relationship of HERVK LTR subclasses showing high degree of sequence similarity. Pro, protease; Pol, polymerase; Gag, group-specific antigen; Env, envelope. Bottom, ClustalW multiple sequence alignment of indicated HERVK LTR sequences (top), region around OCT4 motif is boxed, phylogenetic tree (bottom) indicating presence/absence of OCT4 motif. b, HERVK protein expression in human EC cells and human ES cells. Protein extracts from human EC cells (NCCIT) and human ES cells (H9) were analysed by immunoblotting with an antibody detecting HERVK Gag precursor and the processed Capsid (top), or the glycosylated, unprocessed form of the HERVK envelope protein Env (bottom). TATA-binding protein (TBP) was used as a loading control. Shown is a representative result of three independent experiments. c, RT–qPCR analysis of HERVK RNA expression in human EC cell line NCCIT, human ES cell line H9, and HEK293 cells. Three distinct qPCR amplicons, corresponding to env, gag and pro are shown. Samples were normalized to 18S ribosomal RNA levels. *P value < 0.05, one-sided t-test. Error bars are ±1 s.d., n = 3 biological replicates. d, HERVK gag or env expression in male human ES cell lines HSF-1, HSF-8, female human ES cell line H9 and human EC cell line NCCIT. *P value < 0.05, one-sided t-test compared to control siRNA, n = 3 biological replicates. Error bars are ±1 s.d. e, RT–qPCR analysis of HERVK transcripts after siRNA knockdown of NANOG, OCT4 or SOX2 in human EC cells (NCCIT). Signals were normalized to 18S rRNA. *P value < 0.05, one-sided t-test compared to control siRNA, n = 3 biological replicates. Error bars are ±1 s.d. f, ChIP-qPCR analyses of human ES cells (H9) with indicated antibodies. Signals were interrogated with primer sets for positive control regions (active human ES cell OCT4 and SOX2 enhancers), LTR5HS, or non-repetitive, intergenic negative regions, as indicated at the bottom. Shown is a representative result of two biological replicates.

  7. HERVK regulation by OCT4 and DNA methylation (supporting Fig. 2).
    Extended Data Fig. 3: HERVK regulation by OCT4 and DNA methylation (supporting Fig. 2).

    a, Transcription factor knockdown in human EC cells (NCCIT). Cells were transfected with siRNA pools targeting indicated transcription factors and protein depletion was measured by immunofluorescence with respective antibodies in comparison with control, mock-transfected cells. DAPI (blue), OCT4 (green, left), NANOG (green, middle), SOX2 (green, right). Shown is one of three representative fields of view at ×20 magnification. b, Dual luciferase assays with indicated reporter constructs in human EC cells (NCCIT) showing that mutation of OCT4 site decreases reporter activity. n = 3 biological replicates, error bars ±1 s.d. *P value < 0.05, one-sided t-test. SV40 enhancer/promoter construct was used as a positive control. c, Bisulfite sequencing for indicated cell types (WT33 human IPSC) analysing consensus LTR5HS-specific amplicon as in Fig. 2e. d, Bisulfite sequencing analysis of HERVK proviral consensus amplicon containing 3′ end of LTR, primer binding site, and 5′ region of Gag ORF (see Extended Data Fig. 2a) in indicated cell types: ELF1 naive, human ES cell, WT33 human IPSC, NCCIT human EC cell, or H9 human ES cell. e, RT–qPCR analysis of HERVK RNA levels in HEK293 cells treated with indicated concentrations of 5-aza-2′-deoxycytidine for 3 days, followed by transfection with OCT4/SOX2 expression constructs and RNA collection 48 h after transfection. qPCR primer sets were designed to three independent amplicons of HERVK. *P value < 0.05, one-sided t-test. n = 4 biological replicates, error bars ±1 s.d.

  8. HERVK Gag/Capsid antibody validation and staining (supporting Fig. 3).
    Extended Data Fig. 4: HERVK Gag/Capsid antibody validation and staining (supporting Fig. 3).

    a, Immunofluorescence analysis of human EC cells (NCCIT) and human ES cells (H9) stained with DAPI (blue), OCT4 (green), Gag/Capsid (red), or IgG control (bottom). White boxes indicate regions shown in higher magnification/merge (right). Shown are representative fields of three independent experiments. b, Sensitivity of HERVK Gag/Capsid antibody immunoblot signal to HERVK knockdown. Human EC cells were transfected with one of three independent siRNA pools targeting HERVK Gag or with a control, non-targeting pool (synthesized against RFP) and total protein was analysed by immunoblotting with anti-Env and anti-Gag/Capsid antibodies. 1:2 serial dilution of total protein was loaded, as indicated. Blots were stripped and re-probed with TBP as a loading control. Shown is a representative result of two independent experiments. c, Sensitivity of HERVK Gag/Capsid antibody immunofluorescence signal to siRNA knockdown of Gag/Capsid (top) or control siRNA targeting RFP (bottom). Shown is a representative result of three fields of view. Magnification, ×20. d, Immunofluorescence of naive ELF1 human ES cells with antibodies against OCT4 (green), HERVK Gag/Capsid (pink), DAPI in blue. Region marked with white box on left is shown with larger magnification (bottom). Magnification, ×20 and ×40, respectively. e, Another representative example of immunofluorescence of human blastocysts with DAPI (blue), OCT4 (green) and Gag/Capsid (red) shown (n = 19 blastocysts; DPF 5–6). Original magnification, ×40.

  9. TEM analyses of human EC cells and control embryo staining (supporting Fig. 3).
    Extended Data Fig. 5: TEM analyses of human EC cells and control embryo staining (supporting Fig. 3).

    a, TEM analysis of human EC cells (NCCIT) with heavy metal staining; arrow indicates VLPs. Boxed region is shown with higher magnification in an inset. Scale bar = 500 nm. Shown is a representative example of two independent experiments. b, TEM immuno-gold labelling of human EC cells (NCCIT) with Gag/Capsid antibodies. Shown is a representative example from two independent experiments. c, Secondary antibody only control for immuno-gold labelling of human blastocysts. Shown is a representative example from eight fields of view. d, Model figure summarizing HERVK transcriptional regulation in human embryos and in vitro cultured pluripotent cells. Dashed lines indicate inference of OCT4, DNA methylation and HERVK level changes at implantation from those observed between naive and primed human ES cells, in the absence of data from actual postimplantation human embryos.

  10. Correlation of HERVK LTR5HS elements with gene expression (supporting Fig. 4).
    Extended Data Fig. 6: Correlation of HERVK LTR5HS elements with gene expression (supporting Fig. 4).

    a, Number of splice junctions identified linking indicated HERV class to annotated RefSeq genes. Analysis was done using RNA-seq data set from ELF1 naive human ES cells, n = 3 biological replicates. b, Number of reads supporting chimaeric transcripts from indicated HERV class in ELF1 naive human ES cells, n = 3 biological replicates. c, Expression of LTR5HS linked genes plotted as a function of distance to the gene’s transcription start site (TSS). x-axis: distance of TSS to the nearest LTR5HS in kb; y-axis: fold change in expression of the linked gene in ELF1 naive versus primed human ES cells (this study, left) or expression of the linked gene in 3iL versus primed H1 human ES cells (right, ref. 12). d, Top, histograms showing expression of all genes that significantly change in expression between naive and primed ELF1 human ES cells (top histogram, white) or significantly changed genes that are LTR5HS associated (bottom histogram, blue); expression values from naive versus primed ELF1 human ES cell RNA-seq data sets (FDR < 0.05 DESeq). Fisher’s exact test gives stated P value, indicating enrichment of LTR5HS-linked genes in naive upregulated category. Bottom, quantification of average expression of LTR5HS-linked (blue) or unlinked (white) genes. Non-paired Wilcoxon test with stated P value indicating that genes linked to 1 or more LTR5HS have significantly higher mean expression. e, Top, histograms showing expression of all genes that significantly change in expression between 3iL and primed H1 human ES cells (top histogram, white) or significantly changed genes that are LTR5HS associated (bottom histogram, blue); expression values from RNA-seq data sets reported previously12, FDR < 0.05 DESeq. Fisher’s exact test gives stated P value indicating enrichment of LTR5HS-linked genes in naive upregulated category. Bottom, quantification of average expression of LTR5HS-linked (blue) or unlinked (white) genes. Non-paired Wilcoxon test with stated P value indicating that genes linked to 1 or more LTR5HS have significantly higher mean expression.

  11. rec and IFITM1 expression in naive human ES cells, and effect of Rec expression on H1N1(PR8) infection (supporting Fig. 4).
    Extended Data Fig. 7: rec and IFITM1 expression in naive human ES cells, and effect of Rec expression on H1N1(PR8) infection (supporting Fig. 4).

    a, Left, RT–qPCR analysis of HERVK rec expression levels in ELF1 naive human ES cells (n = 3 biological replicates) or H9 primed human ES cells (one biological replicate). Normalized to 18S rRNA. Right, Rec RNA levels in indicated blastocyst lineages. Solid line indicates mean; data are from ref. 10. b, RNA-seq quantification of IFITM1 RNA levels in naive or primed ELF1 human ES cells (left, this study) or 3iL human ES cells versus primed H1 human ES cells from ref. 12 (right). n = 3 biological replicates for each condition, error bars are ±1 s.d. Asterisk indicates significance at FDR < 0.05, DESeq. c, Flow cytometry for surface-localized IFITM1 staining in the indicated H9 human ES cells or naive ELF1 human ES cells (top) or, as a control for IFITM1 antibody specificity, knockdown of IFITM1 with two independent IFITM1 siRNA pools compared to control siRNA-treated cells in Flag–eGFP–Rec-hECCs (bottom). d, Left, IFITM1 expression in control human EC cell versus Rec-hECC (NCCIT) RNA-seq data sets. n = 2 biological replicates. Significance = FDR < 0.05, DESeq. Right, IFITM1 expression in control siRNA versus Rec siRNA-treated human EC cells (NCCIT) RNA-seq. n = 3 biological replicates, error bars are ±1 s.d. Significance = FDR < 0.05, DESeq. e, Flow-cytometry profiles for indicated cell types in H1N1(PR8) infected (top) or non-infected (bottom) wild-type (WT) control human EC cells or Flag–GFP–Rec-hECCs, clone #1. Shown is one representative example of four independent experiments showing a co-plating experiment in which GFP-Rec cells and wild-type control (GFP negative) cells are infected in the same well, stained in the same tube and identified by GFP fluorescence after gating for FSC and SSC. f, Scatterplot of ELF1 naive versus primed human ES cell RNA-seq showing all interferon-induced genes, with differentially regulated genes (FDR < 0.05 DESeq, n = 3 biological replicates each) highlighted in red. There is a significant overlap between differentially regulated genes and interferon-induced genes as measured by a hypergeometric test (P value < 0.05).

  12. iCLIP analysis of Rec-associated RNAs (supporting Fig. 4).
    Extended Data Fig. 8: iCLIP analysis of Rec-associated RNAs (supporting Fig. 4).

    a, Diagram of iCLIP-seq procedure (see Methods for details). Briefly, cells are crosslinked using ultraviolet, lysed and digested with RNase to trim RNAs. Sequential immunopurification is performed using Flag M2, peptide elution, and GFP immunoprecipitation (IP). After stringent washing, RNAs are recovered and either radiolabelled (shown in b) or reverse transcribed and prepared for Illumina HTPS libraries. b, Autoradiogram of labelled RNAs (top) recovered from ultraviolet-crosslinked cells using sequential Flag–eGFP immunoprecipitation from: wild-type human EC cells (lanes 1, 2), Flag–eGFP control human EC cells (lanes 3, 4), or two independent Rec-hECC transgenic lines (lanes 5–8), separated on an SDS–polyacrylamide gel electrophoresis (SDS–PAGE) gel. Free Rec protein runs as a ~35 kDa band, while Rec protein crosslinked to RNA molecules shows lower electrophoretic mobility. Please note that: (1) Rec-bound RNAs are resistant to even high concentrations of RNaseI, probably indicating extensive secondary RNA structures, and (2) there is low/no background of contaminating RNAs in control immunoprecipitation from wild-type human EC cells or Flag–eGFP control human EC cells. Western blots with anti-GFP antibody were also performed to confirm the presence of tagged protein in Flag–eGFP control and Flag–eGFP–Rec cells, both in input and immunoprecipitation fractions (middle). HSP90 was used as a loading control (bottom). c, Computationally predicted (using mFold) secondary structure of LTR5HS sequence around the Rec response element (identified experimentally in vitro previously25). Single nucleotide resolution Rec ultraviolet-crosslinking sites determined by iCLIP are shaded in red; n = 2 biological replicates.

  13. Rec target mRNA analysis (supporting Fig. 4).
    Extended Data Fig. 9: Rec target mRNA analysis (supporting Fig. 4).

    a, Genome browser representations of the Rec iCLIP read (n = 2 biological replicates) distribution at indicated mRNA targets. b, Computationally predicted (using mFold) secondary structures of indicated Rec iCLIP-seq targets. Single-nucleotide resolution Rec ultraviolet-crosslinking sites determined by iCLIP are shaded in red; to orient the reader, browser representation of the folded fragment is shown above each respective cartoon.

  14. Model of HERVK regulation and function.
    Extended Data Fig. 10: Model of HERVK regulation and function.
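
The legends above report expression as RPKM (reads per kilobase per million). As a reading aid only, here is a minimal sketch of the conventional RPKM definition. The function name, argument names and example numbers are hypothetical and are not taken from the paper or its analysis pipeline.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Conventional RPKM: reads per kilobase of feature per million mapped reads.

    All argument names are illustrative assumptions, not the authors' code.
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and library size must be positive")
    # reads / (length in kb) / (library size in millions)
    #   == reads * 1e9 / (length in bp * total mapped reads)
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp element in a library
# of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
```

Because the value is normalized to both feature length and library size, RPKM values can be compared across single-cell libraries of different sequencing depth, which is how the legends use them.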

Accession codes

Primary accessions

Gene Expression Omnibus

References

  1. Stoye, J. P. Studies of endogenous retroviruses reveal a continuing evolutionary saga. Nature Rev. Microbiol. 10, 395–406 (2012)
  2. Belshaw, R. et al. Long-term reinfection of the human genome by endogenous retroviruses. Proc. Natl Acad. Sci. USA 101, 4894–4899 (2004)
  3. Barbulescu, M. et al. Many human endogenous retrovirus K (HERVK) proviruses are unique to humans. Curr. Biol. 9, 861–868 (1999)
  4. Subramanian, R. P., Wildschutte, J. H., Russo, C. & Coffin, J. M. Identification, characterization, and comparative genomic distribution of the HERVK (HML-2) group of human endogenous retroviruses. Retrovirology 8, 90 (2011)
  5. Herbst, H., Sauter, M. & Mueller-Lantzsch, N. Expression of human endogenous retrovirus K elements in germ cell and trophoblastic tumors. Am. J. Pathol. 149, 1727–1735 (1996)
  6. Muster, T. et al. An endogenous retrovirus derived from human melanoma cells. Cancer Res. 63, 8735–8741 (2003)
  7. Contreras-Galindo, R. et al. Human endogenous retrovirus K (HML-2) elements in the plasma of people with lymphoma and breast cancer. J. Virol. 82, 9329–9336 (2008)
  8. Pace, J. K. & Feschotte, C. The evolutionary history of human DNA transposons: evidence for intense activity in the primate lineage. Genome Res. 17, 422–432 (2007)
  9. Kunarso, G. et al. Transposable elements have rewired the core regulatory network of human embryonic stem cells. Nature Genet. 42, 631–634 (2010)
  10. Yan, L. et al. Single-cell RNA-seq profiling of human preimplantation embryos and embryonic stem cells. Nature Struct. Mol. Biol. 20, 1131–1139 (2013)
  11. Smith, Z. D. et al. DNA methylation dynamics of the human preimplantation embryo. Nature 511, 611–615 (2014)
  12. Chan, Y.-S. et al. Induction of a human pluripotent state with distinct regulatory circuitry that resembles preimplantation epiblast. Cell Stem Cell 13, 663–675 (2013)
  13. Gafni, O. et al. Derivation of novel human ground state naive pluripotent stem cells. Nature 504, 282–286 (2013)
  14. Ware, C. B. et al. Derivation of naive human embryonic stem cells. Proc. Natl Acad. Sci. USA 111, 4484–4489 (2014)
  15. Takashima, Y. et al. Resetting transcription factor control circuitry toward ground-state pluripotency in human. Cell 158, 1254–1269 (2014)
  16. Theunissen, T. W. et al. Systematic identification of culture conditions for induction and maintenance of naive human pluripotency. Cell Stem Cell 15, 471–487 (2014)
  17. Hohn, O., Hanke, K. & Bannert, N. HERVK(HML-2), the best preserved family of HERVs: endogenization, expression, and implications in health and disease. Front. Oncol. 3, 246 (2013)
  18. Shin, W. et al. Human-specific HERVK insertion causes genomic variations in the human genome. PLoS ONE 8, e60605 (2013)
  19. Boller, K. et al. Evidence that HERVK is the endogenous retrovirus sequence that codes for the human teratocarcinoma-derived retrovirus HTDV. Virology 196, 349–353 (1993)
  20. Bieda, K., Hoffmann, A. & Boller, K. Phenotypic heterogeneity of human endogenous retrovirus particles produced by teratocarcinoma cell lines. J. Gen. Virol. 82, 591–596 (2001)
  21. Dewannieux, M. et al. Identification of an infectious progenitor for the multiple-copy HERVK human endogenous retroelements. Genome Res. 16, 1548–1556 (2006)
  22. Lee, Y. N. & Bieniasz, P. D. Reconstitution of an infectious human endogenous retrovirus. PLoS Pathog. 3, e10 (2007)
  23. Macfarlan, T. S. et al. Embryonic stem cell potency fluctuates with endogenous retrovirus activity. Nature 487, 57–63 (2012)
  24. Chuong, E. B., Rumi, M. A. K., Soares, M. J. & Baker, J. C. Endogenous retroviruses function as species-specific enhancer elements in the placenta. Nature Genet. 45, 325–329 (2013)
  25. Löwer, R., Tönjes, R. R., Korbmacher, C., Kurth, R. & Löwer, J. Identification of a Rev-related protein by analysis of spliced transcripts of the human endogenous retroviruses HTDV/HERVK. J. Virol. 69, 141–149 (1995)
  26. Brass, A. L. et al. The IFITM proteins mediate cellular resistance to influenza A H1N1 virus, West Nile virus, and dengue virus. Cell 139, 1243–1254 (2009)
  27. Hanke, K. et al. Staufen-1 interacts with the human endogenous retrovirus family HERVK(HML-2) Rec and Gag proteins and increases virion production. J. Virol. 87, 11019–11030 (2013)
  28. Magin-Lachmann, C. et al. Rec (formerly Corf) function requires interaction with a complex, folded RNA structure within its responsive element rather than binding to a discrete specific binding site. J. Virol. 75, 10359–10371 (2001)
  29. Gkountela, S. et al. The ontogeny of cKIT+ human primordial germ cells proves to be a resource for human germ line reprogramming, imprint erasure and in vitro differentiation. Nature Cell Biol. 15, 113–122 (2013)
  30. Lange, U. C. et al. Normal germ line establishment in mice carrying a deletion of the Ifitm/Fragilis gene family cluster. Mol. Cell. Biol. 28, 4688–4696 (2008)
  31. Chavez, S. L., Meneses, J. J., Nguyen, H. N., Kim, S. K. & Pera, R. A. R. Characterization of six new human embryonic stem cell lines (HSF7, -8, -9, -10, -12, and -13) derived under minimal-animal component conditions. Stem Cells Dev. 17, 535–546 (2008)
  32. Boyer, L. A. et al. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947–956 (2005)
  33. Peng, J. C. et al. Jarid2/Jumonji coordinates control of PRC2 enzymatic activity and target gene occupancy in pluripotent cells. Cell 139, 1290–1302 (2009)
  34. Myers, J. W. & Ferrell, J. E., Jr in RNA Silencing (ed. Carmichael, G. G.) 93–196 (Humana Press, 2005)
  35. Chavez, S. L. et al. Dynamic blastomere behaviour reflects human embryo ploidy by the four-cell stage. Nature Commun. 3, 1251 (2012)
  36. Wong, C., Chen, A. A., Behr, B. & Shen, S. Time-lapse microscopy and image analysis in basic and clinical embryo development research. Reprod. Biomed. Online 26, 120–129 (2013)
  37. Rusinova, I. et al. INTERFEROME v2.0: an updated database of annotated interferon-regulated genes. Nucleic Acids Res. 41, D1040–D1046 (2013)
  38. Huppertz, I. et al. iCLIP: protein-RNA interactions at nucleotide resolution. Methods 65, 274–287 (2014)
  39. Flynn, R. A. et al. Dissecting noncoding and pathogen RNA–protein interactomes. RNA 21, 135–143 (2015)
  40. Ingolia, N. T., Lareau, L. F. & Weissman, J. S. Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell 147, 789–802 (2011)
  41. Xue, Z. et al. Genetic programs in human and mouse early embryos revealed by single-cell RNA sequencing. Nature 500, 593–597 (2013)

Author information

Affiliations

  1. Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA

    • Edward J. Grow,
    • Mark Wossidlo &
    • Renee A. Reijo Pera
  2. Howard Hughes Medical Institute and Program in Epithelial Biology, Stanford University School of Medicine, Stanford, California 94305, USA

    • Ryan A. Flynn,
    • Lance Martin &
    • Howard Y. Chang
  3. Institute for Stem Cell Biology & Regenerative Medicine, Stanford University School of Medicine, Stanford University, Stanford, California 94305, USA

    • Shawn L. Chavez,
    • Mark Wossidlo,
    • Daniel J. Wesche,
    • Renee A. Reijo Pera &
    • Joanna Wysocka
  4. Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford University, Stanford, California 94305, USA

    • Shawn L. Chavez,
    • Mark Wossidlo &
    • Renee A. Reijo Pera
  5. Division of Reproductive and Developmental Sciences, Oregon National Primate Research Center, Oregon Health & Science University, Beaverton, Oregon 97006, USA

    • Shawn L. Chavez
  6. Stanford Immunology, Stanford University School of Medicine, Stanford, California 94305, USA

    • Nicholas L. Bayless
  7. Department of Comparative Medicine, University of Washington, Seattle, Washington 98195-8056, USA

    • Carol B. Ware
  8. Department of Medicine, Stanford University School of Medicine, Stanford, California 94305, USA

    • Catherine A. Blish
  9. Department of Cell Biology and Neurosciences, Montana State University, Bozeman, Montana 59717, USA

    • Renee A. Reijo Pera
  10. Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, California 94305, USA

    • Joanna Wysocka
  11. Department of Developmental Biology, Stanford University School of Medicine, Stanford, California 94305, USA

    • Joanna Wysocka

Contributions

E.J.G. and J.W. conceived the project, designed experiments and wrote the manuscript, with input from all authors. E.J.G. carried out the majority of the experiments and data analyses. S.L.C., M.W. and E.J.G. performed human blastocyst handling and immunofluorescence with expertise and resources provided by R.A.R.P. R.A.F., L.M. and H.Y.C. performed and analysed iCLIP experiments. R.A.F. provided assistance with ribosome profiling experiments and analysis. N.L.B. and C.A.B. contributed influenza infection experiments. C.B.W. provided human naïve cells and reagents. D.J.W. performed expression analysis of LTR5HS-associated genes.

Competing financial interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to: Joanna Wysocka

Sequencing data sets generated for this study are deposited in the Gene Expression Omnibus under accession number GSE63570. Reprints and permissions information is available at www.nature.com/reprints. Readers are welcome to comment on the online version of the paper. Correspondence and requests for materials should be addressed to J.W. (wysocka@stanford.edu).

Extended data figures and tables

Extended Data Figures

  1. Extended Data Figure 1: Additional single-cell RNA-seq data analyses from preimplantation human embryos (supporting Fig. 1). (1,343 KB)

    a, Heat map and hierarchical K-means clustering of highly expressed (average RPKM > 6 across 89 embryo libraries) repetitive elements in single cells of human preimplantation embryos at indicated developmental stages (top) and HERVK expression (bottom) using indicated data sets. b, HERVH expression (RPKM) in single cells of human embryos at indicated preimplantation stages. Solid line indicates mean. RNA-seq data are taken from ref. 10. c, HERVH expression (RPKM) in single cells of human blastocysts, grouped by lineage. Solid line indicates mean. Oocyte (n = 3), zygote (n = 3), 2-cell (n = 6), 4-cell (n = 11), 8-cell (n = 19), morula (n = 16), TE (n = 18), PE (n = 7), EPI (n = 5), p0 (n = 8), p10 (n = 26). RNA-seq data set was from ref. 10. d, Genome browser snapshot showing 100 bp PE-RNA-seq reads from ELF1 naive human ES cells aligning at the HERVK 108 provirus on chromosome 7.

  2. Extended Data Figure 2: LTR5 alignments, HERVK expression data in cell lines, and control ChIP-qPCR analyses in primed human ES cells (supporting Fig. 2). (867 KB)

    a, Top, presence of HERVK(HML-2) sequences in Old World primates, but absence in New World primates. Middle, schematic of HERVK proviral genome; all human-specific insertions contain LTR5HS. Bottom, phylogenetic relationship of HERVK LTR subclasses showing high degree of sequence similarity. Pro, protease; Pol, polymerase; Gag, group-specific antigen; Env, envelope. Bottom, ClustalW multiple sequence alignment of indicated HERVK LTR sequences (top), region around OCT4 motif is boxed, phylogenetic tree (bottom) indicating presence/absence of OCT4 motif. b, HERVK protein expression in human EC cells and human ES cells. Protein extracts from human EC cells (NCCIT) and human ES cells (H9) were analysed by immunoblotting with an antibody detecting HERVK Gag precursor and the processed Capsid (top), or the glycosylated, unprocessed form of the HERVK envelope protein Env (bottom). Tata-binding protein (TBP) was used as a loading control. Shown is a representative result of three independent experiments. c, RT–qPCR analysis of HERVK RNA expression in human EC cell line NCCIT, human ES cell line H9, and HEK293 cells. Three distinct qPCR amplicons, corresponding to env, gag and pro are shown. Samples were normalized to 18S ribosomal RNA levels. *P value < 0.05, one-sided t-test. Error bars are ±1 s.d., n = 3 biological replicates. d, HERVK gag or env expression in male human ES cell lines HSF-1, HSF-8, female human ES cell line H9 and human EC cell line NCCIT. *P value < 0.05, one-sided t-test compared to control siRNA, n = 3 biological replicates. Error bars are ±1 s.d. e, RT–qPCR analysis of HERVK transcripts after siRNA knockdown of NANOG, OCT4 or SOX2 in human EC cells (NCCIT). Signals were normalized to 18S rRNA. *P value < 0.05, one-sided t-test compared to control siRNA, n = 3 biological replicates. Error bars are ±1 s.d. f, ChIP-qPCR analyses of human ES cells (H9) with indicated antibodies. Signals were interrogated with primer sets for positive control regions (active human ES cell OCT4 and SOX2 enhancers), LTR5HS, or non-repetitive, intergenic negative regions, as indicated at the bottom. Shown is a representative result of two biological replicates.

  3. Extended Data Figure 3: HERVK regulation by OCT4 and DNA methylation (supporting Fig. 2). (1,414 KB)

    a, Transcription factor knockdown in human EC cells (NCCIT). Cells were transfected with siRNA pools targeting indicated transcription factors and protein depletion was measured by immunofluorescence with respective antibodies in comparison with control, mock-transfected cells. DAPI (blue), OCT4 (green, left), NANOG (green, middle), SOX2 (green, right). Shown is one of three representative fields of view at ×20 magnification. b, Dual luciferase assays with indicated reporter constructs in human EC cells (NCCIT) showing that mutation of OCT4 site decreases reporter activity. n = 3 biological replicates, error bars ±1 s.d. *P value < 0.05, one-sided t-test. SV40 enhancer/promoter construct was used as a positive control. c, Bisulfite sequencing for indicated cell types (WT33 human IPSC) analysing consensus LTR5HS-specific amplicon as in Fig. 2e. d, Bisulfite sequencing analysis of HERVK proviral consensus amplicon containing 3′ end of LTR, primer binding site, and 5′ region of Gag ORF (see Extended Data Fig. 2a) in indicated cell types: ELF1 naive, human ES cell, WT33 human IPSC, NCCIT human EC cell, or H9 human ES cell. e, RT–qPCR analysis of HERVK RNA levels in HEK293 cells treated with indicated concentrations of 5-aza-2′-deoxycytidine for 3 days, followed by transfection with OCT4/SOX2 expression constructs and RNA collection 48 h after transfection. qPCR primer sets were designed to three independent amplicons of HERVK. *P value < 0.05, one-sided t-test. n = 4 biological replicates, error bars ±1 s.d.

  4. Extended Data Figure 4: HERVK Gag/Capsid antibody validation and staining (supporting Fig. 3). (1,489 KB)

    a, Immunofluorescence analysis of human EC cells (NCCIT) and human ES cells (H9) stained with DAPI (blue), OCT4 (green), Gag/Capsid (red), or IgG control (bottom). White boxes indicate regions shown in higher magnification/merge (right). Shown are representative fields of three independent experiments. b, Sensitivity of HERVK Gag/Capsid antibody immunoblot signal to HERVK knockdown. Human EC cells were transfected with one of three independent siRNA pools targeting HERVK Gag or with a control, non-targeting pool (synthesized against RFP) and total protein was analysed by immunoblotting with anti-Env and anti-Gag/Capsid antibodies. 1:2 serial dilution of total protein was loaded, as indicated. Blots were stripped and re-probed with TBP as a loading control. Shown is a representative result of two independent experiments. c, Sensitivity of HERVK Gag/Capsid antibody immunofluorescence signal to siRNA knockdown of Gag/Capsid (top) or control siRNA targeting RFP (bottom). Shown is a representative result of three fields of view. Magnification, ×20. d, Immunofluorescence of naive ELF1 human ES cells with antibodies against OCT4 (green) and HERVK Gag/Capsid (pink), with DAPI in blue. Region marked with white box on left is shown with larger magnification (bottom). Magnification, ×20 and ×40, respectively. e, Another representative example of immunofluorescence of human blastocysts with DAPI (blue), OCT4 (green) and Gag/Capsid (red) shown (n = 19 blastocysts; DPF 5–6). Original magnification, ×40.

  5. Extended Data Figure 5: TEM analyses of human EC cells and control embryo staining (supporting Fig. 3). (1,459 KB)

    a, TEM analysis of human EC cells (NCCIT) with heavy metal staining; arrow indicates VLPs. Boxed region is shown with higher magnification in an inset. Scale bar = 500 nm. Shown is a representative example of two independent experiments. b, TEM immuno-gold labelling of human EC cells (NCCIT) with Gag/Capsid antibodies. Shown is a representative example from two independent experiments. c, Secondary antibody only control for immuno-gold labelling of human blastocysts. Shown is a representative example from eight fields of view. d, Model figure summarizing HERVK transcriptional regulation in human embryos and in vitro cultured pluripotent cells. Dashed lines indicate inference of OCT4, DNA methylation and HERVK level changes at implantation from those observed between naive and primed human ES cells, in the absence of data from actual postimplantation human embryos.

  6. Extended Data Figure 6: Correlation of HERVK LTR5HS elements with gene expression (supporting Fig. 4). (940 KB)

    a, Number of splice junctions identified linking indicated HERV class to annotated RefSeq genes. Analysis was done using RNA-seq data set from ELF1 naive human ES cells, n = 3 biological replicates. b, Number of reads supporting chimaeric transcripts from indicated HERV class in ELF1 naive human ES cells, n = 3 biological replicates. c, Expression of LTR5HS-linked genes plotted as a function of distance to the gene’s transcription start site (TSS). x-axis: distance of TSS to the nearest LTR5HS in kb; y-axis: fold change in expression of the linked gene in ELF1 naive versus primed human ES cells (this study, left) or expression of the linked gene in 3iL versus primed H1 human ES cells (right, ref. 12). d, Top, histograms showing expression of all genes that significantly change in expression between naive and primed ELF1 human ES cells (top histogram, white) or significantly changed genes that are LTR5HS associated (bottom histogram, blue); expression values from naive versus primed ELF1 human ES cell RNA-seq data sets (FDR < 0.05 DESeq). Fisher’s exact test gives stated P value, indicating enrichment of LTR5HS-linked genes in naive upregulated category. Bottom, quantification of average expression of LTR5HS-linked (blue) or unlinked (white) genes. Non-paired Wilcoxon test with stated P value indicating that genes linked to 1 or more LTR5HS have significantly higher mean expression. e, Top, histograms showing expression of all genes that significantly change in expression between 3iL and primed H1 human ES cells (top histogram, white) or significantly changed genes that are LTR5HS associated (bottom histogram, blue); expression values from RNA-seq data sets reported previously12, FDR < 0.05 DESeq. Fisher’s exact test gives stated P value indicating enrichment of LTR5HS-linked genes in naive upregulated category. Bottom, quantification of average expression of LTR5HS-linked (blue) or unlinked (white) genes. Non-paired Wilcoxon test with stated P value indicating that genes linked to 1 or more LTR5HS have significantly higher mean expression.

  7. Extended Data Figure 7: rec and IFITM1 expression in naive human ES cells, and effect of Rec expression on H1N1(PR8) infection (supporting Fig. 4). (929 KB)

    a, Left, RT–qPCR analysis of HERVK rec expression levels in ELF1 naive human ES cells (n = 3 biological replicates) or H9 primed human ES cells (one biological replicate). Normalized to 18S rRNA. Right, Rec RNA levels in indicated blastocyst lineages. Solid line indicates mean; data are from ref. 10. b, RNA-seq quantification of IFITM1 RNA levels in naive or primed ELF1 human ES cells (left, this study) or 3iL human ES cells versus primed H1 human ES cells from ref. 12 (right). n = 3 biological replicates for each condition, error bars are ±1 s.d. Asterisk indicates significance at FDR < 0.05, DESeq. c, Flow cytometry for surface-localized IFITM1 staining in the indicated H9 human ES cells or naive ELF1 human ES cells (top) or, as a control for IFITM1 antibody specificity, knockdown of IFITM1 with two independent IFITM1 siRNA pools compared to control siRNA-treated cells in Flag–eGFP–Rec-hECCs (bottom). d, Left, IFITM1 expression in control human EC cell versus Rec-hECC (NCCIT) RNA-seq data sets. n = 2 biological replicates. Significance = FDR < 0.05, DESeq. Right, IFITM1 expression in control siRNA versus Rec siRNA-treated human EC cells (NCCIT) RNA-seq. n = 3 biological replicates, error bars are ±1 s.d. Significance = FDR < 0.05, DESeq. e, Flow-cytometry profiles for indicated cell types in H1N1(PR8) infected (top) or non-infected (bottom) wild-type (WT) control human EC cells or Flag–GFP–Rec-hECCs, clone #1. Shown is one representative example of four independent experiments showing a co-plating experiment in which GFP-Rec cells and wild-type control (GFP negative) cells are infected in the same well, stained in the same tube and identified by GFP fluorescence after gating for FSC and SSC. f, Scatterplot of ELF1 naive versus primed human ES cell RNA-seq showing all interferon-induced genes, with differentially regulated genes (FDR < 0.05 DESeq, n = 3 biological replicates each) highlighted in red. There is a significant overlap between differentially regulated genes and interferon-induced genes as measured by a hypergeometric test (P value < 0.05).

  8. Extended Data Figure 8: iCLIP analysis of Rec-associated RNAs (supporting Fig. 4). (963 KB)

    a, Diagram of iCLIP-seq procedure (see Methods for details). Briefly, cells are crosslinked using ultraviolet, lysed and digested with RNase to trim RNAs. Sequential immunopurification is performed using Flag M2, peptide elution, and GFP immunoprecipitation (IP). After stringent washing, RNAs are recovered and either radiolabelled (shown in b) or reverse transcribed and prepared for Illumina HTPS libraries. b, Autoradiogram of labelled RNAs (top) recovered from ultraviolet-crosslinked cells using sequential Flag–eGFP immunoprecipitation from: wild-type human EC cells (lanes 1, 2), Flag–eGFP control human EC cells (lanes 3, 4), or two independent Rec-hECC transgenic lines (lanes 5–8), separated on an SDS–polyacrylamide gel electrophoresis (SDS–PAGE) gel. Free Rec protein runs as a ~35 kDa band, while Rec protein crosslinked to RNA molecules shows lower electrophoretic mobility. Please note that: (1) Rec-bound RNAs are resistant to even high concentrations of RNaseI, probably indicating extensive secondary RNA structures, and (2) low/no background of contaminating RNAs in control immunoprecipitation from wild-type human EC cells or Flag–eGFP control human EC cells. Western blots with anti-GFP antibody were also performed to confirm the presence of tagged protein in Flag–eGFP control and Flag–eGFP–Rec cells, both in input and immunoprecipitation fractions (middle). HSP90 was used as a loading control (bottom). c, Computationally predicted (using mFold) secondary structure of LTR5HS sequence around the Rec response element (identified experimentally in vitro previously25). Single nucleotide resolution Rec ultraviolet-crosslinking sites determined by iCLIP are shaded in red; n = 2 biological replicates.

  9. Extended Data Figure 9: Rec target mRNA analysis (supporting Fig. 4). (580 KB)

    a, Genome browser representations of the Rec iCLIP read (n = 2 biological replicates) distribution at indicated mRNA targets. b, Computationally predicted (using mFold) secondary structures of indicated Rec iCLIP-seq targets. Single-nucleotide resolution Rec ultraviolet-crosslinking sites determined by iCLIP are shaded in red; to orient the reader, browser representation of the folded fragment is shown above each respective cartoon.

  10. Extended Data Figure 10: Model of HERVK regulation and function. (734 KB)
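
The legends for Extended Data Figs 6 and 7 report gene-set enrichment by Fisher’s exact test and by a hypergeometric test. As an illustration only, the following Python sketch shows how such an overlap P value can be computed from four counts. It is not the authors' analysis code: the function name `overlap_p_value` and the toy counts are hypothetical and are not taken from the paper.

```python
from math import comb


def overlap_p_value(overlap, population, set_a, set_b):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: total number of genes considered
    set_a:      number of genes in the first set (e.g. interferon-induced)
    set_b:      number of genes in the second set (e.g. differentially regulated)
    overlap:    number of genes observed in both sets
    """
    max_overlap = min(set_a, set_b)
    # Count the ways to draw set_b genes that contain exactly i members of set_a.
    favourable = sum(
        comb(set_a, i) * comb(population - set_a, set_b - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favourable / comb(population, set_b)


# Hypothetical toy counts, chosen only so the result can be checked by hand:
# 10 genes in total, 4 in set A, 3 in set B, 2 shared.
p = overlap_p_value(overlap=2, population=10, set_a=4, set_b=3)
print(f"P(X >= 2) = {p:.4f}")
```

For a 2 × 2 contingency table, the one-sided Fisher’s exact test is this same upper-tail hypergeometric probability, so one calculation covers both kinds of test named in the legends.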

Supplementary information

Excel files

  1. Supplementary Data (271 KB)

    This file contains Supplementary Table 1.

  2. Supplementary Data (13.7 MB)

    This file contains Supplementary Table 2.

  3. Supplementary Data (227 KB)

    This file contains Supplementary Table 3.

  4. Supplementary Data (15 KB)

    This file contains Supplementary Table 4.

  5. Supplementary Data (5.1 MB)

    This file contains Supplementary Table 5.

  6. Supplementary Data (8 MB)

    This file contains Supplementary Table 6.

  7. Supplementary Data (11 KB)

    This file contains Supplementary Table 7.

  8. Supplementary Data (9 KB)

    This file contains Supplementary Table 8.

  9. Supplementary Data (9 KB)

    This file contains Supplementary Table 9.

  10. Supplementary Data (16 KB)

    This file contains Supplementary Table 10.

Additional data