  • Brief Communication

The retrovirus HERVH is a long noncoding RNA required for human embryonic stem cell identity


Human endogenous retrovirus subfamily H (HERVH) is a class of transposable elements expressed preferentially in human embryonic stem cells (hESCs). Here, we report that the long terminal repeats of HERVH function as enhancers and that HERVH is a nuclear long noncoding RNA required to maintain hESC identity. Furthermore, HERVH is associated with OCT4, coactivators and Mediator subunits. Together, these results uncover a new role of species-specific transposable elements in hESCs.


Figure 1: HERVH transcripts are essential for maintenance of hESC identity.
Figure 2: HERVH transcripts function as lncRNAs that interact with coactivators and OCT4 to regulate expression of neighboring genes.

Accession codes

Primary accessions


Gene Expression Omnibus




We thank the ENCODE consortium for the free distribution of their data sets. We also thank J. Jeyakani and K. Zawack for a number of preliminary analyses and S. Pott and Y. Wan for helpful discussions. This work was supported by the Agency for Science, Technology and Research (A*STAR) of Singapore and by Génome Québec, Génome Canada. L.R. is also supported by a grant from the Canadian Institute of Health Research (CIHR MOP-115090). F.S. is supported by the Singapore International Graduate Award from A*STAR. J.G. is supported by a fellowship within the Postdoc-Program of the German Academic Exchange Service, DAAD.

Author information




X.L. contributed conception, design, data collection and analysis, and manuscript writing; F.S. contributed data collection and analysis; L.R., P.-É.J. and J.G. contributed data analysis; G.B. and H.-H.N. contributed conception, design, supervision, data interpretation and manuscript writing.

Corresponding authors

Correspondence to Guillaume Bourque or Huck-Hui Ng.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 HERVH is a primate-specific endogenous retrovirus (ERV)

(a) Multiple sequence alignment of the expressed HERVH loci and approximate position of the gag, pol and env genes in the ancestral HERVH sequence. (b) Aggregate profiles of the RNA-seq and DNase I hypersensitive sites sequencing (DHS-seq) tags around the HERVH loci in H1 hESCs, GM12878 (GM) and K562 cells, split into loci that are expressed and loci that are not expressed in H1. Each locus was normalized to a length of 6 kb, and tags within a locus are shown relative to the end they are closest to. (c) Estimated age of the repeat instances from three LTR repeat subfamilies (HERV16, ERVL-E and HERVH). Myrs, million years.

Supplementary Figure 2 Validation of hESC differentiation phenotype after HERVH knockdown

(a) Location of shRNAs against HERVH. (b) Expression of differentiation markers as determined by qPCR after HERVH depletion. Biological triplicate data (n = 3 extracts) are presented as mean ± s.e.m. (c) Western blot of OCT4 and NANOG expression after HERVH knockdown. (d) Expression of pluripotency markers (OCT4, NANOG and SOX2) and differentiation markers (GATA6 and RUNX1) after scrambled shRNA knockdown. Biological triplicate data (n = 3 extracts) are presented as mean ± s.e.m. (e) Expression of NANOG and SOX2 as determined by immunostaining after HERVH depletion. (f) Expression of pluripotency markers (SOX2, TRA-1-81 and NANOG) and a differentiation marker (ACTA2) after HERVH depletion.

Supplementary Figure 3 Microarray and RNA-seq analysis of gene expression after HERVH depletion

(a) Heatmaps showing changes in pluripotency gene expression with expression fold change >1.5 after HERVH depletion, as revealed by microarray analysis. (b) Heatmaps showing changes in differentiation gene expression with expression fold change >1.5 after HERVH depletion, as revealed by microarray analysis. (c) Comparison of control and knockdown expression levels. Expression was measured in reads per million from RNA-seq data. Data points are labeled by average percent identity with the shRNA. (d) Gene ontology analysis of upregulated genes after HERVH depletion. (e) Gene ontology analysis of downregulated genes after HERVH depletion. (f) Classes of upregulated genes preferentially affected by HERVH depletion. Other upregulated genes are included as control. P values were calculated by binomial test. (g) Fold change of downregulated lncRNA and protein-coding genes.

Supplementary Figure 4 HERVH LTRs function as OCT4-regulated enhancers

(a) Luciferase enhancer assay using two LTR regions of HERVH in H1 hESCs, 293T and HeLa cells. Biological triplicate data (n = 3 dishes) are presented as mean ± s.e.m. (b) Same assay in hESCs with and without OCT4 depletion. Biological triplicate data (n = 3 dishes) are presented as mean ± s.e.m.
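The legends report triplicate measurements as mean ± s.e.m. As a minimal sketch of that summary, the Python snippet below computes the mean and the standard error of the mean, assuming the usual definition (sample standard deviation with an n − 1 denominator, divided by √n). The function name `mean_sem` and the readings are hypothetical and are not taken from the paper.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate the s.e.m.")
    mean = statistics.mean(values)
    # Assumption: s.e.m. = sample standard deviation (n - 1 denominator) / sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical relative luciferase readings from three dishes (illustrative only).
readings = [4.0, 5.0, 6.0]
mean, sem = mean_sem(readings)
print(f"{mean:.2f} ± {sem:.2f}")
```

With only three replicates the s.e.m. is itself a rough estimate, so it is best read as a description of spread rather than as a confidence interval.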

Supplementary Figure 5 HERVH specifically localizes to the nucleus and functions as a lncRNA regulating neighboring genes' expression

(a) RNA FISH of HERVH transcripts in various cell types. HERVH RNA is shown in red; DNA counterstained with 4',6-diamidino-2-phenylindole (DAPI) is shown in blue. MEF (CF1 mouse embryonic fibroblasts), 293T cells and MRC-5 (human fibroblasts) were stained in the same way as hESCs (Fig. 2a). H1 cells were treated with 10 μg/mL RNase A for 15 minutes before hybridization. Scale bars, 10 μm (MEF) and 5 μm (293T, MRC-5, H1). (b) Analysis of the interaction between HERVH and chromatin modifiers through coimmunoprecipitation of HERVH transcripts after formaldehyde crosslinking. Biological triplicate data (n = 3 extracts) are presented as mean ± s.e.m. (c) Gel analysis of HERVH RT-PCR products after immunoprecipitation. M denotes the DNA size marker lane. RT, reverse transcriptase. (d) Luciferase assay of HERVH LTRs after HERVH depletion. The activity of each LTR was normalized to pGL4 activity (set as 1) in the same condition. Biological triplicate data (n = 3 dishes) are presented as mean ± s.e.m. (e) Enrichment of HERVH and LTR7 within 10 kb of downregulated genes. P values were calculated by binomial test. (f) Other genomic features within 10 kb of downregulated genes. P values were calculated by binomial test. (g) The majority of HERVH instances that regulate expression of neighboring genes are bound by OCT4 within 1 kb.
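Panels (e) and (f) report P values from a binomial test. The following Python sketch illustrates one common form of such an enrichment test: a one-sided upper-tail probability that at least k of n genes lie near a feature when each gene has background probability p of doing so. The one-sided formulation, the function name `binomial_upper_tail` and all numbers are assumptions for illustration; the legend does not state how the background rate was defined.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), a one-sided enrichment P value."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # Sum the exact binomial probabilities for k, k + 1, ..., n successes.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 3 of 10 genes have the feature within 10 kb,
# against an assumed background rate of 10% (not values from the paper).
p_value = binomial_upper_tail(3, 10, 0.1)
print(f"one-sided binomial P = {p_value:.4f}")
```

For gene sets with thousands of members, a library routine such as `scipy.stats.binomtest` is a more practical choice than this direct summation.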

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–5. (PDF 877 kb)

Supplementary Table 1

The genomic coordinates of 128 expressed HERVH loci. Expression is from H1 Caltech paired 75-bp ENCODE RNA-seq Replicate 1 dataset and is presented as reads per million. Coordinates are according to human genome assembly hg19. (XLSX 18 kb)
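Expression in this table is given in reads per million. As a rough sketch of that normalization, the Python snippet below scales raw read counts per locus by the total number of mapped reads in the library. The function name `reads_per_million`, the locus labels and the counts are hypothetical, and the choice of total mapped reads as the denominator is an assumption, since the table description does not specify it.

```python
def reads_per_million(locus_counts, total_mapped_reads):
    """Scale raw read counts per locus to reads per million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    scale = 1_000_000 / total_mapped_reads
    return {locus: count * scale for locus, count in locus_counts.items()}


# Hypothetical counts for two loci in a library of 20 million mapped reads.
counts = {"locus_a": 50, "locus_b": 200}
rpm = reads_per_million(counts, 20_000_000)
for locus, value in rpm.items():
    print(locus, round(value, 2))
```

Reads per million corrects for sequencing depth only; it does not account for locus length, so longer loci tend to show higher values at the same transcript abundance.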

Supplementary Table 2

Downregulated lncRNAs and protein-coding genes after HERVH depletion. (XLSX 163 kb)


About this article

Cite this article

Lu, X., Sachs, F., Ramsay, L. et al. The retrovirus HERVH is a long noncoding RNA required for human embryonic stem cell identity. Nat Struct Mol Biol 21, 423–425 (2014).

