Letter

In vivo enhancer analysis of human conserved non-coding sequences



Identifying the sequences that direct the spatial and temporal expression of genes and defining their function in vivo remain a significant challenge in the annotation of vertebrate genomes. One major obstacle is the lack of experimentally validated training sets. In this study, we used extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements, and characterized the in vivo enhancer activity of a large group of non-coding elements in the human genome that are either conserved between human and pufferfish, Takifugu (Fugu) rubripes, or ultraconserved (ref. 1) among human, mouse and rat. We tested 167 of these extremely conserved sequences in a transgenic mouse enhancer assay. Here we report that 45% of these sequences functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. Although the 75 enhancers directed expression in a broad range of anatomical structures in the embryo, the majority directed expression to various regions of the developing nervous system. We identified sequence signatures enriched in a subset of these elements that targeted forebrain expression, and used these features to rank all 3,100 non-coding elements in the human genome that are conserved between human and Fugu. Testing the top predictions in transgenic mice resulted in a threefold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of human gene enhancers that have been characterized in vivo, and illustrate the utility of such training sets for a variety of biological applications, including decoding the regulatory vocabulary of the human genome.
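The ranking step described above (deriving sequence signatures from a training set of forebrain enhancers, then scoring other conserved elements) can be illustrated with a minimal sketch. This is not the authors' method, which is described in the Supplementary Methods. It assumes a simple k-mer log-odds scheme, and every name and sequence in it (`log_odds_table`, `score`, `rank_candidates`, the toy DNA strings) is hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of DNA strings."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def log_odds_table(positives, background, k, pseudo=1.0):
    """Log-odds of each k-mer in the positive set versus the background.

    A pseudocount keeps k-mers seen in only one set from producing log(0).
    """
    pos = kmer_counts(positives, k)
    bg = kmer_counts(background, k)
    vocab = set(pos) | set(bg)
    pos_norm = sum(pos.values()) + pseudo * len(vocab)
    bg_norm = sum(bg.values()) + pseudo * len(vocab)
    return {
        kmer: log((pos[kmer] + pseudo) / pos_norm)
        - log((bg[kmer] + pseudo) / bg_norm)
        for kmer in vocab
    }


def score(seq, table, k):
    """Mean log-odds over the k-mers of seq; unseen k-mers contribute 0."""
    seq = seq.upper()
    n = len(seq) - k + 1
    if n <= 0:
        return 0.0
    return sum(table.get(seq[i:i + k], 0.0) for i in range(n)) / n


def rank_candidates(candidates, table, k):
    """Return candidate names sorted from highest to lowest score."""
    return sorted(candidates,
                  key=lambda name: score(candidates[name], table, k),
                  reverse=True)


# Toy, made-up sequences (not real enhancers): the positives share a
# TAATTA-like signature, the background does not.
K = 4
POSITIVES = ["TAATTAGCTAATTA", "GGTAATTACC", "CTAATTAGG"]
BACKGROUND = ["GCGCGCGCATGC", "ACGTACGTACGT", "GGCCGGCCGGCC"]
CANDIDATES = {"cand_a": "ACGTAATTAGCA", "cand_b": "GCGCGGCCGCGC"}

TABLE = log_odds_table(POSITIVES, BACKGROUND, K)
RANKING = rank_candidates(CANDIDATES, TABLE, K)
print(RANKING)
```

In this toy setting the candidate carrying the shared signature ranks first. A real analysis would need a much larger validated training set, an appropriate background model, and held-out testing, as the transgenic validation of the top predictions provides here.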



  1. Bejerano, G. et al. Ultraconserved elements in the human genome. Science 304, 1321–1325 (2004)

  2. Roeder, R. G. & Rutter, W. J. Multiple forms of DNA-dependent RNA polymerase in eukaryotic organisms. Nature 224, 234–237 (1969)

  3. Goldberg, M. L. Sequence Analysis of Drosophila Histone Genes. Ph.D. thesis, Stanford Univ. (1979)

  4. Stathopoulos, A. & Levine, M. Genomic regulatory networks and animal development. Dev. Cell 9, 449–462 (2005)

  5. Levine, M. & Tjian, R. Transcription regulation and animal diversity. Nature 424, 147–151 (2003)

  6. Emison, E. S. et al. A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk. Nature 434, 857–863 (2005)

  7. Kleinjan, D. A. & van Heyningen, V. Long-range control of gene expression: emerging mechanisms and disruption in disease. Am. J. Hum. Genet. 76, 8–32 (2005)

  8. Lettice, L. A. et al. A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly. Hum. Mol. Genet. 12, 1725–1735 (2003)

  9. Boffelli, D. et al. Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science 299, 1391–1394 (2003)

  10. Nobrega, M. A., Ovcharenko, I., Afzal, V. & Rubin, E. M. Scanning human gene deserts for long-range enhancers. Science 302, 413 (2003)

  11. Prabhakar, S. et al. Close sequence comparisons are sufficient to identify human cis-regulatory elements. Genome Res. 16, 855–863 (2006)

  12. Woolfe, A. et al. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol. 3, e7 (2005)

  13. Kothary, R. et al. Inducible expression of an hsp68-lacZ hybrid gene in transgenic mice. Development 105, 707–714 (1989)

  14. Rojas, A. et al. Gata4 expression in lateral mesoderm is downstream of BMP4 and is activated directly by Forkhead and GATA transcription factors through a distal enhancer element. Development 132, 3405–3417 (2005)

  15. Rossant, J., Zirngibl, R., Cado, D., Shago, M. & Giguere, V. Expression of a retinoic acid response element-hsplacZ transgene defines specific domains of transcriptional activity during mouse embryogenesis. Genes Dev. 5, 1333–1344 (1991)

  16. Yamagishi, H. et al. Tbx1 is regulated by tissue-specific forkhead proteins through a common Sonic hedgehog-responsive enhancer. Genes Dev. 17, 269–281 (2003)

  17. Boffelli, D., Nobrega, M. A. & Rubin, E. M. Comparative genomics at the vertebrate extremes. Nature Rev. Genet. 5, 456–465 (2004)

  18. Ahituv, N., Prabhakar, S., Poulin, F., Rubin, E. M. & Couronne, O. Mapping cis-regulatory domains in the human genome using multi-species conservation of synteny. Hum. Mol. Genet. 14, 3057–3063 (2005)

  19. Kohlhase, J., Wischermann, A., Reichenbach, H., Froster, U. & Engel, W. Mutations in the SALL1 putative transcription factor gene cause Townes-Brocks syndrome. Nature Genet. 18, 81–83 (1998)

  20. Buck, A., Kispert, A. & Kohlhase, J. Embryonic expression of the murine homologue of SALL1, the gene mutated in Townes–Brocks syndrome. Mech. Dev. 104, 143–146 (2001)

  21. Carroll, S. B. Evolution at two levels: on genes and form. PLoS Biol. 3, e245 (2005)

  22. Davidson, E. H. Genomic Regulatory Systems: In Development and Evolution (Academic, San Diego, 2001)

  23. Lee, T. I. et al. Control of developmental regulators by Polycomb in human embryonic stem cells. Cell 125, 301–313 (2006)

  24. Bard, J. L. et al. An internet-accessible database of mouse developmental anatomy based on a systematic nomenclature. Mech. Dev. 74, 111–120 (1998)

  25. Gray, P. A. et al. Mouse brain organization revealed through direct genome-scale TF expression analysis. Science 306, 2255–2257 (2004)

  26. Poulin, F. et al. In vivo characterization of a vertebrate ultraconserved enhancer. Genomics 85, 774–781 (2005)

  27. van Helden, J., Andre, B. & Collado-Vides, J. Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies. J. Mol. Biol. 281, 827–842 (1998)

  28. Kurokawa, D. et al. Regulation of Otx2 expression and its functions in mouse forebrain and midbrain. Development 131, 3319–3331 (2004)

  29. Zhou, J., Zwicker, J., Szymanski, P., Levine, M. & Tjian, R. TAFII mutations disrupt Dorsal activation in the Drosophila embryo. Proc. Natl Acad. Sci. USA 95, 13483–13488 (1998)



Research was conducted at the E. O. Lawrence Berkeley National Laboratory, under the Programs for Genomic Application, funded by the National Heart, Lung, and Blood Institute, USA as well as the National Human Genome Research Institute, USA, and performed under a Department of Energy Contract with the University of California.

Author information

Competing interests

Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests.

Correspondence to Len A. Pennacchio.

Supplementary information

  1. Supplementary Table 1.

    A summary of all the human conserved noncoding fragments tested for enhancer activity at embryonic day 11.5. Enhancer ID refers to a unique identifier defined at http://enhancer.lbl.gov. (XLS 37 kb)

  2. Supplementary Table 2.

    A compilation of human–Fugu conserved noncoding elements in the human genome. (XLS 208 kb)

  3. Supplementary Table 3.

    The top 30 forebrain enhancer predictions in the human genome. The strategy to generate this list can be found in the Supplementary Methods. (XLS 18 kb)

  4. Supplementary Methods.

    An expanded version of the Materials and Methods. (DOC 61 kb)


Figure 1: A summary of all sequences tested for enhancer activity in transgenic mice.
Figure 2: A 3 Mb region of human chromosome 16 enriched for human–Fugu non-coding conservation flanking the SALL1 gene.
Figure 3: Grouping of positive expression patterns captured in the transgenic mouse enhancer assay.
Figure 4: Application of a forebrain enhancer training set to identify forebrain-specific enhancer sequences elsewhere in the human genome.

