Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model

Abstract

Despite continual progress in the cataloging of vertebrate regulatory elements, little is known about their organization and regulatory architecture. Here we describe a massively parallel experiment to systematically test the impact of copy number, spacing, combination and order of transcription factor binding sites on gene expression. A complex library of 5,000 synthetic regulatory elements containing patterns from 12 liver-specific transcription factor binding sites was assayed in mice and in HepG2 cells. We find that certain transcription factors act as direct drivers of gene expression in homotypic clusters of binding sites, independent of spacing between sites, whereas others function only synergistically. Heterotypic enhancers are stronger than their homotypic analogs and favor specific transcription factor binding site combinations, mimicking putative native enhancers. Exhaustive testing of binding site permutations suggests that there is flexibility in binding site order. Our findings provide quantitative support for a flexible model of regulatory element activity and suggest a framework for the design of synthetic tissue-specific enhancers.
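The four design variables named in the abstract (copy number, spacing, combination and order) can be illustrated with a small sketch. This is not the authors' library design code. The site labels (`"A"`, `"B"`, `"C"`) and the helper names `homotypic`, `order_variants` and `classify` are hypothetical placeholders, since the 12 transcription factors and their sequences are not listed in this excerpt.

```python
from itertools import permutations

def homotypic(site, copies):
    """Return an element made of `copies` repeats of one binding site label."""
    return [site] * copies

def order_variants(sites):
    """Return every distinct ordering of the same complement of site labels."""
    return sorted(set(permutations(sites)))

def classify(element):
    """Label an element by whether it uses one site type or several."""
    return "homotypic" if len(set(element)) == 1 else "heterotypic"

# Hypothetical placeholder labels, not the factors used in the study.
four_copies = homotypic("A", 4)
three_site_orders = order_variants(["A", "B", "C"])
```

Here `classify(four_copies)` gives `"homotypic"`, while each tuple in `three_site_orders` is a heterotypic element that shares one complement of sites and differs only in order, which is the comparison the order experiment relies on.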

Figure 1: Synthetic enhancer sequence design and controls.
Figure 2: Homotypic amplification of expression is compatible with a subset of transcription factor binding sites, independent of their spacing.
Figure 3: Heterotypic elements drive stronger expression than homotypic ones.
Figure 4: Combinatorial effects in heterotypic clusters of two different transcription factor binding sites.
Figure 5: Synthetic enhancers mimic mouse liver enhancers.
Figure 6: Effects of binding site order in heterotypic enhancers.

Accession codes

Primary accessions

Sequence Read Archive

Acknowledgements

This work was supported by National Human Genome Research Institute (NHGRI) grant 1R01HG006768 (N.A. and J.S.) and the Pilot/Feasibility grant from the University of California, San Francisco (UCSF) Liver Center (P30 DK026743). N.A. is also supported by National Institute of Child Health and Human Development grant R01HD059862, NHGRI grant R01HG005058, National Institute of General Medical Sciences grant GM61390, National Institute of Neurological Disorders and Stroke grant 1R01NS079231, National Institute of Diabetes and Digestive and Kidney Diseases grant 1R01DK090382 and the Simons Foundation (SFARI 256769). R.P.S. was supported in part by a Canadian Institutes of Health Research (CIHR) fellowship in hepatology. M.J.K. was supported in part by US National Institutes of Health training grant T32 GM007175, the UCSF Quantitative Biosciences Consortium fellowship for Interdisciplinary Research and the Amgen Research Excellence in Bioengineering and Therapeutic Sciences fellowship. This work was funded in part by the Intramural Research Program of the US National Institutes of Health, National Library of Medicine (I.O.).

Author information

Contributions

R.P.S., L.T., R.P.P., J.S., I.O. and N.A. conceived key aspects of the project and planned experiments. R.P.S., R.P.P., M.J.K. and F.I. performed experiments. L.T., R.P.S. and R.P.P. analyzed data. R.P.S., L.T., R.P.P., I.O., J.S. and N.A. wrote the manuscript. All authors commented on and revised the manuscript.

Corresponding authors

Correspondence to Jay Shendure, Ivan Ovcharenko or Nadav Ahituv.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–10, Supplementary Tables 1–3 and 6, and Supplementary Note (PDF 2631 kb)

Supplementary Table 4

Complete listing of 4,966 SRESs (no controls) tested in the study (XLSX 506 kb)

Supplementary Table 5

Summary of expression for the best and worst permutations from 211 sets of SRESs containing the same complement of transcription factor binding sites in different orders (XLSX 96 kb)
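A summary like the one described for Supplementary Table 5 could be built roughly as follows. This is a minimal sketch under stated assumptions, not the authors' analysis: the input layout (an ordered tuple of site labels paired with an expression value), the function name `best_and_worst`, and the toy numbers are all hypothetical.

```python
from collections import defaultdict

def best_and_worst(records):
    """Group elements by their unordered complement of sites and report
    the highest- and lowest-expressing ordering in each group."""
    groups = defaultdict(list)
    for order, expression in records:
        # Elements with the same sites in a different order share a key.
        groups[tuple(sorted(order))].append((order, expression))

    summary = {}
    for complement, members in groups.items():
        if len(members) < 2:
            continue  # a single ordering has nothing to compare against
        best = max(members, key=lambda m: m[1])
        worst = min(members, key=lambda m: m[1])
        summary[complement] = {
            "best": best[0],
            "worst": worst[0],
            "spread": best[1] - worst[1],
        }
    return summary

# Toy, made-up measurements for illustration only.
toy_records = [
    (("A", "B", "C"), 2.0),
    (("C", "B", "A"), 0.5),
    (("A", "A"), 1.0),
]
toy_summary = best_and_worst(toy_records)
```

A small `spread` across many complements would be in line with the flexibility in binding site order reported in the abstract, whereas a large one would point to order dependence for that set of sites.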

About this article

Cite this article

Smith, R., Taher, L., Patwardhan, R. et al. Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model. Nat Genet 45, 1021–1028 (2013). https://doi.org/10.1038/ng.2713

  • DOI: https://doi.org/10.1038/ng.2713
