Abstract
Despite continual progress in the cataloging of vertebrate regulatory elements, little is known about their organization and regulatory architecture. Here we describe a massively parallel experiment to systematically test the impact of copy number, spacing, combination and order of transcription factor binding sites on gene expression. A complex library of ∼5,000 synthetic regulatory elements containing patterns from 12 liver-specific transcription factor binding sites was assayed in mice and in HepG2 cells. We find that certain transcription factors act as direct drivers of gene expression in homotypic clusters of binding sites, independent of spacing between sites, whereas others function only synergistically. Heterotypic enhancers are stronger than their homotypic analogs and favor specific transcription factor binding site combinations, mimicking putative native enhancers. Exhaustive testing of binding site permutations suggests that there is flexibility in binding site order. Our findings provide quantitative support for a flexible model of regulatory element activity and suggest a framework for the design of synthetic tissue-specific enhancers.
Acknowledgements
This work was supported by National Human Genome Research Institute (NHGRI) grant 1R01HG006768 (N.A. and J.S.) and the Pilot/Feasibility grant from the University of California, San Francisco (UCSF) Liver Center (P30 DK026743). N.A. is also supported by National Institute of Child Health and Human Development grant R01HD059862, NHGRI grant R01HG005058, National Institute of General Medical Sciences grant GM61390, National Institute of Neurological Disorders and Stroke grant 1R01NS079231, National Institute of Diabetes and Digestive and Kidney Diseases grant 1R01DK090382 and the Simons Foundation (SFARI 256769). R.P.S. was supported in part by a Canadian Institutes of Health Research (CIHR) fellowship in hepatology. M.J.K. was supported in part by US National Institutes of Health training grant T32 GM007175, the UCSF Quantitative Biosciences Consortium fellowship for Interdisciplinary Research and the Amgen Research Excellence in Bioengineering and Therapeutic Sciences fellowship. This work was funded in part by the Intramural Research Program of the US National Institutes of Health, National Library of Medicine (I.O.).
Author information
Contributions
R.P.S., L.T., R.P.P., J.S., I.O. and N.A. conceived key aspects of the project and planned experiments. R.P.S., R.P.P., M.J.K. and F.I. performed experiments. L.T., R.P.S. and R.P.P. analyzed data. R.P.S., L.T., R.P.P., I.O., J.S. and N.A. wrote the manuscript. All authors commented on and revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–10, Supplementary Tables 1–3 and 6, and Supplementary Note (PDF 2631 kb)
Supplementary Table 4
Complete listing of 4,966 SRESs (no controls) tested in the study (XLSX 506 kb)
Supplementary Table 5
Summary of expression for the best and worst permutations from 211 sets of SRESs containing the same complement of transcription factor binding sites in different orders (XLSX 96 kb)
About this article
Cite this article
Smith, R., Taher, L., Patwardhan, R. et al. Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model. Nat Genet 45, 1021–1028 (2013). https://doi.org/10.1038/ng.2713
DOI: https://doi.org/10.1038/ng.2713