Despite continual progress in the cataloging of vertebrate regulatory elements, little is known about their organization and regulatory architecture. Here we describe a massively parallel experiment to systematically test the impact of copy number, spacing, combination and order of transcription factor binding sites on gene expression. A complex library of ~5,000 synthetic regulatory elements containing patterns from 12 liver-specific transcription factor binding sites was assayed in mice and in HepG2 cells. We find that certain transcription factors act as direct drivers of gene expression in homotypic clusters of binding sites, independent of spacing between sites, whereas others function only synergistically. Heterotypic enhancers are stronger than their homotypic analogs and favor specific transcription factor binding site combinations, mimicking putative native enhancers. Exhaustive testing of binding site permutations suggests that there is flexibility in binding site order. Our findings provide quantitative support for a flexible model of regulatory element activity and suggest a framework for the design of synthetic tissue-specific enhancers.
- Supplementary Text and Figures: Supplementary Figures 1–10, Supplementary Tables 1–3 and 6, and Supplementary Note
- Supplementary Table 4: Complete listing of 4,966 SRESs (no controls) tested in the study
- Supplementary Table 5: Summary of expression for the best and worst permutations from 211 sets of SRESs containing the same complement of transcription factor binding sites in different orders