Abstract
Phenotypic differences between closely related species are thought to arise primarily from changes in gene expression due to mutations in cis-regulatory sequences (enhancers). However, it has remained unclear how frequently mutations alter enhancer activity or create functional enhancers de novo. Here we use STARR-seq, a recently developed quantitative enhancer assay, to determine genome-wide enhancer activity profiles for five Drosophila species in the constant trans-regulatory environment of Drosophila melanogaster S2 cells. We find that the functions of a large fraction of D. melanogaster enhancers are conserved for their orthologous sequences owing to selection and stabilizing turnover of transcription factor motifs. Moreover, hundreds of enhancers have been gained since the D. melanogaster–Drosophila yakuba split about 11 million years ago without apparent adaptive selection and can contribute to changes in gene expression in vivo. Our finding that enhancer activity is often deeply conserved and frequently gained provides functional insights into regulatory evolution.
References
Carroll, S.B. Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell 134, 25–36 (2008).
King, M.C. & Wilson, A.C. Evolution at two levels in humans and chimpanzees. Science 188, 107–116 (1975).
Wittkopp, P.J. & Kalay, G. Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nat. Rev. Genet. 13, 59–69 (2012).
Wray, G.A. The evolutionary significance of cis-regulatory mutations. Nat. Rev. Genet. 8, 206–216 (2007).
Gompel, N., Prud'homme, B., Wittkopp, P.J., Kassner, V.A. & Carroll, S.B. Chance caught on the wing: cis-regulatory evolution and the origin of pigment patterns in Drosophila. Nature 433, 481–487 (2005).
Prud'homme, B. et al. Repeated morphological evolution through cis-regulatory changes in a pleiotropic gene. Nature 440, 1050–1053 (2006).
Frankel, N. et al. Morphological evolution caused by many subtle-effect substitutions in regulatory DNA. Nature 474, 598–603 (2011).
Prabhakar, S. et al. Human-specific gain of function in a developmental enhancer. Science 321, 1346–1350 (2008).
Borneman, A.R. et al. Divergence of transcription factor binding sites across related yeast species. Science 317, 815–819 (2007).
He, Q. et al. High conservation of transcription factor binding and evidence for combinatorial regulation across six Drosophila species. Nat. Genet. 43, 414–420 (2011).
Odom, D.T. et al. Tissue-specific transcriptional regulation has diverged significantly between human and mouse. Nat. Genet. 39, 730–732 (2007).
Schmidt, D. et al. Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science 328, 1036–1040 (2010).
Arnold, C.D. et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science 339, 1074–1077 (2013).
Stark, A. et al. Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature 450, 219–232 (2007).
Drosophila 12 Genomes Consortium. Evolution of genes and genomes on the Drosophila phylogeny. Nature 450, 203–218 (2007).
Arendt, D. The evolution of cell types in animals: emerging principles from molecular studies. Nat. Rev. Genet. 9, 868–882 (2008).
Wittkopp, P.J., Haerum, B.K. & Clark, A.G. Evolutionary changes in cis and trans gene regulation. Nature 430, 85–88 (2004).
Wilson, M.D. et al. Species-specific transcription in mice carrying human chromosome 21. Science 322, 434–438 (2008).
Hare, E.E., Peterson, B.K., Iyer, V.N., Meier, R. & Eisen, M.B. Sepsid even-skipped enhancers are functionally conserved in Drosophila despite lack of sequence conservation. PLoS Genet. 4, e1000106 (2008).
Fisher, S., Grice, E.A., Vinton, R.M., Bessling, S.L. & McCallion, A.S. Conservation of RET regulatory function from human to zebrafish without sequence similarity. Science 312, 276–279 (2006).
Ludwig, M.Z., Bergman, C., Patel, N.H. & Kreitman, M. Evidence for stabilizing selection in a eukaryotic enhancer element. Nature 403, 564–567 (2000).
Bardet, A.F., He, Q., Zeitlinger, J. & Stark, A. A computational pipeline for comparative ChIP-seq analyses. Nat. Protoc. 7, 45–61 (2012).
Ho, M.C.W. et al. Functional evolution of cis-regulatory modules at a homeotic gene in Drosophila. PLoS Genet. 5, e1000709 (2009).
Ludwig, M.Z. et al. Functional evolution of a cis-regulatory module. PLoS Biol. 3, e93 (2005).
Kalay, G. & Wittkopp, P.J. Nomadic enhancers: tissue-specific cis-regulatory elements of yellow have divergent genomic positions among Drosophila species. PLoS Genet. 6, e1001222 (2010).
Rämet, M., Manfruelli, P., Pearson, A., Mathey-Prevot, B. & Ezekowitz, R.A.B. Functional genomic analysis of phagocytosis and identification of a Drosophila receptor for E. coli. Nature 416, 644–648 (2002).
Moses, A.M. et al. Large-scale turnover of functional transcription factor binding sites in Drosophila. PLoS Comput. Biol. 2, e130 (2006).
Swanson, C.I., Schwimmer, D.B. & Barolo, S. Rapid evolutionary rewiring of a structurally constrained eye enhancer. Curr. Biol. 21, 1186–1196 (2011).
Crocker, J., Tamori, Y. & Erives, A. Evolution acts on enhancer organization to fine-tune gradient threshold readouts. PLoS Biol. 6, e263 (2008).
Saito, K. et al. A regulatory circuit for piwi by the large Maf gene traffic jam in Drosophila. Nature 461, 1296–1299 (2009).
Matts, J.A., Sytnikova, Y., Chirn, G.-W., Igloi, G.L. & Lau, N.C. Small RNA library construction from minute biological samples. Methods Mol. Biol. 1093, 123–136 (2014).
Paris, M. et al. Extensive divergence of transcription factor binding in Drosophila embryos with highly conserved gene expression. PLoS Genet. 9, e1003748 (2013).
Lawrie, D.S., Messer, P.W., Hershberg, R. & Petrov, D.A. Strong purifying selection at synonymous sites in D. melanogaster. PLoS Genet. 9, e1003527 (2013).
White, M.A., Myers, C.A., Corbo, J.C. & Cohen, B.A. Massively parallel in vivo enhancer assay reveals that highly local features determine the cis-regulatory function of ChIP-seq peaks. Proc. Natl. Acad. Sci. USA 110, 11952–11957 (2013).
Gibson, D.G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345 (2009).
Pennington, K.L., Marr, S.K., Chirn, G.-W. & Marr, M.T. Holo-TFIID controls the magnitude of a transcription burst and fine-tuning of transcription. Proc. Natl. Acad. Sci. USA 110, 7678–7683 (2013).
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Hinrichs, A.S. et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 34, D590–D598 (2006).
Quinlan, A.R. & Hall, I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
McQuilton, P., St Pierre, S.E. & Thurmond, J. FlyBase Consortium. FlyBase 101–the basics of navigating FlyBase. Nucleic Acids Res. 40, D706–D714 (2012).
Meyer, L.R. et al. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res. 41, D64–D69 (2013).
Down, T.A., Bergman, C.M., Su, J. & Hubbard, T.J.P. Large-scale discovery of promoter motifs in Drosophila melanogaster. PLoS Comput. Biol. 3, e7 (2007).
Bryne, J.C. et al. JASPAR, the open access database of transcription factor–binding profiles: new content and tools in the 2008 update. Nucleic Acids Res. 36, D102–D106 (2008).
Bailey, T.L. & Gribskov, M. Combining evidence using p-values: application to sequence homology searches. Bioinformatics 14, 48–54 (1998).
Yáñez-Cuna, J.O., Dinh, H.Q., Kvon, E.Z., Shlyueva, D. & Stark, A. Uncovering cis-regulatory sequence requirements for context-specific transcription factor binding. Genome Res. 22, 2018–2030 (2012).
R Development Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, 2010).
Acknowledgements
We thank the Drosophila Species Stock Center (DSSC) at the University of California, San Diego for genomic DNA, I. Sinitsyn and Ł.M. Boryń for help with library cloning, M. Rath, K. Schernhuber and O. Frank for help with experiments, J.O. Yáñez-Cuna and M.A. Zabidi for help with data analysis, and S. Westermann, L. Cochella, I. Crisostomo, J. Brennecke, J. Knoblich, U. Technau, K. Tessmar-Raible and F. Raible for comments on the manuscript. Deep sequencing was performed at the CSF (Campus Science Support Facilities) Next-Generation Sequencing Unit and the Brandeis Genomics Core Facility (laboratory of M. Rosbash). C.D.A. is supported by a European Research Council (ERC) Starting Grant (242922) awarded to A.S. This work was partly supported by the Austrian Science Fund (FWF), F4303-B09. The work of N.C.L. was supported by the Searle Scholars foundation and the US National Institutes of Health (HD057298). Basic research at IMP is supported by Boehringer Ingelheim.
Author information
Contributions
C.D.A. and A.S. conceived the project. C.D.A., D.S. and M.P. performed the experiments. J.A.M., Y.A.S. and N.C.L. generated the follicle cell RNA-seq data. D.G. and A.S. conducted all computational experiments and bioinformatics analyses. C.D.A., D.G. and A.S. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 High quantitative and qualitative reproducibility of STARR-seq in D. melanogaster S2 cells.
(a) The quantitative reproducibility of STARR-seq in two independent biological replicates was assessed at enhancer peak summits (D. melanogaster (D.mel), 2,325 peaks; D. yakuba (D.yak), 2,293 peaks; D. ananassae (D.ana), 2,096 peaks; D. pseudoobscura (D.pse), 3,469 peaks; D. willistoni (D.wil), 2,860 peaks) and for 100,000 positions randomly sampled from the genome (common D. melanogaster coordinates for all species). Each data point represents the fragment density for both replicates normalized to 1 million mapped fragments (FPM). The Pearson correlation coefficient (PCC) and the coefficient of determination (R²) for the linear fit (plus the regression line) are indicated in each subplot. The open black circle shows median values of coverage for replicate 1 versus replicate 2. (b) Qualitative reproducibility of STARR-seq measuring the consistency of enhancer calls between enhancers called in replicate 1 evaluated with enrichment data from replicate 2 equivalently to the assessment of conservation (bar height, relaxed settings with P ≤ 0.05; white line, P ≤ 0.001). The second bar for each species evaluates replicate 2 against replicate 1. (c,d) The same data are shown as for a and b, but using peak calls and fragment densities in the respective, original Drosophila genomes before coordinate translation.
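The replicate comparison in panel a can be illustrated with a minimal sketch. It normalizes raw fragment counts to FPM and computes the PCC between two replicates; for a simple linear fit with an intercept, the coefficient of determination equals the squared PCC. The function names, counts and library sizes below are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
import math

def fpm(count, total_mapped):
    """Normalize a raw fragment count to fragments per million mapped fragments."""
    return count * 1_000_000 / total_mapped

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical fragment counts at four peak summits (not data from the paper).
rep1_counts = [120, 45, 300, 80]
rep2_counts = [110, 50, 290, 95]

# Hypothetical library sizes (total mapped fragments per replicate).
rep1 = [fpm(c, 2_000_000) for c in rep1_counts]
rep2 = [fpm(c, 1_800_000) for c in rep2_counts]

pcc = pearson(rep1, rep2)
# For a simple linear regression with intercept, R² is the square of the PCC.
r_squared = pcc ** 2
```

Normalizing each replicate by its own library size before correlating keeps differences in sequencing depth from being mistaken for differences in enhancer activity.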
Supplementary Figure 2 Similar genomic distribution of STARR-seq enhancers for all five Drosophila species in D. melanogaster S2 cells.
The pie charts show the absolute genomic distribution of enhancers across different functional regions, and the bar charts show enrichment or depletion relative to overall region sizes in the genome (a, D. melanogaster; b, D. yakuba; c, D. ananassae; d, D. pseudoobscura; e, D. willistoni). Globally, the majority of identified enhancers were located within introns (53.2–59.4%) and in intergenic regions (18.7–26.3%), as described in ref. 1. Overall, the location of enhancers with respect to different genomic regions is comparable among the five Drosophila species.
Supplementary Figure 3 Functional conservation of open and closed D. melanogaster S2 cell enhancers in D. melanogaster S2 cells.
D. melanogaster S2 cell enhancers were classified as open or closed depending on their accessibility in DNase I hypersensitivity (DHS) sequencing assays as described previously (ref. 1). The functional conservation rates of (a) 1,554 open and (b) 771 closed D. melanogaster enhancers in the 4 other Drosophila species are shown (see Figure 1c for details of the conservation rate analysis). The conservation rate of open enhancers is roughly twice as high as for closed enhancers, whereas both show similar reproducibility in independent replicates (D. melanogaster bars, marked by an asterisk).
Supplementary Figure 4 Positional and compensatory conservation of enhancers.
The number of positionally conserved D. melanogaster enhancers declines with evolutionary distance. The drop in positionally conserved enhancers, however, is balanced by an increasing number of compensatory enhancers within the same respective gene loci, leading to a similar number of enhancers per gene locus. This might stabilize gene expression levels or confer regulatory robustness (refs. 2–4). Note that this plot shows data from analysis of the two biological replicates combined. The white numbers inside the bars indicate the fraction of enhancers of each category as a percentage of the total number of D. melanogaster enhancers (1,552) assigned to 1,201 gene loci.
Supplementary Figure 5 Motif conservation by positional sequence constraints.
(a) Pairwise sequence identity for functionally conserved (colored) and non-conserved (gray) enhancers along the entire 500-bp enhancer sequence or (c) for a 100-bp core enhancer sequence (boxes depict the median and the interquartile range, and whiskers depict the 10th and 90th percentiles). (b,d) Sequence identity as in a and c, respectively, but restricted to positions that overlap with motifs of the transcription factor Serpent (Srp). Note: a and b show the same data as Figure 3a,b to allow for a comparison with c and d, demonstrating that the results are robust with respect to the lengths of the analyzed regions (for b–d, n = 214, 338, 413, 196, 361, 216, 366 and 174, respectively). (e) Sequence identity as in b, but for the motif of the transcription factor Buttonhead (Btd), which is not expressed in S2 cells. The largely overlapping sequence identities of the Btd motifs in conserved and non-conserved enhancers indicate that the motifs for Btd in conserved enhancers in S2 cells are not under constraint. (f) Position weight matrix logos for the Srp and Btd motifs.
Supplementary Figure 6 Positional and compensatory conservation of TF motifs in functionally conserved and non-conserved enhancers.
(a) Rate of positionally conserved and compensatory D. melanogaster Srp motifs in relation to all four other Drosophila species in functionally conserved enhancer regions. The total number of D. melanogaster Srp motifs for each comparison is shown above the bars (black), and percent conservation values are shown within the bars (white). (b) Plot as in a, but in functionally non-conserved enhancer regions. (c,d) Plots as in a and b, but limiting the evaluation to a subset of well-aligned enhancer regions that have no undefined nucleotides (Ns) in the pairwise alignments and have non-gapped orthologous ends. Together this shows that motif turnover is common and that the loss of positionally conserved Srp motifs can be compensated by the gain of Srp motifs at different positions within the same enhancer. Further, Srp motifs are conserved at much higher levels in conserved enhancers compared to non-conserved enhancers, suggesting that they are important for S2 cell enhancer function (Supplementary Fig. 5). When assessing well-aligned sequences (c), motif turnover maintains the number of Srp motifs at high levels of around 80%, even over large evolutionary distances. (e,f) Relative contribution of compensatory motif turnover increases with evolutionary distance. (e) Fraction of functionally conserved enhancers with the same number of Srp motifs between species with positionally conserved motifs (blue), motifs conserved within an individual enhancer but not at the same position (compensatory; red) or a mix of positionally conserved and compensatory conserved Srp motifs (yellow). The total number of enhancers for each comparison is shown above each bar; percentages per category are plotted in white within the bars. (f) Plot as in e, but considering only a subset of enhancers that are well aligned (c,d). For the vast majority of conserved enhancers, the motifs are exclusively positionally conserved in closely related species such as D. yakuba in e. However, the fraction of enhancers with compensatory motifs increases significantly at larger evolutionary distances.
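The three enhancer categories of panels e and f (positional, compensatory, mixed) can be sketched as a small classifier. This is a simplified illustration under stated assumptions: motif positions are given in shared alignment coordinates, a motif counts as positionally conserved only when the positions match exactly, and the function name and inputs are hypothetical rather than taken from the authors' analysis.

```python
def classify_enhancer(mel_positions, other_positions):
    """Classify Srp motif conservation for one enhancer pair.

    Positions are assumed to be alignment coordinates shared by both species.
    Returns None when the enhancer has no motifs or when the two species
    differ in motif number, because the panels only consider enhancers with
    the same number of motifs.
    """
    mel = set(mel_positions)
    other = set(other_positions)
    if not mel or len(mel) != len(other):
        return None
    shared = mel & other  # motifs found at the same aligned position
    if shared == mel:
        return "positional"
    if not shared:
        return "compensatory"
    return "mixed"

# Hypothetical motif positions within an aligned 500-bp enhancer.
examples = {
    "all in place": classify_enhancer([10, 55], [10, 55]),
    "all moved": classify_enhancer([10, 55], [30, 80]),
    "one moved": classify_enhancer([10, 55], [10, 80]),
}
```

A real analysis would likely tolerate small positional offsets and alignment gaps; exact matching is used here only to keep the sketch short.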
Supplementary Figure 7 Phylogeny of enhancer gain and loss events.
(a) Enhancer occurrences at non-redundant positions across species (binary representation in which blue boxes indicate enhancer presence/function). (b) Gain (blue triangles) and loss (red triangles) events assigned by parsimony to different branches of the phylogenetic tree. We assigned a gain event if two or more loss events would otherwise have to be assumed, but indicate the gain-versus-loss ratios below the trees (e.g., 1:4 = one gain or four losses; unclear events are shown in gray). Overall, the phylogeny of all 8,180 non-redundant enhancers identified in the genomes of the 5 species is shown on 31 (= 2⁵ – 1) different trees.
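The count of 31 trees follows from the binary representation in panel a: each non-redundant enhancer is present or absent in each of the five species, and the all-absent pattern is excluded. A minimal sketch that enumerates these patterns (the species abbreviations follow Supplementary Figure 1; the variable names are illustrative):

```python
from itertools import product

species = ["D.mel", "D.yak", "D.ana", "D.pse", "D.wil"]

# Every presence (1) / absence (0) combination across the five species,
# excluding the pattern in which the enhancer is active in no species.
patterns = [p for p in product((0, 1), repeat=len(species)) if any(p)]

# Example: enhancers active only in D. melanogaster.
mel_specific = tuple(1 if s == "D.mel" else 0 for s in species)

n_patterns = len(patterns)  # 2**5 - 1
```

Each pattern corresponds to one tree in panel b, on which gain and loss events are then placed by parsimony.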
Supplementary Figure 8 Sequence changes of D. melanogaster and D. yakuba gained enhancers are similar to expected neutral substitutions between the two species.
(a) Same data as Figure 4e (boxes depict the median and the interquartile range, and whiskers depict the 10th and 90th percentiles; outliers are shown individually). (b) As in a, but for 100-bp core enhancer sequences. This shows that the patterns of sequence conservation in gained and lost enhancers are consistent between 500-bp enhancer sequences and shorter regions of 100 bp centered on the enhancer peak summit.
Supplementary Figure 9 Newly gained enhancers in D. melanogaster are associated with expressed genes.
UCSC Genome Browser screenshots of expressed genes in S2 cells that are exclusively associated with a newly gained (D. melanogaster–specific) enhancer (inputs in gray; y-axis labels depict normalized fragment counts). (a) Sxn6 (RPKM 98.4). (b) roq (RPKM 25.9). (c) Hira (RPKM 12.3). RNA-seq data in D. melanogaster S2 cells are from ref. 1.
Supplementary Figure 10 High quantitative and qualitative reproducibility of STARR-seq in D. melanogaster ovarian somatic cells (OSCs).
(a) The quantitative reproducibility of STARR-seq in OSCs in two independent biological replicates was assessed at combined enhancer peak summits (D. melanogaster, 3,342 peaks; D. yakuba, 3,233 peaks; D. ananassae, 2,859 peaks) as in Supplementary Figure 1a. (b) Same data as in a, but using peak calls and fragment densities in the respective Drosophila genome coordinates before coordinate translation. (c) Qualitative reproducibility of STARR-seq in OSCs as in Supplementary Figure 1b. (d) Same as c, but using peak calls and fragment densities in the respective Drosophila genome coordinates before coordinate translation.
Supplementary Figure 11 Genomic distribution, functional conservation and sequence changes of STARR-seq OSC enhancers.
(a–c) Similar genomic distribution of STARR-seq enhancers for three Drosophila species in D. melanogaster OSCs. Genomic distribution analysis for (a) D. melanogaster, (b) D. yakuba and (c) D. ananassae enhancers in OSCs as in Supplementary Figure 2. Globally, the majority of identified enhancers were located in introns (44.1–52.8%) and in intergenic regions (23.7–26.5%; see ref. 1). (d,e) Functional conservation of open and closed D. melanogaster OSC enhancers in D. melanogaster OSCs. D. melanogaster OSC enhancers were classified as open and closed as described previously (ref. 1). The conservation rates of (d) 2,269 open and (e) 1,073 closed D. melanogaster OSC enhancers in D. yakuba and D. ananassae are shown (see Figs. 1c and 6a for details). (f,g) The numbers of sequence changes in D. melanogaster and D. yakuba gained, lost or deeply conserved enhancers are similar for S2 cells and OSCs. (f) Analysis as in Figure 4e, but based on three species only (D. melanogaster, D. yakuba, D. ananassae) to allow a direct comparison between S2 cells and OSCs (boxes depict the median and the interquartile range, and whiskers depict the 10th and 90th percentiles). (g) As in f, but for OSC enhancers. The numbers of sequence changes for the different enhancer categories are highly similar between f and g, confirming the results shown in Figure 4e and suggesting that the reported numbers hold more generally, independent of the respective cell types.
Supplementary Figure 12 Changes in OSC enhancer activities and follicle cell in vivo gene expression between D. melanogaster and D. yakuba correlate globally.
Same data and heat-map presentation as in Figure 6d, but with matrix cells colored according to enrichments irrespective of their significance.
Supplementary Figure 13 Differences in quantitative enhancer strength follow a molecular clock.
Enhancer strength diverges linearly with increasing evolutionary distance, measured as the number of substitutions per neutral site (branch length), similar to qualitative enhancer conservation (Fig. 1c). The strong correlation between evolutionary distance and the fraction of enhancers with at least (a) 2-fold, (b) 4-fold, (c) 6-fold and (d) 8-fold change in enhancer strength on non-redundant loci between D. melanogaster and other Drosophila species shows that enhancer strength is also conserved and follows a molecular clock. Note that the D. melanogaster replicate comparison considers replicate 1 against replicate 2 within the same species.
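The quantity plotted in each panel, the fraction of enhancers whose strength differs by a given fold change, can be sketched as follows. The sketch assumes strictly positive enhancer strengths, treats increases and decreases symmetrically, and reads the threshold as "at least n-fold"; the function name and all values are hypothetical and are not data from the paper.

```python
def fraction_changed(strength_a, strength_b, fold):
    """Fraction of paired enhancers whose strength differs by at least `fold`.

    Strengths are assumed to be strictly positive; the ratio is taken as
    larger over smaller so that gains and losses of strength count equally.
    """
    changed = 0
    for a, b in zip(strength_a, strength_b):
        ratio = max(a, b) / min(a, b)
        if ratio >= fold:
            changed += 1
    return changed / len(strength_a)

# Hypothetical enhancer strengths at four orthologous loci.
mel_strength = [4.0, 8.0, 3.0, 10.0]
other_strength = [4.5, 2.0, 12.0, 9.0]

# One value per panel threshold (2-, 4-, 6- and 8-fold).
fractions = {fold: fraction_changed(mel_strength, other_strength, fold)
             for fold in (2, 4, 6, 8)}
```

Computing these fractions for each species pair and plotting them against branch length would give one point per species in each panel.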
Supplementary Figure 14 Global range of sequence identities for functionally conserved and non-conserved enhancers.
Sequence identity distributions for functionally conserved (colored lines) and non-conserved (dark gray lines) enhancers between D. melanogaster and other Drosophila species. The distributions are largely overlapping, suggesting that there is no selective pressure on the overall enhancer sequence. In addition, the two extreme boundaries of the distributions indicate that sequences can be up to 95% identical (between D. melanogaster and D. yakuba) yet lack conserved function (being active only in the D. melanogaster genome), whereas enhancer function can be conserved despite as little as 39% sequence identity between D. melanogaster and D. willistoni.
Supplementary Figure 15 S2 cell and OSC enhancer gains are nearly additive.
Enhancers gained in S2 cells or OSCs show only limited overlap (right column), such that the number of gained enhancers is nearly additive for both cell types in D. melanogaster (top) and in D. yakuba (bottom; enhancer gains for both cell types are defined on the basis of three-way analyses considering only D. melanogaster, D. yakuba and D. ananassae as outgroup). The overlap of the gained enhancers is of the same magnitude as for the overlap of all enhancers in S2 cells and OSCs (left column), which suggests that different enhancers are gained in different cell types and the total number of enhancer gains in more complex tissues or organisms scales with the number of cell types and, presumably, the difference between cell types. (As the overlap of gained enhancers is even slightly lower than the overlap of all enhancers (1.8- to 3.6-fold), one could speculate that gained enhancers might have 'more unusual' sequence properties that are less likely to be shared by different cell types.)
Supplementary information
Supplementary Text and Figures
Supplementary Tables 1, 3 and 5, and Supplementary Figures 1–15 (PDF 3751 kb)
Supplementary Table 2
TF motif conservation in functionally conserved and D. melanogaster–specific S2 cell enhancers. (XLSX 73 kb)
Supplementary Table 4
RNA-seq in follicle cells—gene expression (RPKM) values. (XLSX 456 kb)
Supplementary Data Set 1
STARR-seq peak calls (ZIP 3342 kb)
About this article
Cite this article
Arnold, C., Gerlach, D., Spies, D. et al. Quantitative genome-wide enhancer activity maps for five Drosophila species show functional enhancer conservation and turnover during cis-regulatory evolution. Nat Genet 46, 685–692 (2014). https://doi.org/10.1038/ng.3009
DOI: https://doi.org/10.1038/ng.3009