Nature Genetics
34, 264 - 266 (2003)
Published online: 22 June 2003 | doi:10.1038/ng1181
Convergent evolution of gene circuits
Gavin C Conant & Andreas Wagner
Department of Biology, 167 Castetter Hall, The University of New Mexico, Albuquerque, New Mexico 87131, USA.
Correspondence should be addressed to Andreas Wagner (wagnera@unm.edu)
Convergent evolution is a potent indicator of optimal design. We show here that convergent evolution occurs in genetic networks. Specifically, we show that multiple types of transcriptional regulation circuitry in Escherichia coli and the yeast Saccharomyces cerevisiae have evolved independently and not by duplication of one or a few ancestral circuits.
Convergent evolution occurs on all levels of biological organization, from organ systems to proteins. For instance, eyes and wings have evolved independently multiple times, and many aquatic vertebrates share a streamlined shape, despite their independent evolutionary origins1. On the smaller scale of proteins, lysozymes have been recruited independently for foregut fermentation in bovids, colobine monkeys and a bird2,3. Antifreeze glycoproteins in antarctic notothenioids and northern cod (living at opposite ends of the globe) have independently evolved similar amino acid sequences4.
Recent studies have identified abundant genetic circuit motifs in transcriptional regulation networks of the yeast S. cerevisiae5,6 and the bacterium E. coli6,7. These circuit motifs include regulatory chains, feed-forward circuits and a 'bi-fan' (Fig. 1). Such motifs may have had two principal evolutionary origins. First, they may have come about through the random duplication and subsequent diversification of a few ancestral circuits. Given the high frequency at which genes and genomes undergo duplication8, this is a plausible scenario. It is equally possible, however, that these circuits arose independently by recruitment of unrelated genes. If such convergent circuit evolution is prevalent, then these circuits owe their abundance to the action of natural selection.
To determine the evolutionary origin of transcriptional regulation circuits, we defined two indicators of common circuit ancestry, A and Fmax. Consider a genome containing n regulatory circuits, each with k genes and identical topology (for details see Supplementary Methods online). A pair of circuits shares a common ancestor if all k gene pairs in the circuit pair are gene duplicates. We next defined a 'circuit graph' whose n nodes represent the n circuits and where an edge connects two nodes (circuits) if the circuits have a common ancestor. Our first indicator, A, of common circuit ancestry, is equal to A = 1 - (C/n), where C is the number of components in the graph (Fig. 1a). The greater A is, the greater is the fraction of circuits sharing a common ancestor. Our second indicator is Fmax, the size of the largest family of circuits with common ancestry (Fig. 1a).
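The two indicators can be illustrated with a minimal sketch. It assumes that circuits are given as tuples whose positions correspond to the same topological role, that a gene is not counted as a duplicate of itself, and that interchangeable positions (such as the two regulators of a bi-fan) need not be matched in both orders. The function name `circuit_ancestry`, the helper `is_duplicate` and the toy gene families are hypothetical and are not taken from the Supplementary Methods.

```python
from itertools import combinations

def circuit_ancestry(circuits, is_duplicate):
    """Return (A, Fmax) for n circuits of identical topology.

    circuits: list of equal-length tuples of gene names.
    is_duplicate: hypothetical helper, (gene_a, gene_b) -> bool.
    """
    n = len(circuits)
    parent = list(range(n))  # union-find forest over circuit indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        # Draw an edge only if all k position-matched gene pairs are duplicates.
        if all(is_duplicate(a, b) for a, b in zip(circuits[i], circuits[j])):
            parent[find(i)] = find(j)

    # Count the circuits in each connected component of the circuit graph.
    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1

    C = len(sizes)                 # number of components
    A = 1 - C / n                  # first indicator
    F_max = max(sizes.values())    # size of the largest circuit family
    return A, F_max

# Toy data (assumed): gene -> family label; same label means duplicates.
families = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "d1": "D"}

def is_dup(x, y):
    return x != y and families[x] == families[y]

toy_circuits = [("a1", "b1"), ("a2", "b2"), ("c1", "d1"), ("a1", "d1")]
A, F_max = circuit_ancestry(toy_circuits, is_dup)
```

In this toy example only the first two circuits are linked, so the graph has C = 3 components among n = 4 circuits, giving A = 0.25 and Fmax = 2.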
We identified duplicate genes using BLASTP9 at a significance threshold of E ≤ 10⁻⁵ (E values between 10⁻³ and 10⁻¹¹ yield the same results). Using this criterion, neither of two circuit types in E. coli showed evidence of common ancestry (A = 0 and Fmax = 1 for both; Fig. 1b). We also studied 18 yeast circuit types, and only three (feed-forward loops, multi-input modules of size 2 and bi-fans) showed evidence of common ancestry (A > 0 and Fmax > 1; Fig. 1b). This may be due to chance alone, however, simply because duplicate genes are abundant in the yeast genome. Therefore, we used permutation tests (described in Supplementary Methods online) to assess the statistical significance of A and Fmax. For no circuit type was A significantly different from the chance expectation. For example, yeast contains 542 bi-fan motifs with A = 0.197. The probability of observing A = 0.197 by chance is P = 0.18: too large to reject the null hypothesis. We observed a marginally significant value of Fmax = 5 for feed-forward loops (P = 0.05). Even for this circuit type, however, most circuits (43 of 48) showed independent ancestry.
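The general shape of such a permutation test can be sketched as follows. The actual null model is described in the Supplementary Methods, which are not reproduced here, so this sketch assumes one possible null: gene-family labels are shuffled among genes while the circuits themselves stay fixed. The function names, the toy data and the add-one correction are illustrative assumptions, not the authors' procedure.

```python
import random
from itertools import combinations

def ancestry_index(circuits, family):
    """A = 1 - C/n, linking circuits whose position-matched gene pairs
    are all distinct genes carrying the same family label."""
    n = len(circuits)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if all(a != b and family[a] == family[b]
               for a, b in zip(circuits[i], circuits[j])):
            parent[find(i)] = find(j)

    components = len({find(i) for i in range(n)})
    return 1 - components / n

def permutation_p(circuits, family, n_perm=2000, seed=1):
    """One-sided P: how often shuffled labels give A >= the observed A."""
    rng = random.Random(seed)
    genes = sorted(family)
    labels = [family[g] for g in genes]
    observed = ancestry_index(circuits, family)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # assumed null: family membership is random
        shuffled = dict(zip(genes, labels))
        if ancestry_index(circuits, shuffled) >= observed:
            hits += 1
    # Add-one correction avoids reporting an impossible P = 0.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (assumed), not taken from the yeast or E. coli networks.
toy_family = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "d1": "D"}
toy_circuits = [("a1", "b1"), ("a2", "b2"), ("c1", "d1"), ("a1", "d1")]
observed_A, p_value = permutation_p(toy_circuits, toy_family)
```

A large P here would mean, as in the bi-fan example above, that the observed A is no greater than what the abundance of duplicate genes alone would be expected to produce.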
Our analysis of yeast circuits rests on genome-scale chromatin precipitation experiments that use a statistical error threshold (Pe) to identify true regulatory interactions5. The results reported in Figure 1b are based on Pe = 10⁻³, but we found the same results when varying Pe between 10⁻² and 10⁻⁵. As above, only feed-forward loops yielded marginally significant values of A = 0.11 (P = 0.03) and Fmax = 3 (P = 0.03) at Pe ≤ 10⁻⁴. Lowering Pe further to Pe = 10⁻⁵ yielded A = 0 and Fmax = 1.
We also asked whether members of one gene family preferentially occurred in one type of gene circuit. This would be expected if many circuits originated through duplication. Specifically, we asked whether the likelihood of a gene occurring in a given circuit type increases if one of its duplicates occurs in that type. The answer is no (Table 1).
In sum, we found no common ancestry among the E. coli circuit types, the yeast regulatory chains or the yeast multi-input motifs with more than two regulators. Of the remaining three yeast circuit types, two showed common ancestry indistinguishable from that expected by chance. Only feed-forward loops showed marginally significant values of either A or Fmax, but this finding is not statistically robust. Moreover, most (43 of 48) feed-forward loops have clearly independent origins. We also note that the probability of falsely identifying a pair of circuits as duplicates decreases with increasing circuit size. The larger a circuit is, the less evidence of duplication it shows in our analysis.
Multiple lines of evidence indicate that duplicate genes diverge rapidly in function10,11,12. Our findings that gene circuits do not share common ancestry and that duplicate regulatory genes are randomly distributed across gene circuit types underscore this point, because they imply that duplicate transcriptional regulators can readily evolve new interactions. The short DNA binding sites of transcriptional regulators account for much of this plasticity. In microbes like yeast and E. coli, new regulatory interactions can arise rapidly13, even on the time scale of laboratory evolution experiments14. Transcriptional regulation circuits are thus ideal systems for studying convergent evolution, because natural selection has much raw material (variation in regulatory interactions) to shape such circuits.
The finding that gene circuits have evolved repeatedly makes a strong case for their optimal design. For example, the design of a feed-forward loop may serve to activate the regulated (downstream) genes only if the farthest-upstream regulator is persistently activated. Moreover, the same design rapidly deactivates genes once this regulator is shut off7. Our results also suggest that convergent evolution, probably rare in protein sequences, may have an important role in the higher organizational level of gene circuits. Stephen Jay Gould famously asked what would be conserved if life's tape, its evolutionary history, was replayed15. Transcriptional regulation circuits, it seems, might come out just about the same.
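The feed-forward behavior described above can be illustrated with a toy simulation. It assumes a loop in which regulator X activates both Y and the target Z, and Z is on only while X is on and Y has accumulated above a threshold (AND logic). The discrete-time update, the function name `simulate_ffl` and the parameter values are illustrative assumptions, not a model taken from the cited work.

```python
def simulate_ffl(x_signal, rate=0.2, threshold=0.5):
    """Return the on/off state of target Z at each time step.

    x_signal: sequence of 0/1 values for the farthest-upstream regulator X.
    rate, threshold: illustrative parameters (assumed, not measured).
    """
    y = 0.0
    z_states = []
    for x in x_signal:
        # Y relaxes toward 1 while X is on and toward 0 while X is off.
        target = 1.0 if x else 0.0
        y += rate * (target - y)
        # AND logic: Z needs X present and Y above the threshold.
        z_states.append(bool(x) and y > threshold)
    return z_states

brief = [1] * 2 + [0] * 8        # short pulse of X
sustained = [1] * 8 + [0] * 2    # persistent activation of X

z_brief = simulate_ffl(brief)
z_sustained = simulate_ffl(sustained)
```

With these assumed parameters the brief pulse never switches Z on, whereas the sustained signal switches Z on from the fourth time step and Z turns off in the same step in which X is removed.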
Note: Supplementary information is available on the Nature Genetics website.
Received 28 March 2003; Accepted 21 May 2003; Published online: 22 June 2003.
REFERENCES
1. Futuyma, D.J. Evolutionary Biology 3rd edn. (Sinauer Associates, Sunderland, Massachusetts, 1998).
2. Stewart, C.-B., Schilling, J.W. & Wilson, A.C. Nature 330, 401–404 (1987).
3. Kornegay, J.R., Schilling, J.W. & Wilson, A.C. Mol. Biol. Evol. 11, 921–928 (1994).
4. Chen, L., DeVries, A.L. & Cheng, C.-H.C. Proc. Natl. Acad. Sci. USA 94, 3817–3822 (1997).
5. Lee, T.I. et al. Science 298, 799–804 (2002).
6. Milo, R. et al. Science 298, 824–827 (2002).
7. Shen-Orr, S.S., Milo, R., Mangan, S. & Alon, U. Nat. Genet. 31, 64–68 (2002).
8. Lynch, M. & Conery, J.S. Science 290, 1151–1155 (2000).
9. Altschul, S.F. et al. Nucleic Acids Res. 25, 3389–3402 (1997).
10. Wagner, A. Mol. Biol. Evol. 18, 1283–1292 (2001).
11. Dermitzakis, E.T. & Clark, A.G. Mol. Biol. Evol. 19, 1114–1121 (2002).
12. Wagner, A. Proc. Natl. Acad. Sci. USA 97, 6579–6584 (2000).
13. Stone, J.R. & Wray, G.A. Mol. Biol. Evol. 18, 1764–1770 (2001).
14. Ferea, T.L., Botstein, D., Brown, P.O. & Rosenzweig, R.F. Proc. Natl. Acad. Sci. USA 96, 9721–9726 (1999).
15. Gould, S.J. Wonderful Life: The Burgess Shale and the Nature of History (W.W. Norton, New York, 1989).
Acknowledgments
G.C.C. is supported by the US Department of Energy's Computational Sciences Graduate Fellowship program, administered by the Krell Institute. A.W. would like to thank the US National Institutes of Health and the Santa Fe Institute for their support.
Competing interests statement:
The authors declare that they have no competing financial interests.