Constraints and plasticity in genome and molecular-phenome evolution

Key Points

  • Different classes of genome sequence, genome architectures, gene repertoires and molecular phenomes are subject to diverse evolutionary constraints that vary greatly in strength and in the nature of the underlying selective and neutral factors.

  • Sequences coding for proteins and structural RNAs typically include the most strongly conserved sites in genomes.

  • Most of the non-coding sequences are less strongly constrained than coding sequences, with the exception of some regulatory sites.

  • Genome architecture is weakly constrained with the exception of the strong association seen between genes in operons, which is partly maintained by horizontal gene transfer.

  • Principles of genome evolution differ widely between groups of organisms: prokaryotic genomes consist mostly of coding sequences and so are, on average, highly constrained; genomes of multicellular eukaryotes are much larger and contain large fractions of unconstrained, 'junk' DNA; and genomes of unicellular eukaryotes evolve under an intermediate regime that is closer to the prokaryote mode.

  • Some molecular-phenomic features, such as the abundance of proteins encoded by orthologous genes, seem to be subject to surprisingly strong constraints.

  • Evolutionary trajectories that lead to a particular phenotype are substantially constrained, limiting the potential of evolutionary tinkering.

  • The overall level of constraint that affects a given evolving lineage depends on the intensity of selection, which is primarily determined by the characteristic effective population size, although selection is also strongly modulated by the lifestyle of the respective organisms.

  • Despite the diversity of evolutionary constraints acting at different levels of biological organization, comparative-genomic and molecular-phenomic analyses reveal universal patterns that could be compatible with relatively simple, general models of evolution.

  • The evolutionary constraints on genome and molecular-phenome evolution are complemented and partially offset by the robustness of biological systems, which is manifested at different levels and is likely to be an evolved feature.
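
The dependence of constraint on effective population size noted in the key points can be illustrated with a minimal sketch. It uses Kimura's diffusion approximation for the fixation probability of a new mutation in a diploid population, under the simplifying assumption that census size equals effective size. The function names, the selection coefficient and the population sizes are illustrative choices, not values taken from the article.

```python
import math


def fixation_probability(s, n_e):
    """Fixation probability of a single new mutation (diploid population).

    Kimura's diffusion approximation, assuming census size equals the
    effective size n_e (a simplifying assumption of this sketch).
    s is the selection coefficient: negative means deleterious.
    """
    if s == 0:
        # Strictly neutral mutation: initial frequency 1 / (2 * n_e).
        return 1.0 / (2 * n_e)
    # (1 - exp(-2s)) / (1 - exp(-4 * n_e * s)), written with expm1
    # to keep precision when s is very small.
    return math.expm1(-2 * s) / math.expm1(-4 * n_e * s)


def relative_to_neutral(s, n_e):
    """Fixation probability divided by the neutral expectation."""
    return fixation_probability(s, n_e) * 2 * n_e


if __name__ == "__main__":
    s = -1e-4  # hypothetical slightly deleterious mutation
    for n_e in (1_000, 100_000, 1_000_000):
        ratio = relative_to_neutral(s, n_e)
        print(f"Ne = {n_e:>9,d}  fixation relative to neutral = {ratio:.3g}")
```

Under these assumptions, the same slightly deleterious mutation fixes at a rate close to the neutral expectation when the effective population size is small, and is almost never fixed when it is large. This is the sense in which purifying selection, and therefore constraint, is more efficient in large populations.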

Abstract

Multiple constraints variously affect different parts of the genomes of diverse life forms. The selective pressures that shape the evolution of viral, archaeal, bacterial and eukaryotic genomes differ markedly, even among relatively closely related animal and bacterial lineages; by contrast, constraints affecting protein evolution seem to be more universal. The constraints that shape the evolution of genomes and phenomes are complemented by the plasticity and robustness of genome architecture, expression and regulation. Taken together, these findings are starting to reveal complex networks of evolutionary processes that must be integrated to attain a new synthesis of evolutionary biology.

Figure 1: Approximate distribution of evolutionary constraints across genomes with different architectures.
Figure 2: The universal distribution of evolutionary rates across orthologous gene sets.
Figure 3: Genomic and phenomic constraints on different levels of biological organization.

References

  1. 1

    Kimura, M. The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, 1983).

    Book  Google Scholar 

  2. 2

    Lynch, M. The Origins of Genome Architecture (Sinauer Associates, Sunderland, Massachusetts, 2007). A definitive presentation of the population-genetic perspective on genome evolution, with an emphasis on effective population size as the dominant factor of evolution and a non-adaptive origin of genomic complexity.

    Google Scholar 

  3. 3

    Loewe, L. A framework for evolutionary systems biology. BMC Syst. Biol. 3, 27 (2009).

    PubMed  PubMed Central  Google Scholar 

  4. 4

    Koonin, E. V. & Wolf, Y. I. Evolutionary systems biology: links between gene evolution and function. Curr. Opin. Biotechnol. 17, 481–487 (2006).

    CAS  PubMed  Google Scholar 

  5. 5

    Yamada, T. & Bork, P. Evolution of biomolecular networks: lessons from metabolic and protein interactions. Nature Rev. Mol. Cell Biol. 10, 791–803 (2009).

    CAS  Google Scholar 

  6. 6

    Snell-Rood, E. C., Van Dyken, J. D., Cruickshank, T., Wade, M. J. & Moczek, A. P. Toward a population genetic framework of developmental evolution: the costs, limits, and consequences of phenotypic plasticity. Bioessays 32, 71–81 (2010).

    CAS  PubMed  PubMed Central  Google Scholar 

  7. 7

    Palsson, B. Metabolic systems biology. FEBS Lett. 583, 3900–3904 (2009).

    CAS  PubMed  PubMed Central  Google Scholar 

  8. 8

    Erwin, D. H. & Davidson, E. H. The evolution of hierarchical gene regulatory networks. Nature Rev. Genet. 10, 141–148 (2009).

    CAS  PubMed  Google Scholar 

  9. 9

    Shabalina, S. A. & Kondrashov, A. S. Pattern of selective constraint in C. elegans and C. briggsae genomes. Genet. Res. 74, 23–30 (1999).

    CAS  Google Scholar 

  10. 10

    Margulies, E. H. et al. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res. 17, 760–774 (2007).

    CAS  PubMed  PubMed Central  Google Scholar 

  11. 11

    Petersen, L., Bollback, J. P., Dimmic, M., Hubisz, M. & Nielsen, R. Genes under positive selection in Escherichia coli. Genome Res. 17, 1336–1343 (2007).

    CAS  PubMed  PubMed Central  Google Scholar 

  12. 12

    Muzzi, A., Moschioni, M., Covacci, A., Rappuoli, R. & Donati, C. Pilus operon evolution in Streptococcus pneumoniae is driven by positive selection and recombination. PLoS ONE 3, e3660 (2008).

    PubMed  PubMed Central  Google Scholar 

  13. 13

    Nielsen, R. et al. A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 3, e170 (2005).

    PubMed  PubMed Central  Google Scholar 

  14. 14

    Turner, L. M., Chuong, E. B. & Hoekstra, H. E. Comparative analysis of testis protein evolution in rodents. Genetics 179, 2075–2089 (2008).

    CAS  PubMed  PubMed Central  Google Scholar 

  15. 15

    Worth, C. L., Gong, S. & Blundell, T. L. Structural and functional constraints in the evolution of protein families. Nature Rev. Mol. Cell Biol. 10, 709–720 (2009).

    CAS  Google Scholar 

  16. 16

    Grishin, N. V., Wolf, Y. I. & Koonin, E. V. From complete genomes to measures of substitution rate variability within and between proteins. Genome Res. 10, 991–1000 (2000). An early study that suggests that the evolutionary rates of orthologous genes from diverse life forms follow a universal distribution, and that derives a link between intra-gene and across-gene distributions of evolutionary rates.

    CAS  PubMed  PubMed Central  Google Scholar 

  17. 17

    Nielsen, R. Molecular signatures of natural selection. Annu. Rev. Genet. 39, 197–218 (2005).

    CAS  PubMed  PubMed Central  Google Scholar 

  18. 18

    Ohta, T. & Ina, Y. Variation in synonymous substitution rates among mammalian genes and the correlation between synonymous and nonsynonymous divergences. J. Mol. Evol. 41, 717–720 (1995).

    CAS  Google Scholar 

  19. 19

    Makalowski, W. & Boguski, M. S. Synonymous and nonsynonymous substitution distances are correlated in mouse and rat genes. J. Mol. Evol. 47, 119–121 (1998).

    CAS  Google Scholar 

  20. 20

    Ellegren, H. Comparative genomics and the study of evolution by natural selection. Mol. Ecol. 17, 4586–4596 (2008).

    PubMed  Google Scholar 

  21. 21

    Drummond, D. A. & Wilke, C. O. The evolutionary consequences of erroneous protein synthesis. Nature Rev. Genet. 10, 715–724 (2009).

    PubMed  Google Scholar 

  22. 22

    Lynch, M. & Conery, J. S. The origins of genome complexity. Science 302, 1401–1404 (2003). A seminal work that expounds the population-genetic perspective on the evolution of genomic complexity. The authors argue that genomic complexity is driven by weak purifying selection in populations with small Ne ; in such populations, slightly deleterious features, such as gene duplications or introns, cannot be efficiently eliminated. Collected data on Ne and genomic complexity in diverse life forms are shown to be compatible with this perspective, at least as a rough approximation.

    CAS  PubMed  PubMed Central  Google Scholar 

  23. 23

    Koonin, E. V. Evolution of genome architecture. Int. J. Biochem. Cell Biol. 41, 298–306 (2009).

    CAS  PubMed  Google Scholar 

  24. 24

    Harrison, P. M. & Gerstein, M. Studying genomes through the aeons: protein families, pseudogenes and proteome evolution. J. Mol. Biol. 318, 1155–1174 (2002).

    CAS  Google Scholar 

  25. 25

    Monot, M. et al. Comparative genomic and phylogeographic analysis of Mycobacterium leprae. Nature Genet. 41, 1282–1289 (2009).

    CAS  PubMed  Google Scholar 

  26. 26

    Darby, A. C., Cho, N. H., Fuxelius, H. H., Westberg, J. & Andersson, S. G. Intracellular pathogens go extreme: genome evolution in the Rickettsiales. Trends Genet. 23, 511–520 (2007).

    CAS  PubMed  Google Scholar 

  27. 27

    Molina, N. & van Nimwegen, E. Universal patterns of purifying selection at noncoding positions in bacteria. Genome Res. 18, 148–160 (2008). A rigorous method for detecting purifying selection in groups of closely related prokaryotes was applied to the study of intergenic region evolution. Universal patterns of purifying selection were detected, and translation-initiation sites were found to be the elements subject to the strongest selective pressure.

    CAS  PubMed  PubMed Central  Google Scholar 

  28. 28

    Sella, G., Petrov, D. A., Przeworski, M. & Andolfatto, P. Pervasive natural selection in the Drosophila genome? PLoS Genet. 5, e1000495 (2009). A critical review of the evidence indicating that most sites in the fruitfly genome are subject to selection.

    PubMed  PubMed Central  Google Scholar 

  29. 29

    Waterston, R. H. et al. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002).

    CAS  Google Scholar 

  30. 30

    Lunter, G., Ponting, C. P. & Hein, J. Genome-wide identification of human functional DNA using a neutral indel model. PLoS Comput. Biol. 2, e5 (2006).

    PubMed  PubMed Central  Google Scholar 

  31. 31

    Wright, S. I. & Andolfatto, P. The impact of natural selection on the genome: emerging patterns in Drosophila and Arabidopsis. Annu. Rev. Ecol. Syst. 39, 193–213 (2008).

    Google Scholar 

  32. 32

    Gossmann, T. I. et al. Genome wide analyses reveal little evidence for adaptive evolution in many plant species. Mol. Biol. Evol. 18 Mar 2010 (doi:10.1093/molbev/msq079).

    CAS  PubMed  PubMed Central  Google Scholar 

  33. 33

    Doolittle, W. F. & Sapienza, C. Selfish genes, the phenotype paradigm and genome evolution. Nature 284, 601–603 (1980).

    CAS  PubMed  Google Scholar 

  34. 34

    Bowen, N. J. & Jordan, I. K. Exaptation of protein coding sequences from transposable elements. Genome Dyn. 3, 147–162 (2007).

    CAS  PubMed  Google Scholar 

  35. 35

    Drake, J. A. et al. Conserved noncoding sequences are selectively constrained and not mutation cold spots. Nature Genet. 38, 223–227 (2006).

    CAS  PubMed  Google Scholar 

  36. 36

    Shabalina, S. A., Ogurtsov, A. Y., Rogozin, I. B., Koonin, E. V. & Lipman, D. J. Comparative analysis of orthologous eukaryotic mRNAs: potential hidden functional signals. Nucleic Acids Res. 32, 1774–1782 (2004).

    CAS  PubMed  PubMed Central  Google Scholar 

  37. 37

    Proux, E., Studer, R. A., Moretti, S. & Robinson-Rechavi, M. Selectome: a database of positive selection. Nucleic Acids Res. 37, D404–D407 (2009).

    CAS  PubMed  Google Scholar 

  38. 38

    Costa, F. F. Non-coding RNAs: new players in eukaryotic biology. Gene 357, 83–94 (2005).

    CAS  PubMed  Google Scholar 

  39. 39

    Shabalina, S. A. & Koonin, E. V. Origins and evolution of eukaryotic RNA interference. Trends Ecol. Evol. 23, 578–587 (2008).

    PubMed  PubMed Central  Google Scholar 

  40. 40

    Carthew, R. W. & Sontheimer, E. J. Origins and mechanisms of miRNAs and siRNAs. Cell 136, 642–655 (2009).

    CAS  PubMed  PubMed Central  Google Scholar 

  41. 41

    Ponting, C. P., Oliver, P. L. & Reik, W. Evolution and functions of long noncoding RNAs. Cell 136, 629–641 (2009). A detailed review of long non-coding (macro) RNAs, a recently discovered class of mammalian genes that comprise a substantial part of the RNome.

    CAS  PubMed  Google Scholar 

  42. 42

    Bertone, P. et al. Global identification of human transcribed sequences with genome tiling arrays. Science 306, 2242–2246 (2004).

    CAS  PubMed  PubMed Central  Google Scholar 

  43. 43

    Johnson, J. M., Edwards, S., Shoemaker, D. & Schadt, E. E. Dark matter in the genome: evidence of widespread transcription detected by microarray tiling experiments. Trends Genet. 21, 93–102 (2005).

    CAS  Google Scholar 

  44. 44

    Katzman, S. et al. Human genome ultraconserved elements are ultraselected. Science 317, 915 (2007). A rigorous demonstration of the exceptionally strong selection that affects ultraconserved elements of mammalian genomes that are located outside protein-coding genes.

    CAS  PubMed  PubMed Central  Google Scholar 

  45. 45

    Dermitzakis, E. T., Reymond, A. & Antonarakis, S. E. Conserved non-genic sequences — an unexpected feature of mammalian genomes. Nature Rev. Genet. 6, 151–157 (2005).

    CAS  PubMed  PubMed Central  Google Scholar 

  46. 46

    Elgar, G. Pan-vertebrate conserved non-coding sequences associated with developmental regulation. Brief. Funct. Genomic. Proteomic. 8, 256–265 (2009).

    PubMed  Google Scholar 

  47. 47

    Bejerano, G. et al. Ultraconserved elements in the human genome. Science 304, 1321–1325 (2004).

    CAS  PubMed  PubMed Central  Google Scholar 

  48. 48

    Baira, E., Greshock, J., Coukos, G. & Zhang, L. Ultraconserved elements: genomics, function and disease. RNA Biol. 5, 132–134 (2008).

    CAS  PubMed  Google Scholar 

  49. 49

    Koonin, E. V., Aravind, L. & Kondrashov, A. S. The impact of comparative genomics on our understanding of evolution. Cell 101, 573–576 (2000).

    CAS  PubMed  Google Scholar 

  50. 50

    Wuchty, S. & Almaas, E. Evolutionary cores of domain co-occurrence networks. BMC Evol. Biol. 5, 24 (2005).

    PubMed  PubMed Central  Google Scholar 

  51. 51

    Basu, M. K., Carmel, L., Rogozin, I. B. & Koonin, E. V. Evolution of protein domain promiscuity in eukaryotes. Genome Res. 18, 449–461 (2008). A quantitative comparative analysis of promiscuous domains across eukaryotic lineages, including demonstration of a positive correlation between domain promiscuity and the strength of purifying selection.

    CAS  PubMed  PubMed Central  Google Scholar 

  52. 52

    Rogozin, I. B., Wolf, Y. I., Sorokin, A. V., Mirkin, B. G. & Koonin, E. V. Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr. Biol. 13, 1512–1517 (2003).

    CAS  PubMed  Google Scholar 

  53. 53

    Roy., S. W. & Gilbert, W. The evolution of spliceosomal introns: patterns, puzzles and progress. Nature Rev. Genet. 7, 211–221 (2006).

    PubMed  PubMed Central  Google Scholar 

  54. 54

    Roy., S. W. & Penny, D. Patterns of intron loss and gain in plants: intron loss-dominated evolution and genome-wide comparison of O. sativa and A. thaliana. Mol. Biol. Evol. 24, 171–181 (2007).

    CAS  PubMed  Google Scholar 

  55. 55

    Carmel, L., Wolf, Y. I., Rogozin, I. B. & Koonin, E. V. Three distinct modes of intron dynamics in the evolution of eukaryotes. Genome Res. 17, 1034–1044 (2007). A detailed analysis of differential dynamics of intron gain and loss across eukaryotic lineages reveals three distinct modes of evolution characterized by pervasive intron loss, equilibrium and relatively rare intron gain, respectively.

    CAS  PubMed  PubMed Central  Google Scholar 

  56. 56

    Carmel, L., Rogozin, I. B., Wolf, Y. I. & Koonin, E. V. Patterns of intron gain and conservation in eukaryotic genes. BMC Evol. Biol. 7, 192 (2007).

    PubMed  PubMed Central  Google Scholar 

  57. 57

    Koonin, E. V. & Wolf, Y. I. Genomics of Bacteria and Archaea: the emerging dynamic view of the prokaryotic world. Nucleic Acids Res. 36, 6688–6719 (2008).

    CAS  PubMed  PubMed Central  Google Scholar 

  58. 58

    Novichkov, P. S., Wolf, Y. I., Dubchak, I. & Koonin, E. V. Trends in prokaryotic evolution revealed by comparison of closely related bacterial and archaeal genomes. J. Bacteriol. 191, 65–73 (2009). This study provides a comparative analysis of selective and neutral evolutionary processes between multiple bacterial and archaeal lineages. The article demonstrates high, variable rates of genome rearrangement and the lack of correlation between genome streamlining and selective constraints on sequence evolution.

    CAS  PubMed  Google Scholar 

  59. 59

    Eisen, J. A., Heidelberg, J. F., White, O. & Salzberg, S. L. Evidence for symmetric chromosomal inversions around the replication origin in bacteria. Genome Biol. 1, research0011.1–research0011.9 (2000).

    Google Scholar 

  60. 60

    Zhou, F., Olman, V. & Xu, Y. Insertion sequences show diverse recent activities in Cyanobacteria and Archaea. BMC Genomics 9, 36 (2008).

    PubMed  PubMed Central  Google Scholar 

  61. 61

    Rogozin, I. B. et al. Connected gene neighborhoods in prokaryotic genomes. Nucleic Acids Res. 30, 2212–2223 (2002).

    CAS  PubMed  PubMed Central  Google Scholar 

  62. 62

    Ling, X., He, X. & Xin, D. Detecting gene clusters under evolutionary constraint in a large number of genomes. Bioinformatics 25, 571–577 (2009).

    CAS  PubMed  Google Scholar 

  63. 63

    Wolf, Y. I., Rogozin, I. B., Kondrashov, A. S. & Koonin, E. V. Genome alignment, evolution of prokaryotic genome organization, and prediction of gene function using genomic context. Genome Res. 11, 356–372 (2001).

    CAS  PubMed  Google Scholar 

  64. 64

    Lawrence, J. Selfish operons: the evolutionary impact of gene clustering in prokaryotes and eukaryotes. Curr. Opin. Genet. Dev. 9, 642–648 (1999).

    CAS  PubMed  Google Scholar 

  65. 65

    Rocha, E. P. The organization of the bacterial genome. Annu. Rev. Genet. 42, 211–233 (2008).

    CAS  PubMed  Google Scholar 

  66. 66

    Osbourn, A. E. & Field, B. Operons. Cell. Mol. Life Sci. 66, 3755–3775 (2009).

    CAS  PubMed  PubMed Central  Google Scholar 

  67. 67

    Hurst, L. D., Pal, C. & Lercher, M. J. The evolutionary dynamics of eukaryotic gene order. Nature Rev. Genet. 5, 299–310 (2004).

    CAS  Google Scholar 

  68. 68

    Liao, B. Y. & Zhang, J. Coexpression of linked genes in Mammalian genomes is generally disadvantageous. Mol. Biol. Evol. 25, 1555–1565 (2008).

    CAS  PubMed  PubMed Central  Google Scholar 

  69. 69

    Lemons, D. & McGinnis, W. Genomic evolution of Hox gene clusters. Science 313, 1918–1922 (2006).

    CAS  PubMed  PubMed Central  Google Scholar 

  70. 70

    Wong, S. & Wolfe, K. H. Birth of a metabolic gene cluster in yeast by adaptive gene relocation. Nature Genet. 37, 777–782 (2005).

    CAS  PubMed  Google Scholar 

  71. 71

    Eichler, E. E. & Sankoff, D. Structural dynamics of eukaryotic chromosome evolution. Science 301, 793–797 (2003).

    CAS  PubMed  Google Scholar 

  72. 72

    Koonin, E. V. Comparative genomics, minimal gene-sets and the last universal common ancestor. Nature Rev. Microbiol. 1, 127–136 (2003). This article demonstrates the difference between the shrinking set of ubiquitously conserved orthologous genes and the larger minimal set of functional niches. Minimal gene sets are also examined in relation to different prokaryotic lifestyles.

    CAS  Google Scholar 

  73. 73

    Moya, A. et al. Toward minimal bacterial cells: evolution vs. design. FEMS Microbiol Rev. 33, 225–235 (2009). The latest update on minimal gene sets and the promise of synthetic biology for de novo synthesis of custom genomes.

    CAS  PubMed  Google Scholar 

  74. 74

    Koonin, E. V. Orthologs, paralogs, and evolutionary genomics. Annu. Rev. Genet. 39, 309–338 (2005).

    CAS  PubMed  Google Scholar 

  75. 75

    Mushegian, A. R. & Koonin, E. V. A minimal gene set for cellular life derived by comparison of complete bacterial genomes [see comments]. Proc. Natl Acad. Sci. USA 93, 10268–10273 (1996).

    CAS  PubMed  Google Scholar 

  76. 76

    Charlebois, R. L. & Doolittle, W. F. Computing prokaryotic gene ubiquity: rescuing the core from extinction. Genome Res. 14, 2469–2477 (2004).

    CAS  PubMed  PubMed Central  Google Scholar 

  77. 77

    Koonin, E. V., Mushegian, A. R. & Bork, P. Non-orthologous gene displacement. Trends Genet. 12, 334–336 (1996).

    CAS  PubMed  Google Scholar 

  78. 78

    Nilsen, T. W. & Graveley, B. R. Expansion of the eukaryotic proteome by alternative splicing. Nature 463, 457–463 (2010).

    CAS  PubMed  PubMed Central  Google Scholar 

  79. 79

    Lynch, M. & Conery, J. S. The evolutionary fate and consequences of duplicate genes. Science 290, 1151–1155 (2000).

    CAS  Google Scholar 

  80. 80

    Lespinet, O., Wolf, Y. I., Koonin, E. V. & Aravind, L. The role of lineage-specific gene family expansion in the evolution of eukaryotes. Genome Res. 12, 1048–1059 (2002).

    CAS  PubMed  PubMed Central  Google Scholar 

  81. 81

    Huynen, M. A. & van Nimwegen, E. The frequency distribution of gene family sizes in complete genomes. Mol. Biol. Evol. 15, 583–589 (1998). The authors report the discovery that the sizes of paralogous gene families follow a power-law-like distribution. They also present a simple model of gene family evolution.

    CAS  PubMed  Google Scholar 

  82. 82

    Karev, G. P., Wolf, Y. I., Rzhetsky, A. Y., Berezovskaya, F. S. & Koonin, E. V. Birth and death of protein domains: a simple model of evolution explains power law behavior. BMC Evol. Biol. 2, 18 (2002).

    PubMed  PubMed Central  Google Scholar 

  83. 83

    Koonin, E. V., Wolf, Y. I. & Karev, G. P. The structure of the protein universe and genome evolution. Nature 420, 218–223 (2002). A discussion of non-adaptive models of genome evolution — in particular, how patterns of gene birth and death reproduce the observed size distributions of paralogous gene families.

    CAS  PubMed  Google Scholar 

  84. 84

    Putnam, N. H. et al. Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science 317, 86–94 (2007).

    CAS  Google Scholar 

  85. 85

    Srivastava, M. et al. The Trichoplax genome and the nature of placozoans. Nature 454, 955–960 (2008).

    CAS  PubMed  Google Scholar 

  86. 86

    Krylov, D. M., Wolf, Y. I., Rogozin, I. B. & Koonin, E. V. Gene loss, protein sequence divergence, gene dispensability, expression level, and interactivity are correlated in eukaryotic evolution. Genome Res. 13, 2229–2235 (2003).

    CAS  PubMed  PubMed Central  Google Scholar 

  87. 87

    Wang, X., Grus, W. E. & Zhang, J. Gene losses during human origins. PLoS Biol. 4, e52 (2006).

    PubMed  PubMed Central  Google Scholar 

  88. 88

    Wolf, Y. I., Novichkov, P. S., Karev, G. P., Koonin, E. V. & Lipman, D. J. The universal distribution of evolutionary rates of genes and distinct characteristics of eukaryotic genes of different apparent ages. Proc. Natl Acad. Sci. USA 106, 7273–7280 (2009). This is the definitive demonstration of the universal character of the approximately log-normal distribution of the evolutionary rate of orthologous genes. The distribution of genes by age also follows a similar pattern. The article presents a simple, non-adaptive model according to which the universal distribution of gene-loss rates is a fundamental feature of genome evolution.

    CAS  PubMed  Google Scholar 

  89. 89

    Pal, C., Papp, B. & Hurst, L. D. Highly expressed genes in yeast evolve slowly. Genetics 158, 927–931 (2001).

    CAS  PubMed  PubMed Central  Google Scholar 

  90. 90

    Drummond, D. A. & Wilke, C. O. Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell 134, 341–352 (2008). A comprehensive analysis of the anticorrelation between evolution rate and expression of protein-coding genes in a variety of model organisms. This is a definitive presentation of the mistranslation-induced misfolding hypothesis of protein evolution.

    CAS  PubMed  PubMed Central  Google Scholar 

  91. 91

    Pal, C., Papp, B. & Lercher, M. J. An integrated view of protein evolution. Nature Rev. Genet. 7, 337–348 (2006).

    CAS  PubMed  Google Scholar 

  92. 92

    Grosjean, H. & Fiers, W. Preferential codon usage in prokaryotic genes: the optimal codon–anticodon interaction energy and the selective codon usage in efficiently expressed genes. Gene 18, 199–209 (1982).

    CAS  PubMed  Google Scholar 

  93. 93

    Lipman, D. J. & Wilbur, W. J. Interaction of silent and replacement changes in eukaryotic coding sequences. J. Mol. Evol. 21, 161–167 (1984).

    PubMed  Google Scholar 

  94. 94

    Hershberg, R. & Petrov, D. A. Selection on codon bias. Annu. Rev. Genet. 42, 287–299 (2008).

    CAS  PubMed  Google Scholar 

  95. 95

    Zhou, T., Weems, M. & Wilke, C. O. Translationally optimal codons associate with structurally sensitive sites in proteins. Mol. Biol. Evol. 26, 1571–1580 (2009).

    CAS  PubMed  PubMed Central  Google Scholar 

  96. 96

    Lobkovsky, A. E., Wolf, Y. I. & Koonin, E. V. Universal distribution of protein evolution rates as a consequence of protein folding physics. Proc. Natl Acad. Sci. USA 107, 2983–2988 (2010). The universal distribution of evolutionary rates among orthologues is reproduced under a simple model of protein folding and under the assumption that misfolding is the only source of fitness cost in protein evolution.

    CAS  PubMed  Google Scholar 

  97. 97

    Wolf, Y. I., Carmel, L. & Koonin, E. V. Unifying measures of gene function and evolution. Proc. Biol. Sci. 273, 1507–1515 (2006). A systematic analysis of correlations between evolutionary and molecular phenomic variables leads to the idea of 'gene status', according to which genes with a high expression level, a large number of physical or regulatory interactions and high values of other phenomic variables evolve slowly and are rarely lost in the course of evolution.

    CAS  PubMed  PubMed Central  Google Scholar 

  98. 98

    Jordan, I. K., Wolf, Y. I. & Koonin, E. V. No simple dependence between protein evolution rate and the number of protein–protein interactions: only the most prolific interactors tend to evolve slowly. BMC Evol. Biol. 3, 1 (2003).

    PubMed  PubMed Central  Google Scholar 

  99. 99

    Bloom, J. D. & Adami, C. Evolutionary rate depends on number of protein–protein interactions independently of gene expression level: response. BMC Evol. Biol. 4, 14 (2004).

    PubMed  PubMed Central  Google Scholar 

  100. 100

    de Silva, E. et al. The effects of incomplete protein interaction data on structural and evolutionary inferences. BMC Biol. 4, 39 (2006).

    PubMed  PubMed Central  Google Scholar 

  101. 101

    Jordan, I. K., Wolf, Y. I. & Koonin, E. V. Duplicated genes evolve slower than singletons despite the initial rate increase. BMC Evol. Biol. 4, 22 (2004).

    PubMed  PubMed Central  Google Scholar 

  102. 102

    Khaitovich, P. et al. A neutral model of transcriptome evolution. PLoS Biol. 2, e132 (2004).

    PubMed  PubMed Central  Google Scholar 

  103. 103

    Jordan, I. K., Marino-Ramirez, L., Wolf, Y. I. & Koonin, E. V. Conservation and coevolution in the scale-free human gene coexpression network. Mol. Biol. Evol. 21, 2058–2070 (2004).

    CAS  PubMed  Google Scholar 

  104. 104

    Denver, D. R. et al. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nature Genet. 37, 544–548 (2005).

    CAS  PubMed  PubMed Central  Google Scholar 

  105. 105

    Jordan, I. K., Marino-Ramirez, L. & Koonin, E. V. Evolutionary significance of gene expression divergence. Gene 345, 119–126 (2005).

    CAS  PubMed  Google Scholar 

  106. 106

    Liao, B. Y. & Zhang, J. Evolutionary conservation of expression profiles between human and mouse orthologous genes. Mol. Biol. Evol. 23, 530–540 (2006).

    CAS  PubMed  Google Scholar 

  107. 107

    Gilad, Y., Oshlack, A. & Rifkin, S. A. Natural selection on gene expression. Trends Genet. 22, 456–461 (2006).

    CAS  PubMed  Google Scholar 

  108. 108

    Schrimpf, S. P. et al. Comparative functional analysis of the Caenorhabditis elegans and Drosophila melanogaster proteomes. PLoS Biol. 7, e48 (2009).

    PubMed  Google Scholar 

  109. 109

    Weiss, M., Schrimpf, S., Hengartner, M. O., Lercher, M. J. & von Mering, C. Shotgun proteomics data from multiple organisms reveals remarkable quantitative conservation of the eukaryotic core proteome. Proteomics 10, 1297–1306 (2010). This work extends the pioneering study reported in reference 108. The authors applied quantitative, highly accurate proteomic methods to reveal that the abundance of orthologous proteins is — unexpectedly — highly correlated among distantly related model organisms.

    CAS  PubMed  Google Scholar 

  110. 110

    Wolf, Y. I., Gopich, I. V., Lipman, D. J. & Koonin, E. V. Relative contributions of intrinsic structural-functional constraints and translation rate to the evolution of protein-coding genes. Genome Biol. Evol. 17 Mar 2010 (doi:10.1093/gbe/evq010).

    PubMed  PubMed Central  Google Scholar 

  111. 111

    Barabasi, A. L. & Oltvai, Z. N. Network biology: understanding the cell's functional organization. Nature Rev. Genet. 5, 101–113 (2004).

    CAS  Google Scholar 

  112. 112

    Bergmann, S., Ihmels, J. & Barkai, N. Similarities and differences in genome-wide expression data of six organisms. PLoS Biol. 2, e9 (2004).

    PubMed  Google Scholar 

  113. 113

    Tsaparas, P., Marino-Ramirez, L., Bodenreider, O., Koonin, E. V. & Jordan, I. K. Global similarity and local divergence in human and mouse gene co-expression networks. BMC Biol. 6, 70 (2006).

    Google Scholar 

  114. 114

    Jordan, I. K., Katz, L. S., Denver, D. R. & Streelman, J. T. Natural selection governs local, but not global, evolutionary gene coexpression networks in Caenorhabditis elegans. BMC Syst. Biol. 2, 96 (2008).

    PubMed  PubMed Central  Google Scholar 

  115. 115

    Lynch, M. The evolution of genetic networks by non-adaptive processes. Nature Rev. Genet. 8, 803–813 (2007). A model of the evolution of biological networks that shows how characteristic network properties could evolve through non-adaptive processes of mutation, drift and recombination.

    CAS  PubMed  Google Scholar 

  116. 116

    Kassen, R. Toward a general theory of adaptive radiation: insights from microbial experimental evolution. Ann. N. Y. Acad. Sci. 1168, 3–22 (2009).

    PubMed  Google Scholar 

  117. 117

    Jacob, F. Evolution and tinkering. Science 196, 1161–1166 (1977). A seminal conceptual analysis emphasizing the importance of contingency in evolution: evolution is construed as a bricolage that makes use of pre-existing states and is fundamentally unpredictable.

    CAS  Google Scholar 

  118. 118

    Mani, G. S. & Clarke, B. C. Mutational order: a major stochastic process in evolution. Proc. R. Soc. Lond. B 240, 29–37 (1990).

  119.

    Weinreich, D. M., Delaney, N. F., DePristo, M. A. & Hartl, D. L. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312, 111–114 (2006). A key study on the landscape of protein evolution that revealed an unexpected level of constraint on evolutionary trajectories, apparently caused by interactions between mutations (epistasis).

  120.

    Novais, A. et al. Evolutionary trajectories of β-lactamase CTX-M-1 cluster enzymes: predicting antibiotic resistance. PLoS Pathog. 6, e1000735 (2010).

  121.

    Barrick, J. E. & Lenski, R. E. Genome-wide mutational diversity in an evolving population of Escherichia coli. Cold Spring Harb. Symp. Quant. Biol. 23 Sep 2009 (doi:10.1101/sqb.2009.74.018). A summary of a series of long-term, extensive studies of bacterial populations in controlled experimental conditions. The studies revealed that evolutionary trajectories are affected by an interplay between contingency and constraint.

  122.

    Stanek, M. T., Cooper, T. F. & Lenski, R. E. Identification and dynamics of a beneficial mutation in a long-term evolution experiment with Escherichia coli. BMC Evol. Biol. 9, 302 (2009).

  123.

    Blount, Z. D., Borland, C. Z. & Lenski, R. E. Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proc. Natl Acad. Sci. USA 105, 7899–7906 (2008).

  124.

    Stewart, C. B., Schilling, J. W. & Wilson, A. C. Adaptive evolution in the stomach lysozymes of foregut fermenters. Nature 330, 401–404 (1987).

  125.

    Yokoyama, R. & Yokoyama, S. Convergent evolution of the red- and green-like visual pigment genes in fish, Astyanax fasciatus, and human. Proc. Natl Acad. Sci. USA 87, 9315–9318 (1990).

  126.

    Zhang, J. Parallel adaptive origins of digestive RNases in Asian and African leaf monkeys. Nature Genet. 38, 819–823 (2006).

  127.

    Li, Y., Liu, Z., Shi, P. & Zhang, J. The hearing gene Prestin unites echolocating bats and whales. Curr. Biol. 20, R55–R56 (2010).

  128.

    Mustonen, V. & Lassig, M. Fitness flux and ubiquity of adaptive evolution. Proc. Natl Acad. Sci. USA 107, 4248–4253 (2010). A reformulation of the principles of population genetics analogous to the transition from classic to non-equilibrium thermodynamics. The concept of fitness is replaced by fitness flux, and fitness landscape becomes a time-dependent seascape.

  129.

    Lynch, M. The frailty of adaptive hypotheses for the origins of organismal complexity. Proc. Natl Acad. Sci. USA 104 (Suppl. 1), 8597–8604 (2007).

  130.

    Lynch, M. The origins of eukaryotic gene structure. Mol. Biol. Evol. 23, 450–468 (2006).

  131.

    Irimia, M., Penny, D. & Roy, S. W. Coevolution of genomic intron number and splice sites. Trends Genet. 23, 321–325 (2007). A comparative analysis of splice sites showing that intron-poor organisms possess highly conserved splice sites that adhere to a strict consensus, whereas intron-rich genomes contain weak splice sites. A crucial corollary is that the evolution of alternative splicing is conditioned on relatively inefficient splice sites that are prevalent in organisms with weak selective pressure.

  132.

    Irimia, M. & Roy, S. W. Evolutionary convergence on highly-conserved 3′ intron structures in intron-poor eukaryotes and insights into the ancestral eukaryotic genome. PLoS Genet. 4, e1000148 (2008).

  133.

    Irimia, M. et al. Complex selection on 5′ splice sites in intron-rich organisms. Genome Res. 19, 2021–2027 (2009).

  134.

    Lynch, M. Streamlining and simplification of microbial genome architecture. Annu. Rev. Microbiol. 60, 327–349 (2006).

  135.

    Wagner, A. Robustness, evolvability, and neutrality. FEBS Lett. 579, 1772–1778 (2005).

  136.

    Dobrindt, U. et al. Analysis of genome plasticity in pathogenic and commensal Escherichia coli isolates by use of DNA arrays. J. Bacteriol. 185, 1831–1840 (2003).

  137.

    Lozada-Chavez, I., Janga, S. C. & Collado-Vides, J. Bacterial regulatory networks are extremely flexible in evolution. Nucleic Acids Res. 34, 3434–3445 (2006).

  138.

    Kazakov, A. E. et al. Comparative genomics of regulation of fatty acid and branched-chain amino acid utilization in proteobacteria. J. Bacteriol. 191, 52–64 (2009).

  139.

    Wagner, A. Neutralism and selectionism: a network-based reconciliation. Nature Rev. Genet. 9, 965–974 (2008). A conceptual perspective on (nearly) neutral networks that reconciles the neutralistic and adaptationist paradigms of evolution by showing how initially neutral mutations form the basis for subsequent adaptation.

  140.

    Masel, J. & Siegal, M. L. Robustness: mechanisms and consequences. Trends Genet. 25, 395–403 (2009).

  141.

    Bergman, A. & Siegal, M. L. Evolutionary capacitance as a general feature of complex gene networks. Nature 424, 549–552 (2003).

  142.

    Levy, S. F. & Siegal, M. L. Network hubs buffer environmental variation in Saccharomyces cerevisiae. PLoS Biol. 6, e264 (2008). An experimental demonstration of the unexpectedly large number of evolutionary capacitors among yeast genes, a finding that validates the theoretical predictions of reference 141.

  143.

    Wang, Z. & Zhang, J. Abundant indispensable redundancies in cellular metabolic networks. Genome Biol. Evol. 2009, 23–33 (2009).

  144.

    Koonin, E. V. Darwinian evolution in the light of genomics. Nucleic Acids Res. 37, 1011–1034 (2009).

  145.

    Frank, S. A. The common patterns of nature. J. Evol. Biol. 22, 1563–1585 (2009).

  146.

    Wilkins, A. S. Between 'design' and 'bricolage': genetic networks, levels of selection, and adaptive evolution. Proc. Natl Acad. Sci. USA 104 (Suppl. 1), 8590–8596 (2007).

  147.

    Resch, A. M. et al. Widespread positive selection in synonymous sites of mammalian genes. Mol. Biol. Evol. 24, 1821–1831 (2007).

  148.

    Parsch, J., Novozhilov, S., Saminadin-Peter, S. S., Wong, K. M. & Andolfatto, P. On the utility of short intron sequences as a reference for the detection of positive and negative selection in Drosophila. Mol. Biol. Evol. 27, 1226–1234 (2010).

  149.

    Ellegren, H., Smith, N. G. & Webster, M. T. Mutation rate variation in the mammalian genome. Curr. Opin. Genet. Dev. 13, 562–568 (2003).

  150.

    Charlesworth, J. & Eyre-Walker, A. The McDonald–Kreitman test and slightly deleterious mutations. Mol. Biol. Evol. 25, 1007–1015 (2008).

  151.

    Eyre-Walker, A. & Keightley, P. D. Estimating the rate of adaptive molecular evolution in the presence of slightly deleterious mutations and population size change. Mol. Biol. Evol. 26, 2097–2108 (2009).

  152.

    Hurst, L. D. The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet. 18, 486–487 (2002).

  153.

    van Nimwegen, E. Scaling laws in the functional content of genomes. Trends Genet. 19, 479–484 (2003). A key study that reveals distinct scaling laws for different functional classes of genes and their virtual universality across a broad range of taxa.

  154.

    Molina, N. & van Nimwegen, E. Scaling laws in functional genome content across prokaryotic clades and lifestyles. Trends Genet. 25, 243–247 (2009).

  155.

    Maslov, S., Krishna, S., Pang, T. Y. & Sneppen, K. Toolbox model of evolution of prokaryotic metabolic networks and their regulation. Proc. Natl Acad. Sci. USA 106, 9743–9748 (2009). A simple model of evolution of metabolic networks that explains the universal scaling laws for regulators and enzymes.

  156.

    Lipman, D. J. & Wilbur, W. J. Modelling neutral and selective evolution of protein folding. Proc. Biol. Sci. 245, 7–11 (1991).

  157.

    Drummond, D. A., Bloom, J. D., Adami, C., Wilke, C. O. & Arnold, F. H. Why highly expressed proteins evolve slowly. Proc. Natl Acad. Sci. USA 102, 14338–14343 (2005).

  158.

    Kramer, E. B. & Farabaugh, P. J. The frequency of translational misreading errors in E. coli is largely determined by tRNA competition. RNA 13, 87–96 (2007).

  159.

    Whitehead, D. J., Wilke, C. O., Vernazobres, D. & Bornberg-Bauer, E. The look-ahead effect of phenotypic mutations. Biol. Direct 3, 18 (2008). A modelling study that demonstrates the possibility of evolutionary capacitation through synergistic interactions between mutations and errors of transcription and translation (phenotypic mutations).


Acknowledgements

The authors thank A. Lobkovsky for providing part of the data used in the figure in Box 3 and T. Senkevich for critical reading of the manuscript. We apologize to the many colleagues whose work is not cited here because of space constraints. The authors' research is funded by the Intramural Research Program of the US Department of Health and Human Services (National Library of Medicine, US National Institutes of Health).

Author information

Corresponding author

Correspondence to Eugene V. Koonin.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Glossary

Robustness

The ability to maintain a phenotype or function in the presence of internal or external perturbations.

Purifying selection

(Also known as negative or stabilizing selection.) Mode of natural selection that eliminates deleterious mutations and preserves the status quo; in protein-coding genes, it is manifested as Ka/Ks << 1.

Non-synonymous substitutions

Nucleotide substitutions in protein-coding genes that lead to amino acid changes in the encoded protein.

Synonymous substitutions

Nucleotide substitutions in protein-coding genes that occur in synonymous positions of codons and accordingly do not lead to amino acid changes in the encoded protein.

Positive selection

(Also known as directional or Darwinian selection.) Mode of natural selection that increases the frequency of initially rare beneficial alleles in a population; in protein-coding genes, this often leads to Ka > Ks.

Ultraconserved elements

Sequences in animal genomes that have retained their identity throughout long evolutionary spans, such as the entire course of vertebrate evolution.

Evolutionary domains

Distinct units of gene/protein evolution that form combinations with varying degrees of evolutionary stability. Evolutionary domains may or may not correspond to structural domains (that is, an evolutionary domain could encompass one or more structural domains).

Promiscuous domain

A protein domain that combines with diverse other domains in numerous proteins, providing malleable connections in interaction and regulatory networks and complexes.

Orthologues

Genes that evolved from a single ancestral gene in the last common ancestor of the compared genomes (in contrast to paralogues).

Selfish operon concept

A hypothesis according to which the presence of the same or similar operons in different prokaryotes is due more to the horizontal transfer of operons as distinct units than to selection for co-expression and co-regulation. When a transferred piece of DNA includes an entire operon consisting of genes encoding a complete pathway or functional system, the chances of fixation dramatically increase.

Minimal gene set for cellular life

The minimal set of genes that is sufficient to maintain a functional cell.

Non-orthologous gene displacement

The utilization of unrelated or distantly related (not orthologous) genes for the same function.

Toolbox model of evolution

A model according to which enzymes for utilizing new metabolites, together with their dedicated regulators, are added (primarily by horizontal gene transfer) to a progressively versatile reaction network. Because of the growing complexity of the pre-existing network that provides enzymes for intermediate reactions, the ratio of regulators to regulated genes grows steadily.

Paralogous gene families

Gene families that evolved by duplication.

Neutral sequence network

A network of sequences connected by effectively single-step mutation distances (although not necessarily by single replacements), and in which there is a negligible fitness difference between neighbours.

Evolutionary anticipation

(Also known as the look-ahead effect.) A scenario for the evolution of complex traits that require multiple mutations. In this scenario, the fixation of the final, beneficial mutation that leads to the emergence of the complex feature is enabled by a preceding random mutational walk over the neutral sequence network or by phenotypic mutations, such as mistranslation.

Experimental evolution

The evolution of organisms with precisely defined genetic backgrounds and known evolutionary histories under controlled laboratory conditions.

Epistasis

When non-allelic genes interact to produce a joint phenotype that differs from the one that would have been produced if the two genes had acted independently.

Pleiotropy

Describes the multiple functions or mutation consequences of a single gene.

Fitness landscape

A multidimensional surface defining the relationships between the fitness and the genotype spaces.

Fitness seascape

A generalization of the concept of a fitness landscape, in which the dependence of fitness on sequence evolves over time.

Effective population size

The size of an idealized panmictic population whose evolutionary behaviour is equivalent to that of the analysed population.

Pathogenicity islands

Large clusters of genes in bacterial genomes that are typically transferred horizontally and contain pathogenicity determinants.
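
The Ka/Ks criteria and the notion of epistasis defined in this glossary can be illustrated with a minimal Python sketch. The function names, the tolerance band around Ka/Ks = 1 and the multiplicative null model for independent mutations are assumptions made for illustration; they are not taken from the article or from any particular software package.

```python
def classify_selection(ka: float, ks: float, tol: float = 0.1) -> str:
    """Label the selection regime suggested by a Ka/Ks ratio.

    ka, ks: non-synonymous and synonymous substitution rates.
    tol: hypothetical tolerance band around 1 (an assumption, not from the article).
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the Ka/Ks ratio")
    ratio = ka / ks
    if ratio > 1 + tol:
        return "positive"      # Ka > Ks
    if ratio < 1 - tol:
        return "purifying"     # Ka/Ks below 1
    return "near-neutral"      # ratio indistinguishable from 1 within tol


def epistasis(w_ab: float, w_a: float, w_b: float) -> float:
    """Deviation of double-mutant fitness from the multiplicative expectation.

    All fitness values are relative to a wild type with fitness 1.
    Assumes independent mutations combine multiplicatively (an illustrative choice).
    """
    return w_ab - w_a * w_b


if __name__ == "__main__":
    print(classify_selection(0.05, 0.5))   # ratio 0.1
    print(classify_selection(0.6, 0.5))    # ratio 1.2
    print(epistasis(1.0, 0.9, 0.9))        # double mutant fitter than expected
```

A value of zero from `epistasis` corresponds to the two mutations acting independently under this null model; any non-zero value indicates an interaction. In practice, a single genome-wide cut-off for Ka/Ks is a simplification, because selection can vary between sites within a gene.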

About this article

Cite this article

Koonin, E., Wolf, Y. Constraints and plasticity in genome and molecular-phenome evolution. Nat Rev Genet 11, 487–498 (2010). https://doi.org/10.1038/nrg2810
