  • Review Article

Structural and functional constraints in the evolution of protein families

Key Points

  • Protein evolution is a balance between Darwinian selection for functionally advantageous mutations and neutral evolution, in which the acceptance of amino acid substitutions is constrained by the requirement for proper protein structure and function.

  • Comparative analyses of homologous proteins allow conserved features in both sequence and structure to be identified, along with constraints that give rise to distinct patterns of protein evolution.

  • The local structural environment of amino acids in the three-dimensional structures of proteins influences the probability of substitution during protein evolution. Solvent accessibility is the most important determinant, followed by the existence of hydrogen bonds from side-chain to main-chain groups and the nature of the element of secondary structure to which the amino acid contributes.

  • Solvent-inaccessible polar side chains impose strong structural and functional constraints on the evolution of protein families and can give rise to characteristic architectural motifs that arise from the need to satisfy hydrogen-bonding potential.

  • Functional constraints operate through the requirement to maintain the interaction of proteins with other macromolecules in assemblies or with substrates, ligands or allosteric regulators.

  • Functional residues are under greater pressure to be conserved throughout evolution because they remain crucial to the activity of proteins and thus to the selective advantage of the organism.

  • Structural and functional constraints in the evolution of protein families can be illustrated by the roles and properties of individual amino acids in the three-dimensional structure of proteins. Although this residue-level analysis is an essential prerequisite to understanding protein evolution, further insights will depend on integrated, multidisciplinary systems approaches.
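
The key point on local structural environment can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not the authors' method: it labels each residue with a coarse environment (burial, side-chain to main-chain hydrogen bond, secondary structure) and tallies how often aligned residues are conserved within each environment. The function names, the 0.07 burial cutoff and the toy alignment are all hypothetical, and the three-feature label is much coarser than the 64 environments mentioned in Figure 1.

```python
from collections import Counter, defaultdict


def classify_environment(rel_accessibility, sidechain_mainchain_hbond,
                         secondary_structure, cutoff=0.07):
    """Return a coarse environment label for one residue.

    cutoff is an illustrative burial threshold on relative side-chain
    accessibility (0..1); it is an assumption, not a value from the article.
    """
    burial = "buried" if rel_accessibility < cutoff else "exposed"
    hbond = "hbond" if sidechain_mainchain_hbond else "no_hbond"
    return (burial, hbond, secondary_structure)


def tally_substitutions(aligned_seqs, environments):
    """Count ordered residue pairs per environment of the first residue.

    aligned_seqs: equal-length strings, '-' marks a gap.
    environments: one list of labels per sequence, one label per column.
    """
    counts = defaultdict(Counter)
    for i, seq_a in enumerate(aligned_seqs):
        for j, seq_b in enumerate(aligned_seqs):
            if i == j:
                continue
            for col, (a, b) in enumerate(zip(seq_a, seq_b)):
                if a == "-" or b == "-":
                    continue  # skip gapped positions
                counts[environments[i][col]][(a, b)] += 1
    return counts


def conservation_fraction(counts):
    """Fraction of aligned pairs that are identical, per environment."""
    fractions = {}
    for env, pairs in counts.items():
        total = sum(pairs.values())
        same = sum(n for (a, b), n in pairs.items() if a == b)
        fractions[env] = same / total
    return fractions


# Toy data, invented for illustration only: columns 1-2 buried, 3-4 exposed.
core = classify_environment(0.02, True, "H")
surface = classify_environment(0.60, False, "C")
toy_alignment = ["ADLK", "ADIR", "AELK"]
toy_environments = [[core, core, surface, surface]] * 3

fractions = conservation_fraction(
    tally_substitutions(toy_alignment, toy_environments))
for env, frac in fractions.items():
    print(env, round(frac, 2))
```

On this invented alignment the buried class comes out more conserved than the exposed class by construction. Meaningful environment-specific tables would need many structurally aligned families and properly computed accessibility and hydrogen-bond annotations.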

Abstract

High-throughput genomic sequencing has focused attention on understanding differences between species and between individuals. When this genetic variation affects protein sequences, the rate of amino acid substitution reflects both Darwinian selection for functionally advantageous mutations and selectively neutral evolution operating within the constraints of structure and function. During neutral evolution, whereby mutations accumulate by random drift, amino acid substitutions are constrained by factors such as the formation of intramolecular and intermolecular interactions and the accessibility to water or lipids surrounding the protein. These constraints arise from the need to conserve a specific architecture and to retain interactions that mediate functions in protein families and superfamilies.

Figure 1: Results of hierarchical clustering of 64 environments.
Figure 2: Structural constraints on pepsin-like aspartic proteinases.
Figure 3: Conserved polar residues that span secondary structures.
Figure 4: Conserved polar residues that support loops.

References

  1. Bajaj, M. & Blundell, T. Evolution and the tertiary structure of proteins. Annu. Rev. Biophys. Bioeng. 13, 453–492 (1984).

    Article  CAS  PubMed  Google Scholar 

  2. Chothia, C. & Lesk, A. M. The relation between the divergence of sequence and structure in proteins. EMBO J. 5, 823–826 (1986). This paper quantifies the relationship between sequence variance and structural tolerance.

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  3. Kimura, M. Evolutionary rate at the molecular level. Nature 217, 624–626 (1968). The first paper to introduce the neutral theory of evolution.

    Article  CAS  PubMed  Google Scholar 

  4. Ohta, T. Slightly deleterious mutant substitutions in evolution. Nature 246, 96–98 (1973). Introduces the nearly neutral theory of molecular evolution, a modification of that detailed in reference 3.

    Article  CAS  PubMed  Google Scholar 

  5. Zuckerkandl, E. Evolutionary processes and evolutionary noise at the molecular level. I. Functional density in proteins. J. Mol. Evol. 7, 167–183 (1976).

    Article  CAS  PubMed  Google Scholar 

  6. Zuckerkandl, E. Evolutionary processes and evolutionary noise at the molecular level. II. A selectionist model for random fixations in proteins. J. Mol. Evol. 7, 269–311 (1976).

    Article  CAS  PubMed  Google Scholar 

  7. Fraser, H. B., Hirsh, A. E., Steinmetz, L. M., Scharfe, C. & Feldman, M. W. Evolutionary rate in the protein interaction network. Science 296, 750–752 (2002).

    Article  CAS  PubMed  Google Scholar 

  8. Bloom, J. D. & Adami, C. Apparent dependence of protein evolutionary rate on number of interactions is linked to biases in protein-protein interactions data sets. BMC Evol. Biol. 3, 21 (2003).

    Article  PubMed Central  PubMed  Google Scholar 

  9. Jordan, I. K., Wolf, Y. I. & Koonin, E. V. No simple dependence between protein evolution rate and the number of protein–protein interactions: only the most prolific interactors tend to evolve slowly. BMC Evol. Biol. 3, 1 (2003).

    Article  PubMed Central  PubMed  Google Scholar 

  10. Orengo, C. A. & Thornton, J. M. Protein families and their evolution — a structural perspective. Annu. Rev. Biochem. 74, 867–900 (2005).

    Article  CAS  PubMed  Google Scholar 

  11. Bullock, A. N. et al. Thermodynamic stability of wild-type and mutant p53 core domain. Proc. Natl Acad. Sci. USA 94, 14338–14342 (1997). An elegant study that applied techniques initially devised to study the biophysics of protein folding to mutations in the protein p53, demonstrating that most of these changes are destabilizing.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Canadillas, J. M. et al. Solution structure of p53 core domain: structural basis for its instability. Proc. Natl Acad. Sci. USA 103, 2109–2114 (2006).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Friedler, A., Veprintsev, D. B., Hansson, L. O. & Fersht, A. R. Kinetic instability of p53 core domain mutants: implications for rescue by small molecules. J. Biol. Chem. 278, 24108–24112 (2003).

    Article  CAS  PubMed  Google Scholar 

  14. Joerger, A. C., Allen, M. D. & Fersht, A. R. Crystal structure of a superstable mutant of human p53 core domain. Insights into the mechanism of rescuing oncogenic mutations. J. Biol. Chem. 279, 1291–1296 (2004).

    Article  CAS  PubMed  Google Scholar 

  15. Nikolova, P. V., Henckel, J., Lane, D. P. & Fersht, A. R. Semirational design of active tumor suppressor p53 DNA binding domain with enhanced stability. Proc. Natl Acad. Sci. USA 95, 14675–14680 (1998).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  16. Wang, X., Minasov, G. & Shoichet, B. K. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J. Mol. Biol. 320, 85–95 (2002).

    Article  CAS  PubMed  Google Scholar 

  17. Aharoni, A. The 'evolvability' of promiscuous protein functions. Nature Genet. 37, 73–76 (2005). An original study on the evolution of new protein functions that shows that the process is driven by mutations having little effect on native function but large effects on promiscuous function.

    Article  CAS  PubMed  Google Scholar 

  18. Aharoni, A. et al. Directed evolution of mammalian paraoxonases PON1 and PON3 for bacterial expression and catalytic specialization. Proc. Natl Acad. Sci. USA 101, 482 (2004).

    Article  CAS  PubMed  Google Scholar 

  19. Andreeva, A. & Murzin, A. G. Evolution of protein fold in the presence of functional constraints. Curr. Opin. Struct. Biol. 16, 399–408 (2006). A review of the mechanisms by which a protein fold can evolve whilst maintaining the functional-site structure.

    Article  CAS  PubMed  Google Scholar 

  20. Caetano-Anollés, G., Wang, M., Caetano- Anollés, D. & Mittenthal, J. E. The origin, evolution and structure of the protein world. Biochem. J. 417, 621–637 (2009).

    Article  CAS  PubMed  Google Scholar 

  21. Copley, R. R., Letunic, I. & Bork, P. Genome and protein evolution in eukaryotes. Curr. Opin. Chem. Biol. 6, 39–45 (2002).

    Article  CAS  PubMed  Google Scholar 

  22. Kinch, L. N. & Grishin, N. V. Evolution of protein structures and functions. Curr. Opin. Struct. Biol. 12, 400–408 (2002).

    Article  CAS  PubMed  Google Scholar 

  23. Pal, C., Papp, B. & Lercher, M. J. An integrated view of protein evolution. Nature Rev. Genet. 7, 337–348 (2006). A comprehensive review of various approaches to study protein evolution.

    Article  CAS  PubMed  Google Scholar 

  24. Koonin, E. V. Orthologs, paralogs, and evolutionary genomics. Annu. Rev. Genet. 39, 309–338 (2005).

    Article  CAS  PubMed  Google Scholar 

  25. Hubbard, T. J. & Blundell, T. L. Comparison of solvent-inaccessible cores of homologous proteins: definitions useful for protein modelling. Protein Eng. 1, 159–171 (1987).

    Article  CAS  PubMed  Google Scholar 

  26. Garnier, J., Osguthorpe, D. J. & Robson, B. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol. 120, 97–120 (1978).

    Article  CAS  PubMed  Google Scholar 

  27. Gibrat, J. F., Garnier, J. & Robson, B. Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J. Mol. Biol. 198, 425–443 (1987).

    Article  CAS  PubMed  Google Scholar 

  28. Levin, J. M., Robson, B. & Garnier, J. An algorithm for secondary structure determination in proteins based on sequence similarity. FEBS Lett. 205, 303 (1986).

    Article  CAS  PubMed  Google Scholar 

  29. Pauling, L. & Corey, R. B. Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets. Proc. Natl Acad. Sci. USA 37, 729–740 (1951).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Pauling, L., Corey, R. B. & Branson, H. R. The structure of proteins; two hydrogen-bonded helical configurations of the polypeptide chain. Proc. Natl Acad. Sci. USA 37, 205–211 (1951). References 29 and 30 provided the first hint that regular secondary structure might form in folded proteins.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  31. Hutchinson, E. G. & Thornton, J. M. A revised set of potentials for β-turn formation in proteins. Protein Sci. 3, 2207–2216 (1994).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  32. Sibanda, B. L., Blundell, T. L. & Thornton, J. M. Conformation of β-hairpins in protein structures. A systematic classification with applications to modelling by homology, electron density fitting and protein engineering. J. Mol. Biol. 206, 759–777 (1989).

    Article  CAS  PubMed  Google Scholar 

  33. Wilmot, C. M. & Thornton, J. M. Analysis and prediction of the different types of β-turn in proteins. J. Mol. Biol. 203, 221–232 (1988).

    Article  CAS  PubMed  Google Scholar 

  34. Baker, E. N. & Hubbard, R. E. Hydrogen bonding in globular proteins. Prog. Biophys. Mol. Biol. 44, 97–179 (1984). The first comprehensive survey of hydrogen bonds in high-resolution protein structures.

    Article  CAS  PubMed  Google Scholar 

  35. Presta, L. G. & Rose, G. D. Helix signals in proteins. Science 240, 1632–1641 (1988).

    Article  CAS  PubMed  Google Scholar 

  36. Richardson, J. S. & Richardson, D. C. Amino acid preferences for specific locations at the ends of α helices. Science 240, 1648–1652 (1988).

    Article  CAS  PubMed  Google Scholar 

  37. Wan, W. Y. & Milner-White, E. J. A recurring two-hydrogen-bond motif incorporating a serine or threonine residue is found both at α-helical N termini and in other situations. J. Mol. Biol. 286, 1651–1662 (1999).

    Article  CAS  PubMed  Google Scholar 

  38. Wan, W. Y. & Milner-White, E. J. A natural grouping of motifs with an aspartate or asparagine residue forming two hydrogen bonds to residues ahead in sequence: their occurrence at α-helical N termini and in other situations. J. Mol. Biol. 286, 1633–1649 (1999).

    Article  CAS  PubMed  Google Scholar 

  39. Chan, A. W. E., Hutchinson, E. G. & Thornton, J. M. Identification, classification, and analysis of β-bulges in proteins. Protein Sci. 2, 1574–1590 (1993).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  40. Richardson, J. S., Getzoff, E. D. & Richardson, D. C. The β bulge: a common small unit of nonrepetitive protein structure. Proc. Natl Acad. Sci. USA 75, 2574–2578 (1978).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Barlow, D. J. & Thornton, J. M. Helix geometry in proteins. J. Mol. Biol. 201, 601–619 (1988).

    Article  CAS  PubMed  Google Scholar 

  42. Eswar, N. & Ramakrishnan, C. Secondary structures without backbone: an analysis of backbone mimicry by polar side chains in protein structures. Protein Eng. 12, 447–455 (1999).

    Article  CAS  PubMed  Google Scholar 

  43. Cubellis, M. V., Caillez, F., Blundell, T. L. & Lovell, S. C. Properties of polyproline II, a secondary structure element implicated in protein–protein interactions. Proteins 58, 880–892 (2005).

    Article  CAS  PubMed  Google Scholar 

  44. Stapley, B. J. & Creamer, T. P. A survey of left-handed polyproline II helices. Protein Sci. 8, 587–595 (1999).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  45. Milner-White, E., Ross, B. M., Ismail, R., Belhadj-Mostefa, K. & Poet, R. One type of γ-turn, rather than the other gives rise to chain-reversal in proteins. J. Mol. Biol. 204, 777–782 (1988).

    Article  CAS  PubMed  Google Scholar 

  46. Milner-White, E. J. β-bulges within loops as recurring features of protein structure. Biochim. Biophys. Acta 911, 261–265 (1987).

    Article  CAS  PubMed  Google Scholar 

  47. Blundell, T. L. & Wood, S. P. Is the evolution of insulin Darwinian or due to selectively neutral mutation? Nature 257, 197–203 (1975). An early paper discussing the evolution of protein structure and interactions in terms of adaptive processes and neutral mutations.

    Article  CAS  PubMed  Google Scholar 

  48. Guharoy, M. & Chakrabarti, P. Conservation and relative importance of residues across protein–protein interfaces. Proc. Natl Acad. Sci. USA 102, 15447–15452 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  49. Kisters-Woike, B., Vangierdegom, C. & Mueller-Hill, B. On the conservation of protein sequences in evolution. Trends Biochem. Sci. 25, 419–421 (2000).

    Article  CAS  PubMed  Google Scholar 

  50. Lichtarge, O., Bourne, H. R. & Cohen, F. E. Evolutionarily conserved Gαβγ binding surfaces support a model of the G protein-receptor complex. Proc. Natl Acad. Sci. USA 93, 7507–7511 (1996).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  51. Chelliah, V., Chen, L., Blundell, T. L. & Lovell, S. C. Distinguishing structural and functional restraints in evolution in order to identify interaction sites. J. Mol. Biol. 342, 1487–1504 (2004).

    Article  CAS  PubMed  Google Scholar 

  52. Blundell, T. L. et al. in Methods in Proteins Sequence Analysis (eds Jornvall, H. Hoog, J.O. Gustavsson, A.M.) 373–385 (Birkhauser, Basel, 1991).

    Book  Google Scholar 

  53. Overington, J., Johnson, M. S., Sali, A. & Blundell, T. L. Tertiary structural constraints on protein evolutionary diversity: templates, key residues and structure prediction. Proc. Biol. Sci. 241, 132–145 (1990). The first study to quantify structural restraints on amino acid substitutions between homologous proteins, identifying particular patterns of substitution.

    Article  CAS  PubMed  Google Scholar 

  54. Overington, J., Donnelly, D., Johnson, M. S., Sali, A. & Blundell, T. L. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci. 1, 216–226 (1992).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  55. Michener, C. D. & Sokal, R. R. A quantitative approach to a problem in classification. Evolution 11, 130 (1957).

    Article  PubMed Central  Google Scholar 

  56. Bloom, J. D., Labthavikul, S. T., Otey, C. R. & Arnold, F. H. Protein stability promotes evolvability. Proc. Natl Acad. Sci. USA 103, 5869–5874 (2006).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  57. Bloom, J. D. et al. Thermodynamic prediction of protein neutrality. Proc. Natl Acad. Sci. USA 102, 606–611 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  58. Deane, C. M., Allen, F. H., Taylor, R. & Blundell, T. L. Carbonyl–carbonyl interactions stabilize the partially allowed Ramachandran conformations of asparagine and aspartic acid. Protein Eng. 12, 1025–1028 (1999).

    Article  CAS  PubMed  Google Scholar 

  59. Gong, S. & Blundell, T. L. Discarding functional residues from the substitution table improves predictions of active sites within three-dimensional structures. PLoS Comput. Biol. 4, e1000179 (2008).

    Article  PubMed Central  PubMed  CAS  Google Scholar 

  60. Schell, D., Tsai, J., Scholtz, J. M. & Pace, C. N. Hydrogen bonding increases packing density in the protein interior. Proteins 63, 278–282 (2006).

    Article  CAS  PubMed  Google Scholar 

  61. Pace, C. N. Polar group burial contributes more to protein stability than nonpolar group burial. Biochemistry 16, 310–313 (2001).

    Article  CAS  Google Scholar 

  62. Fleming, P. J. & Rose, G. D. Do all backbone polar groups in proteins form hydrogen bonds? Protein Sci. 14, 1911–1917 (2005).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  63. McDonald, I. K. & Thornton, J. M. Satisfying hydrogen bonding potential in proteins. J. Mol. Biol. 238, 777–793 (1994).

    Article  CAS  PubMed  Google Scholar 

  64. Worth, C. L. & Blundell, T. L. Satisfaction of hydrogen-bonding potential influences the conservation of polar sidechains. Proteins 75, 413–429 (2009).

    Article  CAS  PubMed  Google Scholar 

  65. Eswar, N. & Ramakrishnan, C. Deterministic features of side-chain main-chain hydrogen bonds in globular protein structures. Protein Eng. 13, 227–238 (2000).

    Article  CAS  PubMed  Google Scholar 

  66. Vijayakumar, M., Qian, H. & Zhou, H. X. Hydrogen bonds between short polar side chains and peptide backbone: prevalence in proteins and effects on helix-forming propensities. Proteins 34, 497–507 (1999).

    Article  CAS  PubMed  Google Scholar 

  67. Hamill, S. J., Cota, E., Chothia, C. & Clarke, J. Conservation of folding and stability within a protein family: the tyrosine corner as an evolutionary cul-de-sac. J. Mol. Biol. 295, 641–649 (2000).

    Article  CAS  PubMed  Google Scholar 

  68. Bordo, D. & Argos, P. The role of side-chain hydrogen bonds in the formation and stabilization of secondary structure in soluble proteins. J. Mol. Biol. 243, 504–519 (1994).

    Article  CAS  PubMed  Google Scholar 

  69. Nicholson, H., Anderson, D. E., Dao-pin, S. & Matthews, B. W. Analysis of the interaction between charged side chains and the α-helix dipole using designed thermostable mutants of phage T4 lysozyme. Biochemistry 30, 9816–9828 (1991).

    Article  CAS  PubMed  Google Scholar 

  70. Mizuguchi, K., Deane, C. M., Blundell, T. L. & Overington, J. P. HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci. 7, 2469–2471 (1998).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  71. Harper, E. T. & Rose, G. D. Helix stop signals in proteins and peptides: the capping box. Biochemistry 32, 7605–7609 (1993).

    Article  CAS  PubMed  Google Scholar 

  72. Serrano, L., Sancho, J., Hirshberg, M. & Fersht, A. R. α-Helix stability in proteins. I. Empirical correlations concerning substitution of side-chains at the N and C-caps and the replacement of alanine by glycine or serine at solvent-exposed surfaces. J. Mol. Biol. 227, 544–559 (1992).

    Article  CAS  PubMed  Google Scholar 

  73. Burley, S. K. & Petsko, G. A. Aromatic–aromatic interaction — a mechanism of protein-structure stabilization. Science 229, 23–28 (1985).

    Article  CAS  PubMed  Google Scholar 

  74. Hunter, C. A., Singh, J. & Thornton, J. M. Pi–Pi-interactions — the geometry and energetics of phenylalanine phenylalanine interactions in proteins. J. Mol. Biol. 218, 837–846 (1991).

    Article  CAS  PubMed  Google Scholar 

  75. Burley, S. K. & Petsko, G. A. Amino-aromatic interactions in proteins. FEBS Lett. 203, 139–143 (1986).

    Article  CAS  PubMed  Google Scholar 

  76. Mitchell, J. B. O., Nandi, C. L., Mcdonald, I. K., Thornton, J. M. & Price, S. L. Amino/aromatic interactions in proteins — is the evidence stacked against hydrogen-bonding. J. Mol. Biol. 239, 315–331 (1994).

    Article  CAS  PubMed  Google Scholar 

  77. Gallivan, J. P. & Dougherty, D. A. Cation–π interactions in structural biology. Proc. Natl Acad. Sci. USA 96, 9459–9464 (1999).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  78. Ortlund, E. A., Bridgham, J. T., Redinbo, M. R. & Thornton, J. W. Crystal structure of an ancient protein: evolution by conformational epistasis. Science 317, 1544–1548 (2007).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  79. Shakhnovich, E., Abkevich, V. & Ptitsyn, O. Conserved residues and the mechanism of protein folding. Nature 379, 96–98 (1996). The presentation of a novel computational method for identifying the residues that form the folding nucleus of a protein.

    Article  CAS  PubMed  Google Scholar 

  80. Itzhaki, L. S., Otzen, D. E. & Fersht, A. R. The structure of the transition state for folding of chymotrypsin inhibitor 2 analysed by protein engineering methods: evidence for a nucleation-condensation mechanism for protein folding. J. Mol. Biol. 254, 260–288 (1995). Introduced the nucleation–condensation model of protein folding from experimental work in chymotrypsin inhibitor 2.

    Article  CAS  PubMed  Google Scholar 

  81. Mirny, L. A., Abkevich, V. I. & Shakhnovich, E. I. How evolution makes proteins fold quickly. Proc. Natl Acad. Sci. USA 95, 4976–4981 (1998).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  82. Mirny, L. A. & Shakhnovich, E. I. Universally conserved positions in protein folds: reading evolutionary signals about stability, folding kinetics and function. J. Mol. Biol. 291, 177–196 (1999).

    Article  CAS  PubMed  Google Scholar 

  83. Plaxco, K. W. et al. Evolutionary conservation in protein folding kinetics. J. Mol. Biol. 298, 303 (2000).

    Article  CAS  PubMed  Google Scholar 

  84. Larson, S. M., Ruczinski, I., Davidson, A. R., Baker, D. & Plaxco, K. W. Residues participating in the protein folding nucleus do not exhibit preferential evolutionary conservation. J. Mol. Biol. 316, 225–233 (2002).

    Article  CAS  PubMed  Google Scholar 

  85. Tseng, Y. Y. & Liang, J. Are residues in a protein folding nucleus evolutionarily conserved? J. Mol. Biol. 335, 869–880 (2004).

    Article  CAS  PubMed  Google Scholar 

  86. Li, L., Mirny, L. A. & Shakhnovich, E. I. Kinetics, thermodynamics and evolution of non-native interactions in a protein folding nucleus. Nature Struct. Biol. 7, 336–342 (2000).

    Article  CAS  PubMed  Google Scholar 

  87. Kim, W. K., Bolser, D. M. & Park, J. H. Large-scale co-evolution analysis of protein structural interlogues using the global protein structural interactome map (PSIMAP). Bioinformatics 20, 1138–1150 (2004).

    Article  CAS  PubMed  Google Scholar 

  88. Pazos, F. & Valencia, A. Protein co-evolution, co-adaptation and interactions. EMBO J. 27, 2648–2655 (2008).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  89. Park, J. & Bolser, D. Conservation of protein interaction network in evolution. Genome Inform. 12, 135–140 (2001).

    CAS  PubMed  Google Scholar 

  90. Batada, N. N., Hurst, L. D. & Tyers, M. Evolutionary and physiological importance of hub proteins. PLoS Comput. Biol. 2, e88 (2006).

    Article  PubMed Central  PubMed  CAS  Google Scholar 

  91. Pal, C., Papp, B. & Hurst, L. D. Genomic function: rate of evolution and gene dispensability. Nature 421, 496–497 (2003).

    Article  CAS  PubMed  Google Scholar 

  92. Wall, D. P. et al. Functional genomic analysis of the rates of protein evolution. Proc. Natl Acad. Sci. USA 102, 5483–5488 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  93. Choi, J. K., Kim, S. C., Seo, J., Kim, S. & Bhak, J. Impact of transcriptional properties on essentiality and evolutionary rate. Genetics 175, 199–206 (2007).

    Article  PubMed Central  PubMed  Google Scholar 

  94. Drummond, D. A., Bloom, J. D., Adami, C., Wilke, C. O. & Arnold, F. H. Why highly expressed proteins evolve slowly. Proc. Natl Acad. Sci. USA 102, 14338–14343 (2005). This paper suggests that the expression level of a protein is related to the demand for exact folding.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  95. Drummond, D. A., Raval, A. & Wilke, C. O. A single determinant dominates the rate of yeast protein evolution. Mol. Biol. Evol. 23, 327–337 (2006).

    Article  CAS  PubMed  Google Scholar 

  96. Zeldovich, K. B. & Shakhnovich, E. I. Understanding protein evolution: from protein physics to Darwinian selection. Annu. Rev. Phys. Chem. 59, 105–127 (2008).

    Article  CAS  PubMed  Google Scholar 

  97. Akashi, H. Gene expression and molecular evolution. Curr. Opin. Genet. Dev. 11, 660–666 (2001).

    Article  CAS  PubMed  Google Scholar 

  98. Drummond, D. A. & Wilke, C. O. Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell 134, 341–352 (2008).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  99. Hamill, S. J., Steward, A. & Clarke, J. The folding of an immunoglobulin-like Greek key protein is defined by a common-core nucleus and regions constrained by topology. J. Mol. Biol. 297, 165 (2000).

    Article  CAS  PubMed  Google Scholar 

  100. Chiti, F. & Dobson, C. M. Protein misfolding, functional amyloid, and human disease. Annu. Rev. Biochem. 75, 333–366 (2006).

    Article  CAS  PubMed  Google Scholar 

  101. Hamada, D. et al. Competition between folding, native-state dimerisation and amyloid aggregation in β-lactoglobulin. J. Mol. Biol. 386, 878–890 (2009).

    Article  CAS  PubMed  Google Scholar 

  102. Goldberg, A. L. Protein degradation and protection against misfolded or damaged proteins. Nature 426, 895–899 (2003).

    Article  CAS  PubMed  Google Scholar 

  103. Wolffe, A. P. & Matzke, M. A. Epigenetics: regulation through repression. Science 286, 481–486 (1999).

    Article  CAS  PubMed  Google Scholar 

  104. Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia, C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536–540 (1995). Details the first protein hierarchical classification scheme.

    CAS  PubMed  Google Scholar 

  105. Orengo, C. A. et al. CATH–a hierarchic classification of protein domain structures. Structure 5, 1093–1108 (1997).

    Article  CAS  PubMed  Google Scholar 

  106. Bhaduri, A., Pugalenthi, G. & Sowdhamini, R. PASS2: an automated database of protein alignments organised as structural superfamilies. BMC Bioinformatics 5, 35 (2004).

    Article  PubMed Central  PubMed  Google Scholar 

  107. Worth, C. L. et al. A structural bioinformatics approach to the analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) and their relation to disease. J. Bioinform. Comput. Biol. 5, 1297–1318 (2007).

    Article  CAS  PubMed  Google Scholar 

  108. Holm, L., Kaariainen, S., Rosenstrom, P. & Schenkel, A. Searching protein structure databases with DaliLite v.3. Bioinformatics 24, 2780 (2008).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  109. Shindyalov, I. N. & Bourne, P. E. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng. 11, 739–747 (1998).

    Article  CAS  PubMed  Google Scholar 

  110. Marchler-Bauer, A. et al. MMDB: Entrez's 3D structure database. Nucleic Acids Res. 27, 240–243 (1999).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  111. Finn, R. D. et al. The Pfam protein families database. Nucleic Acids Res. 36, D281–D288 (2008).

    Article  CAS  PubMed  Google Scholar 

  112. Hunter, S. et al. InterPro: the integrative protein signature database. Nucleic Acids Res. 37, D211–D215 (2009).

    Article  CAS  PubMed  Google Scholar 

  113. Hulo, N. et al. The PROSITE database. Nucleic Acids Res. 34, D227–D230 (2006).

    Article  CAS  PubMed  Google Scholar 

  114. Attwood, T. K. et al. PRINTS and its automatic supplement, prePRINTS. Nucleic Acids Res. 31, 400–402 (2003).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  115. Servant, F. et al. ProDom: automated clustering of homologous domains. Brief. Bioinformatics 3, 246–251 (2002).

    Article  CAS  PubMed  Google Scholar 

  116. Schultz, J., Milpetz, F., Bork, P. & Ponting, C. P. SMART, a simple modular architecture research tool: identification of signaling domains. Proc. Natl Acad. Sci. USA 95, 5857–5864 (1998).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  117. Haft, D. H., Selengut, J. D. & White, O. The TIGRFAMs database of protein families. Nucleic Acids Res. 31, 371–373 (2003).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  118. Buchan, D. W. et al. Gene3D: structural assignments for the biologist and bioinformaticist alike. Nucleic Acids Res. 31, 469–473 (2003).

    Article  CAS  PubMed Central  PubMed  Google Scholar 

  119. Wilson, D. et al. SUPERFAMILY — sophisticated comparative genomics, data mining, visualization and phylogeny. Nucleic Acids Res. 37, D380–D386 (2009).

    Article  CAS  PubMed  Google Scholar 

  120. Krishnamurthy, N., Brown, D., Kirshner, D. & Sjolander, K. PhyloFacts: an online structural phylogenomic encyclopedia for protein functional and structural classification. Genome Biol. 7, R83 (2006).

    Article  PubMed Central  PubMed  CAS  Google Scholar 

  121. Marchler-Bauer, A. et al. CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res. 35, D237–D240 (2007).

    Article  CAS  PubMed  Google Scholar 

  122. Heger, A. et al. PairsDB atlas of protein sequence space. Nucleic Acids Res. 36, D276–D280 (2008).

    Article  CAS  PubMed  Google Scholar 

  123. Orengo, C. A., Stilltoe, I., Reeves, G. & Pearl, F. M. G. What can structural classifications reveal about protein evolution? J. Struct. Biol. 134, 145–165 (2001).

    Article  CAS  PubMed  Google Scholar 

  124. Mizuguchi, K., Deane, C. M., Blundell, T. L., Johnson, M. S. & Overington, J. P. JOY: protein sequence-structure representation and analysis. Bioinformatics 14, 617–623 (1998).

    Article  CAS  PubMed  Google Scholar 

  125. Dayhoff, M. O. & Eck, R. V. in Atlas of Protein Sequence and Structure 1967–1968 33–45 (National Biomedical Research Foundation, Silver Spring, Maryland, 1968).

    Google Scholar 

  126. Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci. USA 89, 10915–10919 (1992).

  127. Lee, S. & Blundell, T. L. Ulla: a program for calculating environment-specific amino acid substitution tables. Bioinformatics 25, 1976–1977 (2009).

  128. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127–128 (2007).

Acknowledgements

C.L.W. was funded by a Biotechnology and Biological Sciences Research Council studentship. S.G. was supported by the BiO foundation. T.L.B. is funded by the Wellcome Trust.

Author information

Corresponding author

Correspondence to Tom L. Blundell.

Related links

DATABASES

PDB

1gk8

1m2z

1tme

2fgf

3app

5pal

FURTHER INFORMATION

The Blundell group's homepage

CATH

CDD

CE

Dali

ESSTs

Gene3D

HOMSTRAD

InterPro

JOY

MMDB

PairsDB

PASS2

Pfam

PhyloFacts

Prints

ProDom

PROSITE

PyMOL

SCOP

SMART

SUPERFAMILY

TIGRFAMs

Toccata

Ulla

Glossary

Chaperone

A protein that assists in the folding or unfolding and the assembly or disassembly of other macromolecular structures.

Neutral drift

The process whereby random sampling effects over successive generations give rise to stochastic changes in the allele frequencies within a population.

β-lactamase

An enzyme produced by some bacteria that confers resistance to β-lactam antibiotics.

Constraint

A structural and dynamic system, or functional factor, that influences the acceptance of amino acid substitutions that occur in divergent protein families. Given that selection occurs at the level of the organism and that individual proteins and the systems in which they evolve are plastic, these constraints tend not to 'force' but rather to 'restrain' the substitutions that occur in evolution.

Orthologues

Genes (or gene products) descended from a common ancestral origin that diverged as a result of a speciation event.

Hydrogen bonding potential

The capacity of atoms to act as proton donors or acceptors in the formation of hydrogen bonds.

Jelly roll

An eight-stranded β-sandwich that is formed by four Greek key motifs, each consisting of four sequential antiparallel β-strands.

β-propeller

An all-β protein architecture comprising four to eight blade-shaped β-sheets arranged toroidally around a central axis.

α-helical bundle

A protein fold consisting of multiple α-helices that are approximately parallel to one another.

αβ-Rossmann fold

Two repeating β–α–β super-secondary motifs.

Distance matrix

An n×n array that represents the pairwise distances between the members of a set of n elements.
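
As a minimal illustrative sketch (not taken from the article), the Python below builds such an array for n points, using the Euclidean distance as the distance measure. The function name distance_matrix and the example coordinates are hypothetical and chosen only for illustration.

```python
import math


def distance_matrix(points):
    """Return the n x n array of pairwise Euclidean distances.

    `points` is a sequence of equal-length coordinate tuples.
    The result is symmetric with zeros on the diagonal.
    """
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])  # requires Python 3.8+
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Hypothetical coordinates, not taken from any real structure.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dm = distance_matrix(coords)
```

Because the array is symmetric, only the upper triangle is computed and then mirrored.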

Positive φ main-chain torsion angle

A positive dihedral angle around the nitrogen–α-carbon bond in the protein main chain. For L-amino acids this torsion angle is generally restricted to negative values owing to steric hindrance from the side chain, but it can be positive when there is no side chain (Gly) or when polar side-chain interactions with the main-chain peptide units stabilize this conformation.
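
As an illustrative sketch (not part of the article), the sign of a torsion angle can be computed from four atomic positions; for φ of residue i these are C(i−1), N(i), Cα(i) and C(i). The function name torsion_angle and the coordinates below are hypothetical, and the formula is the standard atan2 form of the dihedral angle.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    For phi, pass the positions of C(i-1), N(i), CA(i), C(i).
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates chosen to give right-angle torsions.
positive = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
negative = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, -1, 1))
```

Mirroring the fourth atom across the plane of the first three flips the sign of the angle, which is the distinction the glossary entry draws between positive and negative φ.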

van der Waals interaction

A weak electrostatic interaction that is formed by the fluctuating electron clouds of two atoms.

Tyr corner motif

A motif that involves a conserved Tyr within Greek key proteins forming a hydrogen bond with the local protein backbone in an adjacent loop.

Cation–π interaction

A non-covalent interaction between an aromatic side chain and a cationic side chain.

SH3 domain

(Src homology 3 domain). A small domain that is found in various intracellular or membrane-associated proteins and has a β-barrel fold.

Euclidean distance

The straight-line geometric distance between two points in n-dimensional (Euclidean) space.

About this article

Cite this article

Worth, C., Gong, S. & Blundell, T. Structural and functional constraints in the evolution of protein families. Nat Rev Mol Cell Biol 10, 709–720 (2009). https://doi.org/10.1038/nrm2762

  • DOI: https://doi.org/10.1038/nrm2762
