The coming of age of de novo protein design

Abstract

There are 20^200 possible amino-acid sequences for a 200-residue protein, of which the natural evolutionary process has sampled only an infinitesimal subset. De novo protein design explores the full sequence space, guided by the physical principles that underlie protein folding. Computational methodology has advanced to the point that a wide range of structures can be designed from scratch with atomic-level accuracy. Almost all protein engineering so far has involved the modification of naturally occurring proteins; it should now be possible to design new functional proteins from the ground up to tackle current challenges in biomedicine and nanotechnology.
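
To give a feel for the size of the sequence space mentioned above, the short sketch below counts the sequences available when each of 200 positions can hold any of 20 amino acids. The function names are hypothetical and chosen only for this illustration; they do not come from the article or from any design software.

```python
def sequence_space_size(length: int, alphabet: int = 20) -> int:
    """Number of distinct sequences of `length` residues drawn from
    `alphabet` amino-acid types (illustrative helper, not from the article)."""
    return alphabet ** length


def order_of_magnitude(n: int) -> int:
    """Power of ten of the leading digit of a positive integer."""
    return len(str(n)) - 1


n = sequence_space_size(200)
print(order_of_magnitude(n))  # prints 260
```

Under these assumptions the count is on the order of 10^260, which is why any sampling process, natural or experimental, can only ever cover a vanishingly small fraction of it.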

Figure 1: Methods for de novo protein design.
Figure 2: Designing αβ proteins.
Figure 3: Designing proteins with internal symmetry.
Figure 4: De novo design using parametric backbone generation.
Figure 5: Designing self-assembling nanomaterials.
Figure 6: Designing hyperstable de novo constrained peptides.

Acknowledgements

We thank all members of the Baker laboratory and the Institute for Protein Design at the University of Washington, as well as the RosettaCommons community. We apologize to the researchers and protein designers whose work we were unable to acknowledge due to space and scope limitations. The authors are supported by the Howard Hughes Medical Institute (HHMI-027779).

Author information

Corresponding author

Correspondence to David Baker.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Reprints and permissions information is available at www.nature.com/reprints.

Cite this article

Huang, P., Boyken, S. & Baker, D. The coming of age of de novo protein design. Nature 537, 320–327 (2016). https://doi.org/10.1038/nature19946
