Review Article

The coming of age of de novo protein design

Nature volume 537, pages 320–327 (15 September 2016)

Abstract

There are 20^200 possible amino-acid sequences for a 200-residue protein, of which the natural evolutionary process has sampled only an infinitesimal subset. De novo protein design explores the full sequence space, guided by the physical principles that underlie protein folding. Computational methodology has advanced to the point that a wide range of structures can be designed from scratch with atomic-level accuracy. Almost all protein engineering so far has involved the modification of naturally occurring proteins; it should now be possible to design new functional proteins from the ground up to tackle current challenges in biomedicine and nanotechnology.
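
As a rough check on the scale of the figure quoted above, the size of the sequence space can be computed directly. The sketch below is illustrative only: the function names are hypothetical and do not come from the article, and it assumes the 20 standard amino acids with every position free to vary independently.

```python
import math


def sequence_space_size(n_residues: int, alphabet_size: int = 20) -> int:
    """Exact number of distinct sequences of the given length.

    Assumes each position can independently take any of `alphabet_size`
    residues (20 standard amino acids by default).
    """
    return alphabet_size ** n_residues


def log10_sequence_space(n_residues: int, alphabet_size: int = 20) -> float:
    """Base-10 logarithm of the sequence-space size, for readable magnitudes."""
    return n_residues * math.log10(alphabet_size)


if __name__ == "__main__":
    n = 200
    exact = sequence_space_size(n)
    # Python integers are arbitrary precision, so the exact count is available.
    print(f"digits in 20^{n}: {len(str(exact))}")
    print(f"log10(20^{n}) = {log10_sequence_space(n):.1f}")
```

For a 200-residue chain this gives a number with 261 decimal digits, roughly 1.6 × 10^260, which is why only an infinitesimal fraction of the space can have been sampled.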

Acknowledgements

We thank all members of the Baker laboratory and the Institute for Protein Design at the University of Washington, as well as the RosettaCommons community. We apologize to the researchers and protein designers whose work we were unable to acknowledge due to space and scope limitations. The authors are supported by the Howard Hughes Medical Institute (HHMI-027779).

Author information

Author notes

    • Po-Ssu Huang, Scott E. Boyken & David Baker

    These authors contributed equally to this work.

    • Po-Ssu Huang

    Present address: Department of Bioengineering, Stanford University, Stanford, California 94305, USA.

Affiliations

  1. Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA.

    • Po-Ssu Huang, Scott E. Boyken & David Baker
  2. Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA.

    • Po-Ssu Huang, Scott E. Boyken & David Baker
  3. Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA.

    • Scott E. Boyken & David Baker

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to David Baker.

Reprints and permissions information is available at www.nature.com/reprints.

About this article

DOI

https://doi.org/10.1038/nature19946
