Accurate de novo design of hyperstable constrained peptides

Bhardwaj, Gaurav; Mulligan, Vikram Khipple; Bahl, Christopher D.; Gilmore, Jason M.; Harvey, Peta J.; Cheneval, Olivier; Buchko, Garry W.; Pulavarti, Surya V. S. R. K.; Kaas, Quentin; Eletsky, Alexander; Huang, Po-Ssu; Johnsen, William A.; Greisen, Per Jr; Rocklin, Gabriel J.; Song, Yifan; Linsky, Thomas W.; Watkins, Andrew; Rettie, Stephen A.; Xu, Xianzhong; Carter, Lauren P.; Bonneau, Richard; Olson, James M.; Coutsias, Evangelos; Correnti, Colin E.; Szyperski, Thomas; Craik, David J.; Baker, David

doi:10.1038/nature19791

Article
Published: 14 September 2016

Nature volume 538, pages 329–335 (2016)

Abstract

Naturally occurring, pharmacologically active peptides constrained with covalent crosslinks generally have shapes that have evolved to fit precisely into binding pockets on their targets. Such peptides can have excellent pharmaceutical properties, combining the stability and tissue penetration of small-molecule drugs with the specificity of much larger protein therapeutics. The ability to design constrained peptides with precisely specified tertiary structures would enable the design of shape-complementary inhibitors of arbitrary targets. Here we describe the development of computational methods for accurate de novo design of conformationally restricted peptides, and the use of these methods to design 18–47 residue, disulfide-crosslinked peptides, a subset of which are heterochiral and/or N–C backbone-cyclized. Both genetically encodable and non-canonical peptides are exceptionally stable to thermal and chemical denaturation, and 12 experimentally determined X-ray and NMR structures are nearly identical to the computational design models. The computational design methods and stable scaffolds presented here provide the basis for development of a new generation of peptide-based drugs.

**Figure 1: Designed peptide topologies.**

**Figure 2: Computational design and biophysical characterization of genetically encodable disulfide-rich peptides.**

**Figure 3: X-ray crystal structures and NMR solution structures of designed peptides are very close to design models.**

**Figure 4: Design and characterization of heterochiral disulfide-constrained peptides.**

**Figure 5: Design and characterization of N-C backbone cyclic peptides.**

**Figure 6: Design and characterization of a peptide with non-canonical secondary and tertiary structure.**

Accession codes

Primary accessions

Protein Data Bank

Data deposits

Peptide structures have been deposited in the RCSB Protein Data Bank with accession codes 5JG9, 2ND2, 2ND3, 5JHI, 5JI4, 5KVN, 5KWO, 5KWP, 5KWX, 5KX2, 5KWZ, 5KX1, 5KX0.

Acknowledgements

Computer time was awarded by the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. This research used resources of the Argonne Leadership Computing Facility, a Department of Energy (DOE) Office of Science User Facility supported under contract DE-AC02-06CH11357. We thank the University of Washington Hyak supercomputing network for computing and data storage resources, and Rosetta@Home volunteer participants on BOINC for additional computing resources. We are grateful for facility access at the Queensland NMR Network. We thank D. Alonso, J. Bardwell, G. Bhabha, T.J. Brunette, D. Ekiert, A. Ford, N. Hasle, B. Keir, N. Koga, Y. Liu, D. Madden, B. Mao, D. May, V. Ovchinnikov, S. Srivatsan, L. Stewart, R. van Deursen, and M. Williamson for help and advice, and R. Krishnamurty, P. Hosseinzadeh, and A. Vorobieva for critical comments and manuscript suggestions. This work was supported by NIH grant P50 AG005136 supporting the Alzheimer’s Disease Research Center, philanthropic gifts from the Three Dreamers and Washington Research Foundation, and funding from the Howard Hughes Medical Institute. The Australian Research Council funds D.J.C. as an Australian Laureate Fellow (FL150100146). C.D.B. was supported by NIH grant T32-H600035. T.S. acknowledges NIH support (GM094597), and S.V.S.R.K.P., A.E. and X.X. were supported with NESG funds. E.C. is funded by NIGMS GM090205. We thank P. Rupert and R.K. Strong at the Fred Hutchinson Cancer Research Center for aid in collecting and refining X-ray data for gEHEE_06. G.W.B. was funded by the National Institute of Allergy and Infectious Diseases, National Institute of Health, Department of Health and Human Services (Federal contract HHSN272201200025C). A portion of this research was performed using EMSL, a DOE Office of Science User Facility sponsored by the Office of Biological and Environmental Research and located at Pacific Northwest National Laboratory.

Author information

Gaurav Bhardwaj, Vikram Khipple Mulligan and Christopher D. Bahl: These authors contributed equally to this work.

Authors and Affiliations

Department of Biochemistry, University of Washington, Seattle, 98195, Washington, USA
Gaurav Bhardwaj, Vikram Khipple Mulligan, Christopher D. Bahl, Jason M. Gilmore, Po-Ssu Huang, Per Jr Greisen, Gabriel J. Rocklin, Yifan Song, Thomas W. Linsky & David Baker
Institute for Protein Design, University of Washington, Seattle, 98195, Washington, USA
Gaurav Bhardwaj, Vikram Khipple Mulligan, Christopher D. Bahl, Jason M. Gilmore, Po-Ssu Huang, Per Jr Greisen, Gabriel J. Rocklin, Yifan Song, Thomas W. Linsky, Stephen A. Rettie, Lauren P. Carter & David Baker
Institute for Molecular Bioscience, The University of Queensland, Brisbane, 4072, Queensland, Australia
Peta J. Harvey, Olivier Cheneval, Quentin Kaas & David J. Craik
Seattle Structural Genomics Center for Infectious Diseases, Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, 99352, Washington, USA
Garry W. Buchko
Department of Chemistry, State University of New York at Buffalo, Buffalo, 14260, New York, USA
Surya V. S. R. K. Pulavarti, Alexander Eletsky, Xianzhong Xu & Thomas Szyperski
Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, 98109, Washington, USA
William A. Johnsen, James M. Olson & Colin E. Correnti
Global Research, Novo Nordisk A/S, Måløv, DK-2760, Denmark
Per Jr Greisen
Cyrus Biotechnology, Seattle, 98109, Washington, USA
Yifan Song
Department of Chemistry, New York University, New York, 10003, New York, USA
Andrew Watkins
Department of Biology, New York University, New York, 10003, New York, USA
Richard Bonneau
Center for Computational Biology, Simons Foundation, New York, 10010, New York, USA
Richard Bonneau
Applied Mathematics and Statistics and Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, 11794, USA
Evangelos Coutsias
Howard Hughes Medical Institute, University of Washington, Seattle, 98195, Washington, USA
David Baker

Contributions

C.D.B., G.B., V.K.M. and D.B. designed the study. V.K.M. developed algorithms with help from A.W., E.C., Y.S., G.B., R.B., C.D.B., G.J.R. and T.W.L. C.D.B. and J.M.G. designed canonical peptides with help from D.B., G.J.R. and T.W.L. G.B. designed heterochiral and backbone-cyclized peptides with help from V.K.M., D.B., P.G. and P.S.H. C.D.B. expressed and characterized designed canonical peptides from E. coli with help from J.M.G. and S.A.R. J.M.G. performed MS analysis. W.A.J. and C.E.C. purified canonical peptides via Daedalus and determined X-ray crystal structures. G.W.B., S.V.S.R.K.P., A.E. and T.S. determined NMR solution structures of canonical peptides, purified with isotopic labelling by C.D.B. O.C. and G.B. synthesized, purified and characterized designed non-canonical peptides. P.J.H. and D.J.C. determined NMR solution structures of non-canonical peptides. P.J.H., Q.K. and D.J.C. analysed data from structure determination of non-canonical peptides. C.D.B., G.B., V.K.M. and D.B. wrote the manuscript with help from all authors.

Corresponding author

Correspondence to David Baker.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Reviewer Information Nature thanks V. Nanda and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Extended data figures and tables

Extended Data Figure 1 Disulfide bonds are well defined by X-ray crystallography.

An F_o − F_c omit-map is shown in blue, contoured at 4σ, for design gEHEE_06. Disulfide sulfur atoms were removed, and the omit-map was calculated following real-space refinement. The gEHEE_06 structure is shown in grey as a cartoon representation. Disulfide bonds are shown here as sticks, with sulfur atoms in yellow and carbon atoms in grey.

Extended Data Figure 2 Flowchart of pipelines for designing non-canonical cyclic peptides.

Inputs are shown in blue, RosettaScripts-automated parts of the pipeline are in green, parts carried out by Rosetta standalone applications are pink (the fragment picker application) and purple (the various structure prediction applications), parts performed with MD software are yellow, and manual steps are grey. a, Fragment-dependent design workflow. Final computational validation was carried out using MD simulations and fragment-based Rosetta ab initio structure prediction. For peptides containing isolated d-amino acids, these residues were mutated to glycine for Rosetta ab initio structure prediction. b, Fragment-free design workflow using GenKIC. This approach permits design of non-canonical topologies like the mixed H_LH_R topology, which occurs in no known natural protein. The GenKIC-based structure prediction algorithm is described in Extended Data Fig. 7 and in Supplementary Information.

Extended Data Figure 3 Sidechain placement in non-canonical peptide designs chosen for experimental characterization.

Designs are shown as cartoon and stick representations (top row in each box) and as van der Waals spheres showing sidechain packing (bottom row in each box). l-amino acid residues are shown in cyan, and d-amino acid residues are coloured orange. Sidechains of d- or l-variants of alanine, phenylalanine, isoleucine, leucine, valine, tryptophan and tyrosine are coloured grey to aid visualization of hydrophobic packing interactions. Top box, disulfide-stapled non-canonical peptide designs; bottom box, N-to-C cyclic non-canonical peptide designs.

Extended Data Figure 4 Molecular dynamics screening of designed peptides.

Fifty independent molecular dynamics (MD) simulations in explicit solvent, all starting from the designed peptide, were used to discriminate good, kinetically stable designs (for example, EHE_D1) from non-optimal designs of the same topology (for example, EHE_X18 and EHE_X11). a, Five representative trajectories from MD simulation runs. Designs that showed good convergence and smaller fluctuations were selected for further experimental characterization. b, r.m.s.d. distribution from all 50 trajectories. Blue line indicates the Gaussian kernel density estimate for the data. Only the last one-third of each trajectory was used for this analysis. Designs with narrower distributions were picked for further testing. c, Concatenated trajectory of all 50 independent runs, showing lower fluctuations for the more optimal designs.
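
The screening logic of panels b and c can be sketched in a few lines. This is a minimal illustration rather than the authors' analysis code. It assumes each trajectory has already been reduced to a per-frame r.m.s.d. series, uses the standard deviation of the pooled last-third samples as a stand-in for how narrow a distribution is, and chooses the kernel bandwidth with a normal-reference rule of thumb. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def last_third(rmsd_series):
    """Return the final one-third of a per-frame r.m.s.d. series."""
    series = np.asarray(rmsd_series, dtype=float)
    return series[(2 * len(series)) // 3:]


def pooled_rmsd(trajectories):
    """Pool the last third of every independent trajectory into one sample."""
    return np.concatenate([last_third(t) for t in trajectories])


def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Gaussian kernel density estimate of `samples`, evaluated on `grid`.

    The default bandwidth, 1.06 * sigma * n**(-1/5), is an assumption of
    this sketch; the figure legend does not state which bandwidth was used.
    """
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if bandwidth is None:
        bandwidth = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    z = (grid[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))


def rank_by_spread(designs):
    """Order design names from narrowest to widest pooled r.m.s.d. spread."""
    spread = {name: pooled_rmsd(trajs).std(ddof=1) for name, trajs in designs.items()}
    return sorted(spread, key=spread.get)


# Synthetic stand-in data: 50 trajectories of 300 frames per design.
rng = np.random.default_rng(0)
designs = {
    "tight_design": [0.8 + 0.1 * rng.standard_normal(300) for _ in range(50)],
    "loose_design": [2.0 + 0.8 * rng.standard_normal(300) for _ in range(50)],
}
print(rank_by_spread(designs))
```

With real data, the per-frame r.m.s.d. values would come from the MD package's own analysis tools, and the density curve from `gaussian_kde_1d` would be drawn over a histogram of the pooled samples.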

Extended Data Figure 5 Structural characterization of NC_EEH_D1.

The NMR structure of NC_EEH_D1 does not match the designed topology. a, Rosetta-designed model for NC_EEH_D1. b, Ensemble of conformers representing the NMR solution structure. c, Superposition of the designed model (blue) with a representative NMR conformer (green).

Extended Data Figure 6 Structural mapping of sequence-aligned region between NC_EHE_D1 and 2MA5.

Design NC_EHE_D1 and PDB entry 2MA5 show weak but significant (e-value, 2 × 10⁻⁴) sequence alignment, which is highlighted in purple. The aligned region folds into very different structures in the different contexts of peptide and protein.

Extended Data Figure 7 Generalized kinematic closure (GenKIC) algorithm flowchart.

GenKIC allows sampling of closed conformations of arbitrary chains of atoms, passing through canonical or non-canonical backbone or sidechain linkages. Bond length, bond angle and torsional degrees of freedom in the chain can be fixed, perturbed from a starting value by small amounts, set to user-defined values, or sampled randomly. The algorithm then solves for six torsion angles adjacent to three user-defined pivot atoms in order to enforce closure of the loop. The many solutions from the closure are then filtered internally, and each can be subjected to arbitrary user-defined Rosetta protocols and filtration in order to prune the solution list further. A single solution is selected from those passing filters by a user-defined selection criterion. This flowchart shows the steps in a single invocation of the algorithm; for sampling, a user may specify that the algorithm be applied any number of times. User inputs are shown in blue, steps carried out by the GenKIC algorithm itself are in green, steps carried out by Rosetta code external to the GenKIC algorithm are shown in yellow, and outputs are shown in salmon.
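
The control flow of a single invocation, as described in this legend, can be sketched as follows. This is a schematic of the perturb, close, filter and select sequence only. It does not implement the kinematic closure mathematics and does not use the Rosetta API: the closure solver is passed in as a caller-supplied function, and every name here is hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Sequence, TypeVar

C = TypeVar("C")  # whatever object represents one chain conformation


def closure_attempt(
    start: C,
    perturb: Callable[[C], C],
    solve_closure: Callable[[C], Iterable[C]],
    filters: Sequence[Callable[[C], bool]],
    select: Callable[[List[C]], C],
) -> Optional[C]:
    """One invocation: perturb, solve closure, filter, pick one solution."""
    perturbed = perturb(start)                   # fix, perturb, set or randomize degrees of freedom
    solutions = list(solve_closure(perturbed))   # stand-in for the pivot-torsion solver
    passing = [s for s in solutions if all(f(s) for f in filters)]
    return select(passing) if passing else None  # None when nothing survives filtering


def sample(start, n_attempts, perturb, solve_closure, filters, select):
    """Repeat the single invocation and keep every successful result."""
    results = []
    for _ in range(n_attempts):
        chosen = closure_attempt(start, perturb, solve_closure, filters, select)
        if chosen is not None:
            results.append(chosen)
    return results


# Toy usage with numbers standing in for conformations.
print(closure_attempt(
    1,
    perturb=lambda x: x + 1,
    solve_closure=lambda x: [x, 2 * x, 3 * x],
    filters=[lambda c: c % 2 == 0],
    select=min,
))
```

Passing the solver, filters and selector as functions mirrors the legend's distinction between steps carried out by the algorithm itself and steps supplied by the user.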

Extended Data Figure 8 A new fragment-free structure prediction algorithm.

a, Flowchart of the steps required to generate a single sampled conformation. In typical usage, this process would be repeated tens of thousands of times to produce many samples. Inputs (the peptide sequence and an optional PDB file for the design structure) are shown in blue, and outputs (the sampled structure, its energy, and its r.m.s.d. from the design structure) are shown in salmon. Steps performed by the GenKIC algorithm are shaded green, and setup and completion steps performed by the simple_cycpep_predict application are shown in yellow. Further details of this algorithm are discussed in Supplementary Information. b, The initial, random peptide conformation with bad terminal peptide bond geometry. c, Ensemble of closed conformations found for a single closure attempt. In this example, residue 7 (cyan) is the fixed anchor residue. Certain regions of the peptide have been set to left- or right-handed helical conformations before solving closure equations. d, A single closed solution with relative cysteine sidechain orientations that pass the initial, low-stringency filter for disulfide (fa_dslf) conformational energy. e, The resulting structure, following sidechain repacking, energy minimization, and cyclic de-permutation.
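
The cyclic de-permutation in panel e can be illustrated with a short sketch. It assumes that permutation means rotating the residue order of the cyclic peptide by some offset before sampling, and that de-permutation applies the inverse rotation afterwards so that residue numbering matches the design. The function names and the example sequence are hypothetical and are not taken from the simple_cycpep_predict application.

```python
def cyclic_permute(residues, offset):
    """Rotate a cyclic sequence so that residue `offset` (0-based) comes first."""
    if not residues:
        return residues
    k = offset % len(residues)
    return residues[k:] + residues[:k]


def cyclic_depermute(residues, offset):
    """Undo cyclic_permute(residues, offset), restoring the original numbering."""
    return cyclic_permute(residues, -offset)


sequence = "ABCDEFG"                  # placeholder residue labels, not a real design
shifted = cyclic_permute(sequence, 2)
print(shifted)                        # CDEFGAB
print(cyclic_depermute(shifted, 2))   # ABCDEFG
```

For an N-to-C cyclic peptide every rotation describes the same molecule, so the rotation changes only which peptide bond is treated as the one to be closed.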

Extended Data Figure 9 Mutational tolerance of selected genetically-encodable designs.

Left column, RP-HPLC traces for the parental designs; middle and right, same for the resurfaced designs where applicable. Traces for proteins run under oxidizing conditions are shown as black lines, while traces for proteins run following reduction with 10 mM DTT are shown as red lines. Insets, gels highlighting the SDS–PAGE mobility of each purified protein under oxidizing (left band) and reducing conditions (right band). Under each row of panels are shown sequence alignments with the mutated positions highlighted in red, along with theoretical isoelectric points as calculated by ProtParam.

Extended Data Figure 10 Mutational tolerance of selected NC designs.

a, b, Mutational tolerance of the d-proline, l-proline loop of design NC_cEE_D1 (green in a), assessed by secondary ¹H_α chemical shift (p.p.m.) for the design sequence (black bars in b) and the p18d loop mutation (red bars). Eliminating this key proline residue does not result in loss of β-strand signal. c, d, Mutational tolerance of loop region of design NC_HEE_D1 (green in c), as assessed by CD spectroscopy for the design sequence (left plot in d) and for the D19T, p20q, P21D triple mutant (right plot in d). Both proline residues may be mutated without loss of secondary structure or major change in the thermal stability. e–g, Computationally predicted mutational tolerance of design NC_H_LH_R_D1, across the entire sequence. Each position was successively mutated in silico to d- or l-alanine, arginine, aspartate, phenylalanine, or valine (preserving the position’s chirality), and full folding simulations were carried out with the Rosetta simple_cycpep_predict application. Folding funnel quality was evaluated using the P_near metric described in Methods. e, Representative plots of energy versus r.m.s.d. from the design structure, plotted for the design sequence (top), for the non-disruptive R14F mutation (middle), and for the e18v mutation (bottom). Results from GenKIC-based structure prediction runs are shown in blue, and relaxation runs, in orange. Note that the bottom case shows many sampled states far from the design state with energy equal to or less than the design state energy. f, Mutational tolerance by position (vertical axis) and mutation (horizontal axis). Blue rectangles represent well-tolerated mutations, and red to black rectangles represent disruptive mutations, based on P_near evaluation of the folding funnel. Black borders indicate the design sequence. g, Mutational tolerance mapped onto the NC_H_LH_R_D1 structure, with colours as in f. 
Most positions tolerate mutation well, with only the disulfide bridge (C8–c21) and the salt bridges formed by e18 being highly sensitive. The hydrogen bond networks formed by residues Q5, e24 and s25 show some moderate sensitivity to mutation, as do residues E3 and e16.
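
The P_near metric is defined in Methods, which are not reproduced on this page. The sketch below assumes a Boltzmann-weighted form, in which each sampled state is weighted by exp(−E/k_BT) and scored for nearness to the design by exp(−r.m.s.d.²/λ²), so that values close to 1 indicate a funnel converging on the design and values close to 0 indicate low-energy states far from it. The parameters `lam` and `kbt` are left to the caller because their values are not given here; the function name is hypothetical.

```python
import numpy as np


def p_near(energies, rmsds, lam, kbt):
    """Boltzmann-weighted fraction of sampled states that lie near the design.

    energies : sampled state energies (same units as kbt)
    rmsds    : r.m.s.d. of each state from the design structure
    lam      : length scale defining "near" (same units as rmsds)
    kbt      : energy scale of the Boltzmann weights
    """
    e = np.asarray(energies, dtype=float)
    r = np.asarray(rmsds, dtype=float)
    weights = np.exp(-(e - e.min()) / kbt)  # shifting by e.min() avoids overflow
    nearness = np.exp(-(r / lam) ** 2)
    return float((nearness * weights).sum() / weights.sum())


# A funnel-like landscape: the lowest-energy state sits at the design.
print(p_near([-10.0, 0.0, 0.0], [0.0, 5.0, 5.0], lam=1.0, kbt=1.0))
# A disrupted landscape: the lowest-energy state is far from the design.
print(p_near([0.0, -10.0], [0.0, 5.0], lam=1.0, kbt=1.0))
```

In a mutational scan of the kind shown in panel f, a score like this would be computed once per mutant from that mutant's energy versus r.m.s.d. samples and then compared with the score of the design sequence.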

Supplementary information

Supplementary Information

This file contains Supplementary Sections 1-4. Section 1 contains a detailed description of the computational methods development and example protocols for running the computational methods. Section 2 contains NMR spectra and structure determination statistics. Section 3 contains data from experimental screening of designs. Section 4 contains detailed experimental characterization and validation of reported designs. Collectively, this supplementary information contains details enabling the critical assessment and reproduction of the computational and experimental results described in the main text. (PDF 26062 kb)

Supplementary Data

This tar archive contains the PDB output files from Rosetta for all designed peptides reported in the main text. (ZIP 1750 kb)

About this article

Cite this article

Bhardwaj, G., Mulligan, V., Bahl, C. et al. Accurate de novo design of hyperstable constrained peptides. Nature 538, 329–335 (2016). https://doi.org/10.1038/nature19791

Received: 26 April 2016
Accepted: 18 August 2016
Published: 14 September 2016
Issue Date: 20 October 2016
DOI: https://doi.org/10.1038/nature19791
