De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37–43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing.
We thank M. Levitt and M. Zhang for discussions, A. Ford for data analysis advice, and Rosetta@Home participants for donating computing time. D.-A.S. thanks T. J. Brunette, J. E. Hsu and M. J. Countryman for their assistance. R.J. thanks K. Perry for X-ray data collection. We acknowledge funding support from: Life Sciences Discovery Fund Launch grant 9598385 (A.C.); PEW Latin-American fellow in the biomedical sciences and a CONACyT postdoctoral fellowship (D.-A.S.); Merck fellow of the Life Sciences Research Foundation (G.J.R.); CONACyT and Doctorado en Ciencias Bioquímicas UNAM (R.V.); NIH (R56AI117675) and Molecular Basis of Viral Pathogenesis Training Grant (T32AI007354-26A1) (S.M.B.); Investigator in the Pathogenesis of Infectious Disease award from the Burroughs Wellcome Fund and NIH (1R01NS080833) (M.D.); CoMotion Mary Gates Innovation Fellow program (T.C.); generous gift from Rocky and Genie Higgins (C.B.); Shenzhen Science and Technology Innovation Committee (JCYJ20170413173837121), Hong Kong Research Grant Council C6009-15G and AoE/P-705/16 (X.H.); PAPIIT UNAM (IN220516), CONACyT (254514) and Facultad de Medicina UNAM (D.A.F.-V.); NIAID grants (AI091823, AI123920, and AI125704) (R.J.); NIAID grant 1R41AI122431 (M.T.K. and D.H.F.); NIAID grant 1R21AI119258 and Life Sciences Discovery Fund grant 20040757 (D.H.F.). We acknowledge computing resources provided by the Supercomputing Laboratory at King Abdullah University of Science and Technology and the Hyak supercomputer system funded by the STF at the University of Washington. The Berkeley Center for Structural Biology is supported in part by the NIH, NIGMS, and HHMI. The Advanced Light Source is a DOE Office of Science User Facility under contract no. DE-AC02-05CH11231. The Northeastern Collaborative Access Team beamlines are funded by NIGMS grant P41 GM103403 and a NIH-ORIP HEI grant (S10OD021527). Advanced Photon Source is a U.S. DOE Office of Science User Facility operated by Argonne National Laboratory under Contract No. DE-AC02-06CH11357.
Authors declare competing interests: A.C., M.T.K., D.H.F. and D.B. are co-founders and stockholders of Virvio, Inc., a company that aims to develop the therapeutics described in this manuscript. A.C., D.-A.S., G.J.R., C.D.B. and D.B. are co-inventors on a U.S. provisional patent application (No. 62/471,637) that incorporates discoveries described in this manuscript.
Reviewer Information: Nature thanks G. Nabel and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Figure 1 Target proteins architecture and interactions with anti-BoNT/B and anti-influenza motifs.
a, Full complex of BoNT, showing heavy chain binding domain (HCB) target epitope position in relation to catalytic and translocation domains. Inset shows inhibitory fragment Syt-II (in orange) bound to HCB with hotspots shown as sticks, and grey areas excluded from design calculations. b, Crystal structure of SC1918/H1 showing HA1 and HA2 subunits in complex with HB36.3. Inset shows detailed view of HB36.3 (in green) bound to stem region epitope with hotspots shown as sticks, and grey areas excluded from design calculations. c, Crystal structure of SC1918/H1 showing HA1 and HA2 subunits in complex with HB80.4. Inset shows detailed view of HB80.4 (in magenta) bound to stem region epitope with hotspots shown as sticks, and grey areas excluded from design calculations.
Extended Data Figure 2 Categorization of binders from high-throughput sequencing data of yeast-display FACS-sorted yeast pools.
a, Schematic representation of a yeast pool experiment in which the pool is transformed with four genes, corresponding to four different binder designs (colours: blue, orange, grey, yellow). The first column represents the initial yeast pool, which shows some variability in the initial number of cells transformed with each gene. Subsequently, the cells are subjected to selection conditions of different stringency (display, and high, medium and low target concentrations). The number of cells selected during FACS (see Methods) is proportional to both the binding affinity and the fractional population of the design. b, Instead of a ‘classical’ readout in which each measurement is directly proportional to the amount of binding, the result (obtained by high-throughput sequencing of each FACS-selected yeast pool under different conditions, see Methods) is a convoluted readout of both the population fraction and the binding strength. c, Our method of analysing the strength of an individual design is to assign it to a binding condition (category) if it produces a peak in its enrichment in that condition (as compared to its own initial population in the unselected, but displaying, population). Since, at higher categories, ‘better’ binders will always out-compete weaker ones, this method clusters binders into categories of binding (for example, weak, medium, or strong). If protease is used to further select the populations for stability, the same concept applies (see Fig. 2).
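The category assignment described in panel c can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the condition names, the read counts, the enrichment threshold of 1.0 and the rule "take the condition with the largest enrichment" are all assumptions made for illustration. Enrichment is taken here as a design's fraction of reads in a selected pool divided by its fraction in the displaying, unselected pool.

```python
from typing import Dict

# Hypothetical selection conditions (target concentration); names are illustrative.
CONDITIONS = ["high", "medium", "low"]


def fractions(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw read counts per design into population fractions."""
    total = sum(counts.values())
    return {design: n / total for design, n in counts.items()}


def assign_categories(
    display_counts: Dict[str, int],
    selected_counts: Dict[str, Dict[str, int]],
    min_enrichment: float = 1.0,  # assumed threshold, not taken from the paper
) -> Dict[str, str]:
    """Assign each design to the condition where its enrichment peaks.

    Designs whose best enrichment does not exceed min_enrichment are
    labelled "none" (treated as non-binders in this sketch).
    """
    display = fractions(display_counts)
    selected = {cond: fractions(selected_counts[cond]) for cond in CONDITIONS}
    categories = {}
    for design, base in display.items():
        if base == 0:
            continue  # enrichment is undefined without a displayed population
        enrichment = {
            cond: selected[cond].get(design, 0.0) / base for cond in CONDITIONS
        }
        best = max(CONDITIONS, key=lambda cond: enrichment[cond])
        categories[design] = best if enrichment[best] > min_enrichment else "none"
    return categories


# Made-up counts for four designs, mirroring the four-gene schematic.
display_counts = {"A": 1000, "B": 1000, "C": 1000, "D": 1000}
selected_counts = {
    "high": {"A": 500, "B": 200, "C": 300, "D": 0},
    "medium": {"A": 600, "B": 350, "C": 50, "D": 0},
    "low": {"A": 900, "B": 100, "C": 0, "D": 0},
}
categories = assign_categories(display_counts, selected_counts)
print(categories)
```

In this toy data, design A keeps gaining population share as the target concentration drops, so it lands in the most stringent category, whereas design D never appears in a selected pool and is left unassigned.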
Extended Data Figure 3 Molecular dynamics simulations to assess the flexibility of mini-protein binder designs, their binding motifs and hotspots.
a, Schematic representations of the helical segments and hotspots used to calculate the average r.m.s.d. for mini-protein binders containing binding motifs from HB36, HB80 and Syt-II. The four conserved hotspots (orange) used to calculate the average r.m.s.d. of each binding motif are also shown. b, Top, average r.m.s.d.s (with respect to the designed bound conformation) of the whole mini-proteins versus those of the hotspots. The results for non-binders and binders are shown in black and red, respectively. Bottom panel, same as top, except that the x-axis displays the r.m.s.d.s of the entire helical motif. These results were obtained from an aggregation of 108 μs molecular dynamics simulations, from a representative sample of designs (143 for BoNT and 146 for influenza, see Methods for details). The r.m.s.d. values for hotspot residues were calculated using a subset of side-chain heavy atoms that are invariant to the rotation of the aromatic ring (CG and CZ for Phe and Tyr). The backbone heavy atoms were used for the r.m.s.d. calculations of ‘binding helical motif’ and ‘whole protein’. c, The convergence of molecular dynamics simulations discriminates binders and non-binders as a function of simulation length (30 ns, 40 ns, 50 ns and 100 ns), subject to a similar amount of total sampling. The results show that simulations of 50 ns in duration are sufficient to discriminate the stability of binders and non-binders, even though longer molecular dynamics simulations (such as 100 ns) may further improve the discrimination power. Ten randomly selected mini-proteins designed against BoNT (which are also included in b) were used in this figure. d, Similar to Fig. 3d, the normalized traces of the histograms (fitted using a normal probability density function) show that, for both targets, the designs that are binders (cyan, yellow and red lines) show trends with smaller fluctuations in hotspot residues than non-binders (blue lines); however, no particular trend is observed regarding strength of binding.
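The average r.m.s.d. over a chosen atom subset, as used in panels a and b, can be sketched as follows. This is an illustrative calculation only: the function names and the toy coordinates are hypothetical, and the sketch assumes every frame has already been superimposed on the designed reference conformation, because the caption does not state the fitting procedure.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(frame: Sequence[Coord], reference: Sequence[Coord], atoms: Sequence[int]) -> float:
    """Root-mean-square deviation over the selected atom indices.

    Assumes frame and reference are already in the same coordinate frame
    (no superposition is performed here).
    """
    squared = 0.0
    for i in atoms:
        squared += sum((a - b) ** 2 for a, b in zip(frame[i], reference[i]))
    return math.sqrt(squared / len(atoms))


def average_rmsd(
    trajectory: Sequence[Sequence[Coord]],
    reference: Sequence[Coord],
    atoms: Sequence[int],
) -> float:
    """Mean of the per-frame r.m.s.d. values across a trajectory."""
    return sum(rmsd(frame, reference, atoms) for frame in trajectory) / len(trajectory)


# Toy example: three atoms, two frames; indices 0 and 2 stand in for a
# hypothetical hotspot atom subset (for example CG and CZ of one residue).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x + 2.0, y, z) for x, y, z in reference]
hotspot_atoms = [0, 2]
mean_value = average_rmsd([reference, shifted], reference, hotspot_atoms)
print(mean_value)
```

Passing a different index list (backbone heavy atoms of the helical motif or of the whole protein) reuses the same function for the other two quantities plotted in panel b.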
a, Designed mini-proteins that were co-crystallized in complex with their respective targets (as shown in Fig. 4). Designed anti-HA mini-protein HB1.6928.2.3 does not denature up to a temperature of 95 °C. Designed anti-BoNT/B mini-protein shows partial denaturation at 95 °C that is completely reversible after fast-cooling to 25 °C. Black shows the circular dichroism spectrum at 25 °C, red at 95 °C, and yellow at 25 °C (after fast refolding, 5 min). Proteins were measured at 0.25 mg ml−1 in PBS buffer pH 7 (see Methods). b, Proteins that were solubly expressed or chemically synthesized. Plots are analogous to a. HB1.10027.3 contains two disulfides; HB1.6394.2.3 contains three disulfides; and Bot.6782.4, Bot.6827.4, Bot.7075.4, Bot.4024.4, Bot.3318.4, Bot.5721.4 and Bot.5916.4 each contain one disulfide bond. The rest of the proteins were designed without disulfide bonds. c, Three disulfide-containing proteins with and without reducing agent. Plots are analogous to a. Proteins were measured at 0.25 mg ml−1 in PBS buffer pH 7 without (top row) and with (bottom row) the reducing agent TCEP. The disulfides are shown to be crucial for the thermal stability of these disulfide-containing proteins (HB1.6928.2.3 contains two disulfides; Bot.2110.4 and Bot.3194.4 each contain one disulfide).
Chemically synthesized HA binder (0.3 mg ml−1) was incubated in PBS with various dilutions of trypsin (52 μM stock) for 20 min at room temperature. Reactions were quenched with addition of 1% weight per volume BSA and samples run on SDS–PAGE gel. The relative concentrations of trypsin are shown at the top. ImageJ was used to quantify the intensity of each band (below the band). a, Both HB36.6 and HB1.5702.3 show weaker gel bands at trypsin concentrations higher than 0.055 stock (2.86 μM), indicating proteolytic degradation. HB1.6928.2 and HB1.6394.2, both of which contain disulfides, show no degradation at any trypsin concentration. b, Scatter plot of gel intensities in a.
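The relative trypsin concentrations shown on the gel convert to absolute concentrations by multiplying by the 52 μM stock. The short sketch below reproduces the 2.86 μM value quoted for a relative concentration of 0.055; the function name is hypothetical and the calculation is plain arithmetic rather than anything taken from the Methods.

```python
STOCK_UM = 52.0  # trypsin stock concentration in micromolar, from the caption


def trypsin_concentration_um(relative: float, stock_um: float = STOCK_UM) -> float:
    """Absolute trypsin concentration (μM) for a relative dilution of the stock."""
    return relative * stock_um


threshold_um = trypsin_concentration_um(0.055)
print(round(threshold_um, 2))
```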
a, A simulated annealing FO−FC omit map for HB1.6928.2.3 (green) residues 10–22 (contoured at 3σ) shows clear density for amino-acid side chains at the interface (dark blue HA1, light blue HA2). A single residue (Asn32), in a loop between the first and second β-strands, is not observed in the electron density. b, 2FO−FC map for Bot.671.2 (green) residues 2–13 (contoured at 2σ) shows clear density for side chains at the interface except for the flexible lysine residue. BoNT HCB is shown in dark blue. The entire backbone, interface, and core residues for Bot.671.2 are all well resolved in the electron density map.
a, b, Immunoblots of cultured primary rat cortical neurons that were exposed to BoNT/B (20 nM) or BoNT/A (10 nM) with or without GST–Syt-II or Bot.671.2 (see Methods). The supernatants of lysed neurons were collected for immunoblot analysis to detect the indicated proteins, and actin served as control for loading. The designed mini-protein appears to confer protection against degradation of VAMP2, but not against degradation of the negative control, SNAP25 (the intracellular target of BoNT/A). c, Immunocytochemistry for detection of BoNT/B in neurons (see Methods). Left, negative control (no toxin); middle, positive control (cells incubated with 20 nM of BoNT/B for 10 min); right, near-total protective effect against 20 nM of BoNT/B conferred by co-incubating the cells with 600 nM of the design Bot.671.2. Top panels show a representative image of fluorescence microscopy for the detection of BoNT/B; bottom panels show backfield illumination microscopy for the same area.
Comparison of in vitro neutralization of influenza viruses by HB36.6, FI6v3 and the designed mini-protein HB1.6928.2.3. Each antiviral was compared for its efficiency (EC50) in inhibiting the infection of Madin–Darby canine kidney cells by a range of influenza strains. HB1.6928.2.3 inhibited infection most efficiently for all of the group-1 influenza strains tested (H1N1, H5N1 and H6N2). As expected, no neutralization was observed against H3N2 (group 2). In all experiments, n = 3 independent samples were tested for each condition, except for T/Mass/1965 (H6N2) and HK/X31 (H3N2), for which n = 2 samples were tested. Dots show raw values for independent tests and whiskers show ± 1 s.d.
About this article
Cite this article
Chevalier, A., Silva, DA., Rocklin, G. et al. Massively parallel de novo protein design for targeted therapeutics. Nature 550, 74–79 (2017). https://doi.org/10.1038/nature23912