De novo protein design has enabled the creation of new protein structures. However, the design of functional proteins has proved challenging, in part due to the difficulty of transplanting structurally complex functional sites to available protein structures. Here, we used a bottom-up approach to build de novo proteins tailored to accommodate structurally complex functional motifs. We applied the bottom-up strategy to successfully design five folds for four distinct binding motifs, including a bifunctionalized protein with two motifs. Crystal structures confirmed the atomic-level accuracy of the computational designs. These de novo proteins were functional as components of biosensors to monitor antibody responses and as orthogonal ligands to modulate synthetic signaling receptors in engineered mammalian cells. Our work demonstrates the potential of bottom-up approaches to accommodate complex structural motifs, which will be essential to endow de novo proteins with elaborate biochemical functions, such as molecular recognition or catalysis.
Crystal structures have been deposited in the PDB with accession codes 6YWD (4H.01 in complex with Mota Fab) and 6YWC (4E1H.95 in complex with 101F Fab). Amino acid and nucleotide sequences of experimentally characterized variants are available in Supplementary Table 5. Expression plasmids of 4E1H.95_LUMABS, 4E2H.210_LUMABS, 4H.01, 3E2H.37, 4E2H.210, 4E1H.95 and 3H1L_02.395 are available from Addgene under accession numbers 155208, 155209, 155210, 155211, 155212, 155213 and 155198, respectively. Plasmid information for the cellular receptors is available in Supplementary Table 6, and plasmids can be directly requested from the corresponding author.
All scripts used for the computational design and for the analysis of next-generation sequencing data have been deposited and are available in the public GitHub repository at https://github.com/LPDI-EPFL/Bottom-up-de-novo-design.
We thank K. Lau, A. Reynaud, L. Durrer, S. Quinche, D. Hacker and F. Pojer in the PTPSP facility at EPFL for protein expression and X-ray crystallography support, D. Demurtas from CIME and S. Nazarov from PTBIOEM for electron microscopy support, L. Menin from the EPFL proteomics core facility for mass spectrometry support, the flow cytometry core facility for technical support and the gene expression core facility for help with next-generation sequencing. We thank V. Olieric at the Paul Scherrer Institute for operation of the X06DA beamline. The computational simulations were facilitated by the CSCS Swiss National Supercomputing Centre as well as by SCITAS at EPFL. This work was supported by the Swiss initiative for systems biology (SystemsX.ch), the European Research Council (starting grant no. 716058), the Swiss National Science Foundation (grant no. 310030_163139), the NCCR Molecular Systems Engineering and the NCCR Chemical Biology. F.S. was supported by an SNF/Innosuisse BRIDGE Proof-of-Concept grant, and J.B. was funded by the EPFL Fellows postdoctoral fellowship. T.K. received funding from the Cluster of Excellence RESIST (grant no. EXC 2155) of the German Research Foundation and from the German Center of Infection Research. J.T.C. was supported by the ERA-Net PrionImmunity project no. 01GM1503 of the German Federal Ministry of Education and Research. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
The authors declare no competing interests.
Left: Prefusion RSVF structure (PDB 4JHW) with site 0 (purple), site II (cyan) and site IV (orange) highlighted. Center/right: Close-up view of the structural motifs, including the 2D10 epitope (wheat) presented by the RSV glycoprotein (RSVG), whose structure has been determined as a peptide bound to the 2D10 monoclonal antibody (PDB 5WN9). The secondary structure composition and residue lengths are indicated below. E, beta strand; H, alpha helix; L, loop.
Extended Data Fig. 2 Computational design, experimental screening and enrichment analysis of the 3E2H design series.
a, TopoBuilder assembly of the 3E2H topology (left) and a full-atom model after folding and design (right). The logo plots show the sequence diversity at selected core positions as predicted by Rosetta FunFolDes and the diversity at each position encoded in a combinatorial library (using degenerate codons) for yeast display screening. b, Detailed view of all experimentally sampled positions (green). c, Enrichment analysis following next-generation sequencing (NGS) of populations sorted for high-affinity binding (x-axis) versus resistance to protease digestion (y-axis). n = 5,377 unique sequences found under both selection conditions were analyzed. d, Residue preferences at each position when comparing sequences positively enriched for both binding and protease resistance (c, quadrant I, blue) with sequences that were negatively enriched (c, quadrant IV, red). 100–200 sequences were analyzed per group. The heatmap shows the relative frequency of each amino acid in quadrant I versus quadrant IV, showing, for example, an overrepresentation of valine over isoleucine at position 7 in sequences from quadrant I.
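The enrichment comparison described in panels c and d can be illustrated with a minimal sketch. It is not the authors' pipeline (their scripts are in the GitHub repository listed under code availability). It assumes that enrichment is a log2 ratio of post-sort to pre-sort read frequencies with a pseudocount, that "negatively enriched" means depleted under both selections, and that the residue preference is a simple difference of per-position frequencies. All function names and the toy read counts are hypothetical.

```python
import math
from collections import Counter


def log2_enrichment(pre_counts, post_counts, pseudocount=1.0):
    """Log2 ratio of post-sort to pre-sort frequency for sequences seen in both pools.

    Assumption: enrichment is defined as a pseudocounted frequency ratio.
    """
    pre_total = sum(pre_counts.values())
    post_total = sum(post_counts.values())
    scores = {}
    for seq in pre_counts.keys() & post_counts.keys():
        pre_freq = (pre_counts[seq] + pseudocount) / pre_total
        post_freq = (post_counts[seq] + pseudocount) / post_total
        scores[seq] = math.log2(post_freq / pre_freq)
    return scores


def split_by_sign(binding_scores, protease_scores):
    """Keep sequences scored under both selections and group them by sign.

    Returns (enriched, depleted): enriched sequences score > 0 under both
    selections, depleted sequences score < 0 under both (assumed reading of
    "negatively enriched"). Mixed-sign sequences are dropped.
    """
    enriched, depleted = [], []
    for seq in sorted(binding_scores.keys() & protease_scores.keys()):
        b, p = binding_scores[seq], protease_scores[seq]
        if b > 0 and p > 0:
            enriched.append(seq)
        elif b < 0 and p < 0:
            depleted.append(seq)
    return enriched, depleted


def residue_frequencies(seqs, position):
    """Relative frequency of each amino acid at a 0-based position."""
    counts = Counter(seq[position] for seq in seqs)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def frequency_difference(enriched, depleted, position):
    """Per-residue frequency in the enriched group minus the depleted group."""
    f_pos = residue_frequencies(enriched, position)
    f_neg = residue_frequencies(depleted, position)
    residues = sorted(set(f_pos) | set(f_neg))
    return {aa: f_pos.get(aa, 0.0) - f_neg.get(aa, 0.0) for aa in residues}


# Toy read counts (hypothetical, for illustration only).
unsorted_pool = {"AVK": 10, "AIK": 10, "GVK": 10, "GIK": 10}
binding_sorted = {"AVK": 30, "AIK": 5, "GVK": 4, "GIK": 1}
protease_sorted = {"AVK": 25, "AIK": 3, "GVK": 10, "GIK": 2}

binding = log2_enrichment(unsorted_pool, binding_sorted)
protease = log2_enrichment(unsorted_pool, protease_sorted)
enriched, depleted = split_by_sign(binding, protease)
print(enriched, depleted)
print(frequency_difference(enriched, depleted, position=1))
```

A positive value for a residue at a given position means it is more common among sequences enriched under both selections, which is the kind of signal the heatmap in panel d summarizes.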
Extended Data Fig. 3 Computational design, experimental screening and enrichment analysis of the 4E1H design series.
a, TopoBuilder assembly of the 4E1H topology. b, Detailed view of all experimentally sampled positions (green). c, Enrichment analysis under dual selection pressures: binding to the 101F antibody and resistance to protease digestion. n = 92,779 unique sequences found under both selection conditions were analyzed. d, Residue preferences at the indicated positions in positively enriched versus negatively enriched sequences. See Extended Data Fig. 2 caption for further details.
Extended Data Fig. 4 Computational design, experimental screening and enrichment analysis of the 4E2H design series.
a, TopoBuilder assembly of the 4E2H topology and full-atom structure after Rosetta FunFolDes. b, Detailed view of all experimentally sampled positions (green). c, Enrichment analysis. n = 3,903 unique sequences found under both selection conditions were analyzed. d, Residue preferences at the indicated positions in positively enriched versus negatively enriched sequences. See Extended Data Fig. 2 caption for further details.
Extended Data Fig. 5 Computational design, experimental screening and enrichment analysis of the 3H1L_02 design series.
a, TopoBuilder assembly of the 3H1L_02 topology and full-atom structure after Rosetta FunFolDes; the epitope region is highlighted in purple. b, Detailed view of all experimentally sampled positions (green). c, Enrichment analysis. n = 99,338 unique sequences found under both selection conditions were analyzed. d, Residue preferences at the indicated positions in positively enriched versus negatively enriched sequences. See Extended Data Fig. 2 caption for further details.
Extended Data Fig. 6 Computational design, experimental screening and enrichment analysis of the 4H design series.
a, TopoBuilder assembly of the 4H topology and full-atom structure after Rosetta FunFolDes; the Mota epitope region is colored in cyan and the 2D10 epitope in wheat. b, Detailed view of all experimentally sampled positions (green). c, Enrichment analysis. n = 287 unique sequences found under both selection conditions were analyzed. d, Residue preferences at the indicated positions in positively enriched versus negatively enriched sequences. See Extended Data Fig. 2 caption for further details.
Designed proteins were melted from 20 °C to 95 °C, with unfolding monitored by CD. The melting temperature (Tm) was determined from the change of ellipticity at the global curve minimum.
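The caption does not spell out how Tm was extracted from the melting curve, so the following is only one plausible reading, sketched under stated assumptions: ellipticity is recorded at a single fixed wavelength across the temperature ramp, and Tm is taken as the midpoint of the temperature interval with the steepest change in ellipticity. The function name and the data are hypothetical; fitting a two-state sigmoid would be a common alternative.

```python
def estimate_tm(temps_c, ellipticity):
    """Estimate Tm as the midpoint of the interval with the steepest signal change.

    Assumption: this steepest-slope rule stands in for the unspecified
    procedure in the caption. Inputs are equal-length sequences of
    temperatures (in degrees C, strictly increasing) and ellipticity values.
    """
    if len(temps_c) != len(ellipticity) or len(temps_c) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    best_slope = -1.0
    tm = None
    for i in range(len(temps_c) - 1):
        dt = temps_c[i + 1] - temps_c[i]
        if dt <= 0:
            raise ValueError("temperatures must be strictly increasing")
        # Absolute finite-difference slope between neighbouring points.
        slope = abs((ellipticity[i + 1] - ellipticity[i]) / dt)
        if slope > best_slope:
            best_slope = slope
            tm = (temps_c[i] + temps_c[i + 1]) / 2
    return tm


# Hypothetical melting curve of a helical protein (illustration only).
temps = [20, 30, 40, 50, 60, 70, 80, 90]
signal = [-20.0, -20.0, -19.5, -18.0, -10.0, -4.0, -3.0, -3.0]
print(estimate_tm(temps, signal))
```

With real data the temperature step is usually much finer, and smoothing the curve before differentiating makes the estimate less sensitive to noise.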
Luminescence spectra of the 4E2H.210 LUMABS sensor in the absence of antibody (black), compared with spectra in the presence of 2 µM 101F IgG (blue) or 15 µM cetuximab (CTX, green), an anti-EGFR antibody, showing that the sensor responds only in the presence of epitope-specific antibodies.
Yang, C., Sesterhenn, F., Bonet, J. et al. Bottom-up de novo design of functional proteins with complex structural features. Nat Chem Biol 17, 492–500 (2021). https://doi.org/10.1038/s41589-020-00699-x