Abstract
Abiotic d-proteins that selectively bind to natural l-proteins have gained significant biotechnological interest. However, the underlying structural principles governing such heterochiral protein–protein interactions remain largely unknown. In this study, we present the de novo design of d-proteins consisting of 50–65 residues, aiming to target specific surface regions of l-proteins or l-peptides. Our designer d-protein binders exhibit nanomolar affinity toward an artificial l-peptide, as well as two naturally occurring proteins of therapeutic significance: the D5 domain of human tropomyosin receptor kinase A (TrkA) and human interleukin-6 (IL-6). Notably, these d-protein binders demonstrate high enantiomeric specificity and target specificity. In cell-based experiments, designer d-protein binders effectively inhibited the downstream signaling of TrkA and IL-6 with high potency. Moreover, these binders exhibited remarkable thermal stability and resistance to protease degradation. Crystal structure of the designed heterochiral d-protein–l-peptide complex, obtained at a resolution of 2.0 Å, closely resembled the design model, indicating that the computational method employed is highly accurate. Furthermore, the crystal structure provides valuable information regarding the interactions between helical l-peptides and d-proteins, particularly elucidating a novel mode of heterochiral helix–helix interactions. Leveraging the design of d-proteins specifically targeting l-peptides or l-proteins opens up avenues for systematic exploration of the mirror-image protein universe, paving the way for a diverse range of applications.
Introduction
d-proteins are protein molecules whose polypeptide chains consist of d-amino acids and the achiral amino acid glycine. d-proteins, which can form specific heterochiral protein–protein interactions with natural l-protein targets, possess remarkable potential as molecular tools, therapeutics, and diagnostics due to their high bioorthogonality and stability.1,2,3 To identify d-proteins capable of binding to a target l-protein, mirror-image peptide phage display methods have been developed.4,5,6 However, it remains challenging to precisely target a specific surface region of the target protein and confirm the presence of valid binders within the initial random library.
Compared to conventional selection approaches, computational methods offer a more advanced strategy for identifying binders by targeting specific regions on the surface of a target protein. l-proteins have been designed to bind naturally existing target proteins.7,8,9,10,11,12,13,14 However, a significant challenge in the design of d-proteins that target l-proteins arises from our limited understanding of the heterochiral interactions between l- and d-proteins: only a few high-resolution 3D structures of heterochiral protein complexes are available in the Protein Data Bank (PDB).2,15,16,17 Single-helix d-peptides that bind l-protein targets18,19,20 have been generated; however, this method relies on structural information derived from a known α-helix in the l-configuration bound to the target protein, and is restricted to the one-helix scaffold. Thus far, the designed binding modes for these designer d-peptide and l-protein target complexes have not been validated with high-resolution structures. The accurate design of d-proteins that target specific surface regions of any target protein or peptide, and the de novo design of d-protein binders solely from the target protein structure, remain unsolved problems.4,18,19,20,21
Results
The mirror-image design approach
We developed an integrated computational and experimental method that enables the design and evaluation of de novo d-protein binders for any given l-protein target (Fig. 1). In this method, the computational design, high-throughput testing, characterization, and directed evolution of the potential d-protein binders are initially performed in the mirror-image space: the natural l-protein target is chemically synthesized as the mirror image d-protein molecule; potential l-protein binders are computationally designed to interact with the d-protein form of the target molecule (Fig. 2a). The designed l-protein binders are experimentally evaluated by sorting yeast libraries displaying the designs against the refolded d-protein form of the target molecule; identified l-protein binders are expressed in Escherichia coli, purified and characterized in solution; the affinity of l-protein binders may be optimized by using directed evolution methods (Fig. 2b). Compared to the alternative direct design, synthesis and testing of d-protein binders against l-protein targets, this mirror-image strategy has the advantage of higher throughput by using yeast display to evaluate the designs as l-proteins in a massively parallel manner. Finally, the d-protein forms of these experimentally selected binders are chemically synthesized and characterized and will bind to the natural l-protein target for reasons of symmetry.
We hypothesize that heterochiral protein interactions adhere to the same principles of physical chemistry as interactions between l-proteins. These principles involve maximizing the interface by achieving chemical and shape complementarity. However, at the level of secondary structure, the geometry of the interaction must differ substantially due to chirality. We calculated the binding energy and interface metrics for the heterochiral protein–protein or protein–peptide complexes available in the PDB15,16,22,23,24 by using Rosetta25,26 and found the computed values to lie within or close to the range observed for l-protein complexes (Supplementary information, Fig. S1a–d), suggesting that these metrics could be applied for the in silico design and selection of d-protein binders.
We began by digitally inverting the l-protein target structure to generate the d-protein target structure. Then we used RifDock27 to generate the rotamer interaction field (RIF, billions of scored interacting residues) by docking discrete l-amino acids against the selected surface regions of the d-protein target structure (Fig. 2a). Our protocol was found to be effective in recovering most of the interacting l-amino acids from the d-protein target structure in known heterochiral protein complexes (Supplementary information, Fig. S1e, f). This finding demonstrates the capability of our algorithm to generate meaningful heterochiral interactions. Then, 9606 miniprotein scaffolds in the l-configuration across 5 different topologies7 were docked against the d-protein target guided by RIF. Interface design was performed, and more backbone geometries and interface compositions of the designer binders were sampled by using the MotifGraft algorithm28,29 for optimal binding.
Massively parallel design and characterization of binders in the mirror-image space
To test our mirror-image design protocol, we used an artificial alpha helical peptide (named l-Pep-1) and two natural l-proteins of pharmacological significance, which are the D5 domain of human tropomyosin receptor kinase A (residues 283–384, hereafter referred to as l-TrkA) and the human interleukin-6 (residues 28-212, l-IL-6). These peptide and protein targets (hereafter referred to as protein targets for simplicity) have different origins, shapes and surface characteristics (Figs. 3a, 4a). We designed l-Pep-1 as an amphipathic alpha-helix with arbitrary hydrophobic residues and hydrophilic residues on its nonpolar and polar faces, respectively (Supplementary information, Fig. S2a). The design model was nearly identical to the structure predicted by AlphaFold30 (Supplementary information, Fig. S2b). l-TrkA and l-IL-6 bind nerve growth factor (NGF)31 and IL-6 receptors (IL-6R),32 respectively, and are involved in a variety of signaling processes. There are no known examples of d-proteins that bind l-Pep-1, l-TrkA or l-IL-6.
We chemically synthesized the mirror-image configuration of the l-protein targets, namely d-Pep-1, d-TrkA, and d-IL-6, on a milligram scale. All these targets were labeled with an N-terminal biotin (Supplementary information, Figs. S3–S5). d-Pep-1 was soluble in water solution. Synthesized d-TrkA and d-IL-6 were dissolved in 8 M urea and 6 M guanidinium chloride (GdmCl) solution, respectively, and then folded by dialysis against buffer without denaturants. Folded d-TrkA and d-IL-6 were further purified by size-exclusion chromatography (SEC) (Supplementary information, Fig. S2c, d). The circular dichroism (CD) spectra of the three d-proteins were of opposite sign to those measured for the folded l-protein targets (Fig. 3b).
For d-Pep-1, we selected the hydrophobic region for the binders to target. We digitally inverted the chirality of the TrkA (PDB ID: 1WWW31) and IL-6 (PDB ID: 4O9H) structures to generate the d-TrkA and d-IL-6 structures. For d-TrkA and d-IL-6, we designed against the cognate regions in l-TrkA and l-IL-6 that interact with NGF and IL-6R, respectively, for potential blockage of the downstream signaling pathways. We designed and selected 4500–14,000 l-protein binders composed of 50–65 amino acids for each d-protein target, which were at least locally optimal in design calculations sampling from an estimated sequence space on the order of 10^20 (~10,000 protein backbones × 10^16 interface residue combinations). Specifically, there were 7703 binders generated for Pep-1, 13,158 binders for IL-6, and 4642 binders for TrkA. The computational interface metrics for these designer heterochiral protein complexes were at similar levels to those for heterochiral or homochiral protein complex structures available in the PDB (Supplementary information, Fig. S1a–d). Pools of oligonucleotides encoding the designer l-protein mini-binders were synthesized, amplified, and cloned into a yeast surface-expression vector for display. Yeast cells displaying designs that bound the d-protein target were enriched by 2–4 rounds of fluorescence-activated cell sorting (FACS) (Fig. 3c). The frequency of each design in the initial yeast libraries and the sorted cell populations was determined by next-generation sequencing (NGS). The top enriched binders for each target have different sequences (Supplementary information, Fig. S6).
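The comparison of design frequencies before and after sorting can be illustrated with a short sketch. The text does not state how enrichment was scored, so the log2 frequency ratio with a pseudocount used here is an assumption, and the design identifiers and read counts are hypothetical.

```python
import math

def log2_enrichment(naive_counts, sorted_counts, pseudocount=1.0):
    """Return {design: log2(freq_sorted / freq_naive)}.

    Assumed scoring scheme (not taken from the paper): a pseudocount is added
    to every design so that designs absent from one pool do not cause
    division by zero.
    """
    designs = set(naive_counts) | set(sorted_counts)
    n_designs = len(designs)
    naive_total = sum(naive_counts.values()) + pseudocount * n_designs
    sorted_total = sum(sorted_counts.values()) + pseudocount * n_designs
    scores = {}
    for design in designs:
        f_naive = (naive_counts.get(design, 0) + pseudocount) / naive_total
        f_sorted = (sorted_counts.get(design, 0) + pseudocount) / sorted_total
        scores[design] = math.log2(f_sorted / f_naive)
    return scores

# Hypothetical NGS read counts for three designs.
naive_counts = {"design_A": 100, "design_B": 100, "design_C": 100}
sorted_counts = {"design_A": 900, "design_B": 90}

scores = log2_enrichment(naive_counts, sorted_counts)
# Designs ordered from most to least enriched after sorting.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A positive score means the design became more frequent after FACS; in practice a minimum read-count threshold would probably also be applied before ranking.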
We expressed and purified the most highly enriched l-protein binders for each of the d-targets in E. coli and characterized their binding affinity with the d-protein targets (Fig. 3d, e). The binder l-19437d-Pep-1 showed a binding affinity of 22 nM to d-Pep-1, as measured using biolayer interferometry. Additionally, the binder l-57445d-TrkA exhibited an affinity of 152 nM towards d-TrkA. Lastly, the binder l-25367d-IL-6 bound to d-IL-6 with an affinity of 25 nM. The binding of l-19437d-Pep-1, l-57445d-TrkA, and l-25367d-IL-6 to d-targets was significantly diminished or eliminated upon mutation of essential designer interface residues, consistent with the design model (Supplementary information, Fig. S7).
The binding affinities of the l-protein binders were optimized by using directed evolution methods. We constructed a site saturation mutagenesis (SSM) library for l-57445d-TrkA and l-25367d-IL-6 and identified mutations that increased binding. We systematically generated combinatorial mutation and random mutagenesis libraries to identify protein binders capable of binding to d-TrkA or d-IL-6 with high affinity. Two binders were successfully identified from these libraries, namely l-57445-evod-TrkA and l-25367-evod-IL-6 (Supplementary information, Fig. S6d). The affinity between l-57445-evod-TrkA and d-TrkA was measured to be 7.8 nM using biolayer interferometry, whereas the affinity between l-25367-evod-IL-6 and d-IL-6 was measured to be 0.7 nM (Supplementary information, Fig. S8a, b, f, g).
Characterizing the designer d-protein binders
The d-protein forms of the well-characterized l-protein binders were chemically synthesized, folded and characterized (Fig. 4; Supplementary information, Figs. S8, S9). Due to the small size and stable nature of the proteins, protein synthesis and folding were straightforward. All d-protein binders had CD spectra that were comparable to those measured for the recombinantly expressed l-protein binders, with opposite sign (Fig. 4b). These results were in line with the design models of d-protein binders. The d-protein binders were hyper-stable in the thermo-melting experiments: the CD spectra at 95 °C were very close to those at 25 °C (Fig. 4b). Compared to the l-protein binders, the d-protein binders showed much greater resistance to protease treatment even under very acidic conditions (pH 2.5 for pepsin). All l-protein binders were completely digested by trypsin or pepsin after a 6-h incubation period, while all d-protein binders remained intact even after a 20-h incubation period (Fig. 4d).
The folded d-protein binder binds the l-protein target with an affinity close to that measured between the cognate l-protein binder and d-protein target (Fig. 4c). The binders d-19437l-Pep-1, d-57445l-TrkA and d-25367l-IL-6 bound to l-Pep-1, l-TrkA and l-IL-6 with affinities of 59 nM, 151 nM and 28 nM, respectively, as measured by biolayer interferometry. The evolved binders d-57445-evol-TrkA and d-25367-evol-IL-6 exhibited affinities towards l-TrkA and l-IL-6 of 1 nM and 3 nM, respectively, as determined by biolayer interferometry (Supplementary information, Fig. S8c, h). To further validate the binding affinities, isothermal titration calorimetry (ITC) was employed for binders d-19437l-Pep-1 and d-25367-evol-IL-6 (Supplementary information, Fig. S10), while microscale thermophoresis (MST) was used for d-57445-evol-TrkA (Supplementary information, Fig. S8d and Table S5). The results showed that d-19437l-Pep-1, d-57445-evol-TrkA, and d-25367-evol-IL-6 bound to l-Pep-1, l-TrkA, and l-IL-6 with affinities of 12 nM, 1 nM, and 89 nM, respectively. It is noteworthy that the ITC analysis suggested the possible presence of two sequential d-protein binding sites on l-IL-6, with the second binding site exhibiting a weaker affinity than the first. The complexes between d-proteins and l-proteins were assembled in vitro, and binding of d-proteins to the corresponding l-proteins was confirmed by coelution in SEC (Supplementary information, Fig. S11).
The three designer binders, d-19437l-Pep-1, d-57445-evol-TrkA and d-25367l-IL-6, exhibit high enantiomeric and target specificity. The binding between l-protein binders and their corresponding l-protein targets was undetectable, suggesting that the designer d-protein binders have high enantiomeric specificity (Supplementary information, Fig. S12b–d). To evaluate their target specificity, we tested the interactions of the d-protein binders with all the l-protein targets. While the cross-reactivity of these d-binders is generally low, some binding was observed between d-57445-evol-TrkA and l-Pep-1 (Supplementary information, Fig. S12).
In cell-based experiments, d-57445-evol-TrkA and d-25367-evol-IL-6 were shown to effectively block downstream signaling pathways by directly competing with NGF and IL-6R, respectively (Fig. 4e; Supplementary information, Fig. S8e). d-57445-evol-TrkA inhibits the phosphorylation of Akt and Erk kinases downstream of TrkA signaling and inhibits cell proliferation with an EC50 value of ~120 nM. Similarly, d-25367-evol-IL-6 inhibits IL-6–IL-6R signaling in a reconstituted cell-based assay with an EC50 value of ~1 nM. Both inhibitors acted in a concentration-dependent manner. These findings suggest that the designed d-protein binders for TrkA and IL-6 effectively bind to their intended regions on the targets.
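To make the meaning of the reported EC50 values concrete, the following minimal sketch evaluates a simple Hill-type inhibition curve. The functional form and the Hill coefficient of 1 are assumptions chosen for illustration; the text reports only the EC50 values, not the fitted model, and the function name is hypothetical.

```python
def fraction_signal_remaining(conc_nm, ec50_nm, hill=1.0):
    """Fraction of the uninhibited signal left at a given inhibitor concentration.

    Assumed model: signal = 1 / (1 + (c / EC50)^h), with full inhibition at
    saturating inhibitor and no inhibition at zero inhibitor.
    """
    if conc_nm < 0 or ec50_nm <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    return 1.0 / (1.0 + (conc_nm / ec50_nm) ** hill)

# EC50 values quoted in the text (approximate), in nM.
EC50_TRKA_BINDER = 120.0
EC50_IL6_BINDER = 1.0

for conc in (0.0, 1.0, 12.0, 120.0, 1200.0):
    print(conc,
          round(fraction_signal_remaining(conc, EC50_TRKA_BINDER), 3),
          round(fraction_signal_remaining(conc, EC50_IL6_BINDER), 3))
```

By construction, half of the signal remains when the inhibitor concentration equals the EC50, and the curve falls monotonically as the concentration increases.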
Structure validation of the designer heterochiral protein complexes
The l-protein binder structures predicted by AlphaFold30 are very similar to the design models (with root mean square deviation (RMSD) values of Cα atoms less than 1 Å) (Supplementary information, Fig. S13a–c), which strongly suggests that the binders were able to fold into the designed structures. For the heterochiral complexes, we subjected the purified complexes to crystallization and determined the structure of d-19437l-Pep-1 in complex with l-Pep-1 at 2.0 Å resolution (Fig. 5; Supplementary information, Table S1). The designer d-19437l-Pep-1, consisting of three helices connected by two short loops, binds to l-Pep-1 through two helices in its crystal structure (Fig. 5a). This interaction involves numerous hydrophobic interactions, which are further enhanced by certain intermolecular electrostatic interactions (Fig. 5b). The crystal structure is nearly identical to the computational design model, with an RMSD of 0.6 Å over all aligned Cα atoms. The designed interface residues, which showed clear electron density in an unbiased omit map, have conformations that closely match the design model (Supplementary information, Fig. S13d, e). We also determined a 2.2 Å resolution crystal structure of l-19437d-Pep-1 in complex with d-Pep-1, which is nearly identical to the mirror image of the d-19437–l-Pep-1 complex structure (Supplementary information, Fig. S14 and Table S1).
New insights into the interactions between l- and d-proteins
Our heterochiral complex structure provides fresh insights into the interactions between a helical l-peptide and a small d-protein binder. l-Pep-1 forms antiparallel and nearly parallel helix–helix interactions with the second (H2) and the third (H3) helices of d-19437l-Pep-1, respectively (Fig. 6a, b). l-Pep-1, d-19437-H2, and d-19437-H3 are nearly untwisted helices with periodicities of 3.64, 3.68, and 3.7 residues per α-helical turn, respectively, which would approximately align every 11th residue along the α-helix axis after three full turns in a short range (referred to as hendecad motif21,33 in which residue positions are coded as abcdefghijk) (Fig. 6c, d). Both the antiparallel and parallel heterochiral helix–helix interfaces exhibit regular patterns of interaction by forming the knobs-into-holes interaction, which involves matching the shape and charge complementarity. For example, in the case of the antiparallel helix–helix interaction, a residue positioned at “e” in the l-helix inserts into a pocket formed by four residues positioned at “g, d, c, and k” in the d-helix. The periodic interaction pattern, consisting of three distinct packing layers between the two heterochiral hendecad motifs, is illustrated by overlaying the two helical net diagrams (Fig. 6c, d, the middle panel). The interface residues from the two heterochiral helices form parallel layers in space and interlock through repetitive knobs-into-holes interactions, resembling two coupled gears revolving in opposite directions. By contrast, the helix–helix interactions of homochiral untwisted parallel or antiparallel α-helices33 are more complex than those of heterochiral α-helices. These interactions exhibit a periodic pattern but form intersecting layers, as illustrated in the helical net diagrams (Supplementary information, Fig. S15).
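The link between the measured periodicities and the 11-residue repeat can be checked with a few lines of arithmetic. The sketch below computes how far a residue has rotated around the helix axis after 11 residues (hendecad) and, for comparison, after 7 residues (heptad); the function name is hypothetical, and the sign convention ignores helix handedness, which does not affect the magnitude of the drift.

```python
def angular_drift(residues_per_turn, n_residues):
    """Angular offset in degrees, mapped to (-180, 180], between residue i
    and residue i + n_residues when viewed down the helix axis."""
    angle = (360.0 / residues_per_turn * n_residues) % 360.0
    return angle - 360.0 if angle > 180.0 else angle

# Periodicities reported in the text for the three interacting helices.
helices = [("l-Pep-1", 3.64), ("d-19437-H2", 3.68), ("d-19437-H3", 3.70)]

for name, periodicity in helices:
    hendecad = angular_drift(periodicity, 11)
    heptad = angular_drift(periodicity, 7)
    print(f"{name}: drift after 11 residues = {hendecad:.1f} deg, "
          f"after 7 residues = {heptad:.1f} deg")
```

For all three helices the offset after 11 residues stays within about 10° of the starting position, whereas the offset after 7 residues is several times larger. This is why positions a–k of the hendecad, rather than a heptad repeat, line up along these nearly untwisted helices.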
To our knowledge, this is the first experimental evidence clearly demonstrating the predicted mode of interaction in a heterochiral helical protein complex, as initially proposed by Crick in 1953.34 Additionally, previous studies have shown that heterochiral hendecad motifs play a role in the packing of helices in racemic crystals of helical peptides.21,35 However, it should be noted that these studies did not measure the binding affinity between the two racemic peptides and the observed interface may be a result of the racemic crystal packing. Our results show that hendecad motifs contribute to the heterochiral helix–helix packing in a heterochiral protein complex.
Discussion
The de novo design of d-proteins to target specific natural l-proteins poses a rigorous challenge due to the limited availability of structural information regarding such heterochiral protein complexes. However, our study demonstrates that the computational method we developed is sufficiently accurate. By successfully designing d-protein binders that effectively interact with defined surface regions of target l-proteins and by achieving agreement between the crystal structure and the design model of the d-19437–l-Pep-1 complex, we have shown that interactions between l-proteins and d-proteins can now be designed with atomic-level precision. To our knowledge, this is the first demonstration of the accurate de novo design of heterochiral protein complexes. Our mirror-image design approach, which involves initially designing and characterizing l-protein binders for the d-protein form of a natural target in a high-throughput manner and subsequently converting the most effective l-protein binders to d-protein binders through chemical synthesis, has proven to be robust and efficient.
Our analysis and results, in combination with previous studies, indicate that the protein motifs governing heterochiral protein–protein interactions, such as hendecad motifs21 and rippled-sheet motifs,36 differ significantly from those responsible for homochiral protein–protein interactions. However, similar to l-protein–l-protein interactions,37 d-protein binders tend to optimize the overall impact of chemical and geometric complementarity in order to bind natural l-protein targets. This is achieved by forming extensive van der Waals packing and electrostatic interactions across the heterochiral protein–protein interface. Our study represents a significant advancement towards a comprehensive understanding of the structural principles underlying heterochiral protein–protein interactions, which could ultimately enhance the success rate of protein design.
The d-protein binder molecules designed in our study were limited to a maximum of 65 amino acids, but they were easily synthesizable and folded into stable structures. Although the protein targets in our research were up to 185 residues in length, our approach has the potential to be applied to even larger protein targets, including transmembrane proteins. For example, a four-transmembrane-domain transporter can be chemically synthesized.38 Additionally, recent advancements in the robust, fast, and on-demand synthesis of proteins offer the possibility of efficiently producing d-protein molecules.39 The chemical synthesis of proteins not only allows for the incorporation of d-amino acids but also provides the opportunity to incorporate other non-canonical amino acids with unique chemical and geometric properties.
The accurate design of d-proteins targeting l-proteins forms the basis for a diverse range of applications in the development of molecular tools, therapeutics, and diagnostics. In future investigations, a variety of custom-designed proteins have the potential to facilitate mirror-image biology40,41 by connecting the natural realm with its mirrored counterpart. Numerous intriguing possibilities emerge, such as the development of designer l-proteins capable of catalyzing chemical reactions of mirror-image substrates, as well as d-proteins designed to recognize or modify naturally existing DNA or RNA molecules.
Materials and methods
Computational design
The crystal structures of human nerve growth factor in complex with the ligand-binding domain of the TrkA receptor (PDB ID: 1WWW) and of interleukin-6 in complex with a camelid Fab fragment (PDB ID: 4O9H) were refined with the phenix.rosetta_refine program.42 The structure of l-Pep-1 was predicted by using AlphaFold, and 5 models were generated. The TrkA D5 domain, IL-6 and the predicted model of l-Pep-1 were used for subsequent steps. All the l-protein target structures were mirrored through the y-z plane by inverting the sign of the x coordinates to obtain the structures of the d-targets for docking and design. The design protocol was slightly modified from that used for the design of homochiral protein–protein interfaces.7 Briefly, disembodied l-amino acids were docked against the desired binding surface on the d-protein target to generate the rotamer interaction field (RIF),7 with favorable hydrogen-bonding or non-polar interactions. For rapid searching of rotamers aligning with a given miniprotein scaffold, all the rotamer ensembles were stored in a six-dimensional hash table. Miniprotein scaffolds were mutated to poly-valine first and docked against the d-targets by using PatchDock,43 and the identified seeding positions were further refined by RifDock.27 Miniprotein scaffolds in the library were docked into the RIF using a branch-and-bound search strategy to generate ~1 × 10^5 docked models. The Rosetta FastDesign protocol26 was used to optimize the interfaces of the docked models. During the design stage, we allowed for the design of the previously generated RIF residues. Computational metrics of all the designed models were calculated by using Rosetta. A maximum likelihood estimator was used to assess the designed models generated during the interface design stage. For the MotifGraft9 stage, ~1000 motifs interacting with the d-protein target were selected and miniproteins were grafted onto these motifs to generate ~1–10 million designs for every d-target.
The heterochiral interfaces of the grafted models were further optimized by using Rosetta FastDesign protocol. 4642–14,000 designer binders for d-TrkA, d-IL-6 and d-Pep-1 were selected based on the computational interface metrics for experimental evaluation. Specifically, there were 7703 binders generated for Pep-1, 13,158 binders for IL-6, and 4642 binders for TrkA. During RifDock and interface design, we used the Rosetta energy function “beta”,44,45 which is suitable for calculating the energetics for d-amino acids in cyclized peptides.46,47,48,49,50
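The mirror-imaging step described above (negating the x coordinates) can be illustrated with a minimal sketch. The helper names and the four-atom coordinates below are hypothetical and are not taken from the deposited structures; the point is only that reflection through the y-z plane preserves every interatomic distance while inverting chirality, detected here as a sign change of a signed tetrahedral volume.

```python
import math
from itertools import combinations

def mirror_through_yz(coords):
    """Reflect coordinates through the y-z plane by negating x."""
    return [(-x, y, z) for x, y, z in coords]

def signed_volume(a, b, c, d):
    """Six times the signed volume of tetrahedron a-b-c-d (scalar triple product).
    Its sign encodes handedness and flips under reflection."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def pairwise_distances(coords):
    """All interatomic distances, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(coords, 2)]

# Hypothetical coordinates (in angstroms) for four atoms around a chiral centre.
l_atoms = [(0.0, 0.0, 0.0), (1.46, 0.0, 0.0),
           (-0.55, 1.42, 0.0), (-0.53, -0.77, -1.20)]
d_atoms = mirror_through_yz(l_atoms)

print(signed_volume(*l_atoms), signed_volume(*d_atoms))
print(pairwise_distances(l_atoms) == pairwise_distances(d_atoms))
```

Because all distances are preserved, any interface geometry designed against the mirrored target maps back onto the natural target when the binder is itself mirrored, which is the symmetry argument underlying the mirror-image strategy.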
Chemical synthesis
Materials
Rink Amide AM resins (0.27 mmol/g and 0.55 mmol/g loading) were purchased from Tianjin Nankai HECHENG S&T Co., Ltd (Tianjin, China). Fmoc-protected amino acids, Fmoc-d-protected amino acids (Fmoc-d-Ala-OH, Fmoc-d-Arg(Pbf)-OH, Fmoc-d-Asn(Trt)-OH, Fmoc-d-Asp(OtBu)-OH, Fmoc-d-Cys(Trt)-OH, Fmoc-d-Gln(Trt)-OH, Fmoc-d-Glu(OtBu)-OH, Fmoc-Gly-OH, Fmoc-d-His(Trt)-OH, Fmoc-d-Ile-OH, Fmoc-d-Leu-OH, Fmoc-d-Lys(Boc)-OH, Fmoc-d-Met-OH, Fmoc-d-Phe-OH, Fmoc-d-Pro-OH, Fmoc-d-Ser(OtBu)-OH, Fmoc-d-Thr(OtBu)-OH, Fmoc-d-Trp(Boc)-OH, Fmoc-d-Tyr(OtBu)-OH, Fmoc-d-Val-OH), Fmoc-beta-Ala-OH, Fmoc-(Dmb)Gly-OH, Fmoc-d-Glu(OAllyl)-OH, Fmoc-d-Cys(Acm)-OH and 1-Ethynyl-4-(4-pentylcyclohexyl)cyclohexanol (RBM) were purchased from Jiangsu ShenLang Biotech Co., Ltd (Nantong, China). N,N-dimethylformamide (DMF), triisopropylsilane (TIPS), trifluoroacetic acid (TFA), anisyl sulfide, PdCl2, Pd(OAc)2, 2,2’-[azobis(1-methylethylidene)]bis[4,5-dihydro-1H-imidazole] dihydrochloride (VA-044), pentanedione and DL-dithiothreitol (DTT) were purchased from J&K Scientific Ltd (Beijing). Dichloromethane (DCM), N,N-diisopropyl-carbodiimide (DIC), ethyl cyanoglyoxylate-2-oxime (Oxyma), N,N-diisopropylethylamine (DIEA), tris(2-carboxyethyl)phosphine hydrochloride (TCEP·HCl), guanidine hydrochloride (GdmCl), methanol and sodium chloride (NaCl) were purchased from Shanghai Titan Scientific Co., Ltd. 1,2-Ethanedithiol (EDT) was purchased from TCI (Shanghai, China) Development Co., Ltd. 4-Mercaptophenylacetic acid (MPAA) was purchased from Alfa Aesar. Sodium nitrite (NaNO2) was purchased from Beijing Chemical Works (Beijing, China). Piperidine was purchased from Sinopharm Chemical Reagent Co., Ltd. Ethyl ether was purchased from Modern Oriental (Beijing) Technology Development Co., Ltd. Acetonitrile was purchased from Mallinckrodt Baker, Inc.
Disodium hydrogen phosphate, O-(6-chlorobenzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HCTU) and sodium dihydrogen phosphate were purchased from Shanghai Bidepharmatech Co., Ltd. t-Butyl mercaptan was purchased from Acros Organics. Concentrated hydrochloric acid was purchased from Beijing Tongguang Fine Chemicals Company. Boc-GABA-OH was purchased from GL Biochem (Shanghai) Ltd. Triphenylphosphine-3,3’,3”-trisulfonic acid trisodium salt (TPPTS) was purchased from Shanghai Aladdin Bio-Chem Technology Co., LTD. NaBH4 was purchased from Sigma-Aldrich Co.
Peptide segment synthesis
All peptides were made by standard Fmoc solid-phase peptide synthesis (Fmoc SPPS) and were synthesized automatically on a Liberty Blue microwave peptide synthesizer (CEM Corporation). Hydrazide resin was used to prepare peptide fragments with a C-terminal hydrazide, and amide resin was used to prepare peptide fragments with a C-terminal amide. First, the Fmoc protecting group was removed at 90 °C for 1 min using a solution of 10% piperidine and 0.1 M Oxyma in DMF. The resin was then washed 3 times with DMF. The resin (0.25 mmol), 4 equivalents (eq.) of Fmoc-protected amino acid (0.2 M, 5 mL, dissolved in DMF), 4 eq. Oxyma (1 M, 1 mL, dissolved in DMF), and 4 eq. DIC (0.5 M, 2 mL, dissolved in DMF) were mixed and coupled under microwave heating at 90 °C for 2 min. Each cycle concluded with washing the resin three times with DMF. At the end of the synthesis, the peptide was cleaved from the resin by treatment for 3 h with a TFA cleavage cocktail (TFA/TIPS/thioanisole/water/EDT, 82.5:5:5:5:2.5, v/v/v/v/v). The TFA solution was then concentrated under a stream of nitrogen, and the peptide was precipitated with ice-cold diethyl ether and centrifuged, after which the supernatant was decanted. This process was repeated three times. The resulting precipitate, a crude peptide, was subsequently purified using RP-HPLC to obtain the pure peptide of interest.
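The coupling stoichiometry can be sanity-checked with simple arithmetic. The sketch below assumes the stock solutions are molar (0.2 M amino acid, 1 M Oxyma, 0.5 M DIC), since that is what delivering 4 eq. on a 0.25 mmol scale in the stated volumes requires; the function name is hypothetical.

```python
def stock_volume_ml(scale_mmol, equivalents, stock_molar):
    """Volume of stock solution (mL) that delivers `equivalents` of reagent
    relative to the resin scale. 1 M equals 1 mmol/mL, so mmol / M gives mL."""
    return scale_mmol * equivalents / stock_molar

RESIN_SCALE_MMOL = 0.25  # synthesis scale stated in the text
EQUIVALENTS = 4          # excess of each reagent stated in the text

volumes_ml = {
    "Fmoc-amino acid (0.2 M, assumed)": stock_volume_ml(RESIN_SCALE_MMOL, EQUIVALENTS, 0.2),
    "Oxyma (1 M, assumed)": stock_volume_ml(RESIN_SCALE_MMOL, EQUIVALENTS, 1.0),
    "DIC (0.5 M, assumed)": stock_volume_ml(RESIN_SCALE_MMOL, EQUIVALENTS, 0.5),
}

for reagent, volume in volumes_ml.items():
    print(f"{reagent}: {volume:.1f} mL")
```

Under these assumptions the computed volumes reproduce the 5 mL, 1 mL and 2 mL quoted for the amino acid, Oxyma and DIC solutions, respectively.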
RP-HPLC purification
Reversed-phase HPLC was performed on a Shimadzu Prominence HPLC system. The mobile phase on pump A was acetonitrile with 0.1% TFA, and the mobile phase on pump B was deionized water with 0.1% TFA. Gradients were optimized separately for each peptide segment and each reaction purification. The crude peptide was dissolved in 50% acetonitrile in water with 0.1% TFA and filtered through a 0.22-µm filter; after purification, the collected solution was lyophilized to obtain pure peptide powder.
N-terminal biotin modification
The resin from the synthesis above was mixed with 4 eq. d-biotin (0.5 mM, 2 mL, dissolved in DMF), 4 eq. Oxyma and 4 eq. DIC, and coupled under microwave heating at 90 °C for 2 min. This process was repeated three times.
RBM peptide synthesis38
The amino acid position where the RBM group needs to be introduced was first synthesized using the microwave. The resin was then transferred to the reaction tube, where 2 eq. RBM was added and stirred for 40 min. This step was repeated twice. After the reaction, the resin was washed alternately with DMF and DCM. Next, 5 eq. NaBH4 (dissolved in DMF) was mixed with the resin for 10 min, repeating this step twice, followed by washing the resin. Finally, 4 eq. Fmoc-d-protected amino acids, 4 eq. Oxyma and 4 eq. DIC were mixed with the resin and left to react overnight. The resin was then washed. The resin was transferred to the microwave to continue coupling the rest of the sequence. Next, the resin was moved to the reaction tube, and SnCl2 (10 mg/mL, 10 μL concentrated HCl, dissolved in DMF) was added to reduce the -NO2 group. The Arg tag ((beta)Ala-Arg-Arg-Arg-Arg-Boc) was then coupled in the microwave. After piperidine treatment for 30 min, 10 eq. Boc-GABA-OH, 4 eq. Oxyma and 4 eq. DIC were added for overnight coupling.
Removal of GABA
Crude peptide fragments were dissolved in 6 M GdmCl solution (pH 7), and the reaction was allowed to proceed at room temperature for 10 min.
MPAA thioester peptide synthesis51
Crude peptide fragments (10 mg/mL) were dissolved in 6 M GdmCl solution (pH 2.3), and 10 eq. MPAA and 5 eq. pentanedione were added. The pH was adjusted to 2–2.5, and the reaction was allowed to proceed at room temperature for 2 h.
Native chemical ligation (NCL)52
The hydrazide peptide fragment (1 mM) was dissolved in 6 M GdmCl solution (pH 2.3). In an ice-salt bath at –20 °C, 10 eq. NaNO2 (0.5 M) was added, and oxidation was allowed to proceed for 20 min. Then, 50 eq. MPAA and 1.1 eq. of the N-terminal Cys peptide fragment were added. The pH was adjusted to 6.3–6.5 using a glass electrode, and the reaction was allowed to proceed at room temperature overnight.
Desulfurization53
The peptide fragment (1 mM) was dissolved in a solution of 6 M GdmCl and 0.5 M TCEP (pH 7), and 100 eq. tBuSH and 100 eq. VA044 were added. The pH was adjusted to 7–7.2, and the reaction was allowed to proceed at room temperature for 3 h.
One-pot ligation and desulfurization
The MPAA thioester peptide fragment (1 mM) and the N-terminal Cys peptide fragment (1.1 mM) were dissolved in 6 M GdmCl (pH 7). The pH was adjusted to 6.3–6.5, and the reaction was allowed to proceed at room temperature overnight. The mixture was then diluted threefold with a 300 mM TCEP solution (pH 7), and 10% (v/v) tBuSH and 40 eq. VA044 were added. The pH was adjusted to 7–7.2, and the reaction was allowed to proceed at room temperature for 3 h.
Removal of Acm54
The peptide fragment (1 mM) was dissolved in 6 M GdmCl (pH 7), and 15 eq. PdCl2 and TCEP (5 mg/mL) were added. The pH was adjusted to 7–7.2, and the reaction was allowed to proceed at room temperature for 2 h. Finally, DTT was added to quench the reaction.
Removal of Allyl55
The peptide fragment (1 mM) was dissolved in NCL solution (6 M GdmCl, 100 mM MPAA, 40 mM TCEP, pH 7), and 3 eq. [Pd(Allyl)Cl]2 was added. The pH was adjusted to 7–7.2, and the reaction was allowed to proceed at room temperature for 2 h. Finally, DTT was added to quench the reaction.
Removal of RBM38
The RBM group on the peptide was cleaved with a TFA cocktail (TFA/TIPS/water, 95:2.5:2.5, v/v/v) for 3 h. The TFA solution was then concentrated under nitrogen agitation, followed by the addition of acetonitrile in water. The solution was subsequently lyophilized.
Synthetic route of d-19437l-Pep-1
d-19437l-Pep-1 was obtained by standard Fmoc SPPS. The peptide was eluted from a C18 column (4.6 × 250 mm) using a linear gradient of 30%–70% Buffer A in B over 30 min. The RP-HPLC isolation yield was 5% (Supplementary information, Fig. S9).
Synthetic route of d-25367l-IL-6
d-25367l-IL-6 was obtained by standard Fmoc SPPS. The peptide was eluted from a C18 column (4.6 × 250 mm) using a linear gradient of 30%–80% Buffer A in B over 30 min. The RP-HPLC isolation yield was 4.5% (Supplementary information, Fig. S9).
Synthetic route of d-57445l-TrkA
d-57445l-TrkA was obtained by standard Fmoc SPPS. The peptide was eluted from a C18 column (4.6 × 250 mm) using a linear gradient of 30%–80% Buffer A in B over 30 min. The RP-HPLC isolation yield was 6.7% (Supplementary information, Fig. S9).
Protein purification
Expression, purification and refolding of TrkA
The D5 domain of human TrkA (l-TrkA, residues 283–384) was cloned into the pET-21a vector with an N-terminal 6× His tag. The construct was transformed into competent Lemo21(DE3) cells. For l-TrkA, the E. coli cells were grown in LB medium at 37 °C until the OD600 reached 0.6–0.8. IPTG was added to a final concentration of 1 mM and the cells were incubated for 3–4 h at 37 °C. Collected cells were resuspended in 10% glycerol and sonicated, and the inclusion bodies were isolated and washed successively with buffer A (1% Triton X-100, 10 mM Tris-HCl, pH 8.0, 1 mM EDTA), buffer B (1 M NaCl, 10 mM Tris-HCl, pH 8.0, 1 mM EDTA) and buffer C (10 mM Tris-HCl, pH 8.0, 1 mM EDTA). The purified inclusion bodies were dissolved in buffer D (8 M urea, phosphate-buffered saline (PBS), pH 7.4, 30 mM imidazole). The supernatant was collected after centrifugation (13,000× g for 1 h) and applied to Nickel Sepharose 6 Fast Flow resin for affinity purification. The eluate was diluted in buffer D with 2.5 mM DTT to a final protein concentration of 0.1 mg/mL. After dialysis against buffer E (20 mM Tris-HCl, pH 8.5, 50 mM NaCl) for 24 h, the protein was concentrated and further purified by SEC (Superdex 75 Increase). All protein samples were characterized by SDS-PAGE, with purity greater than 95%. d-TrkA was refolded using the same protocol.
Expression, purification and refolding of IL-6
The gene encoding human IL-6 (residues 28–212, with an additional methionine at the N-terminus) was cloned into the pET-28b vector, and the resulting plasmid was transformed into competent Lemo21(DE3) cells. E. coli cells were grown in LB medium at 37 °C until the OD600 reached 1.0. IPTG was added to a final concentration of 1 mM and the cells were incubated at 37 °C for 3–4 h. Cells were harvested by centrifugation at 10,000× g. The pellet was resuspended in PBS (pH 7.4) and disrupted by sonication. The lysates were centrifuged at 20,000× g for 1 h at 4 °C. The pellets containing the IL-6 inclusion bodies were resuspended in a buffer containing 6 M GdmCl, 0.1 M Tris-HCl, pH 8.0, and 30 mM imidazole, and then centrifuged at 20,000× g for 1 h at 4 °C. The supernatant was applied to nickel resin (Ni Sepharose 6 Fast Flow, Absin) for affinity purification. Purified protein in a buffer containing 6 M GdmCl, 0.1 M Tris-HCl, pH 8.0, and 300 mM imidazole was incubated with 10 mM dithiothreitol (final concentration) at room temperature for 1 h, and was then diluted into PBS (pH 7.4) containing 2 mM reduced glutathione and 0.2 mM oxidized glutathione to a final protein concentration of 0.25 mg/mL. The mixture was incubated at room temperature for 2 h, followed by dialysis against PBS buffer (pH 7.4) at 4 °C for 24 h. The dialyzed protein was concentrated and purified by SEC (Superdex 200 Increase column). d-IL-6 was refolded using the same protocol.
Expression and purification of l-protein binders
The genes encoding binders l-19437d-Pep-1, l-57445d-TrkA and l-25367d-IL-6 were amplified from the corresponding yeast library, cloned into the pET-28b vector containing an N-terminal 6× His tag and a TEV protease cleavage site, and sequenced. The constructs were transformed into competent Lemo21(DE3) cells. Cells were grown in LB medium at 37 °C until the OD600 reached 0.6, and protein expression was induced at 18 °C overnight by addition of 0.1 mM IPTG (final concentration). Harvested cells in PBS buffer containing 500 mM NaCl (pH 7.4) were lysed by sonication and centrifuged. The protein in the supernatant was extracted using a gravity flow column containing 1 mL Ni Sepharose 6 Fast Flow resin (Cytiva Life Sciences), purified by SEC (ÄKTA pure, Cytiva Life Sciences) on a Superdex 75 Increase 10/300 GL column (Cytiva Life Sciences) in PBS buffer (pH 7.4), and verified by SDS-PAGE. Peak fractions were pooled, concentrated, snap-frozen in liquid nitrogen and stored at –80 °C.
Folding of d-protein binders
Chemically synthesized d-19437l-Pep-1, d-57445l-TrkA, and d-25367l-IL-6 were dissolved in denaturing buffer containing 6 M GdmCl and 100 mM Tris-HCl (pH 8.0) to 0.1 mg/mL and dialyzed against TBS buffer (pH 7.4) for 6 h. The proteins were then dialyzed against fresh TBS buffer for another 12 h. The dialyzed sample was concentrated, purified by SEC on a Superdex 75 Increase 10/300 GL column (Cytiva Life Sciences) in PBS buffer (pH 7.4) and verified by SDS-PAGE. Peak fractions were pooled, concentrated, snap-frozen in liquid nitrogen and stored at –80 °C.
Biochemical and biophysical characterization
CD
CD data were collected in a 0.5-mm path-length cuvette on a Chirascan V100 CD spectrometer (Applied Photophysics). CD spectra of targets were measured from 260 nm to 180 nm in triplicate and averaged. For designer binders, CD spectra were recorded at various temperatures from 25 °C to 95 °C. Temperature melts were conducted in 2 °C steps (heating rate of 2 °C/min) by measuring the signal at a wavelength of 222 nm. The concentrations of proteins measured by CD were in the range of 0.1–0.4 mg/mL.
Biolayer interferometry
Biolayer interferometry binding data were collected and processed using an Octet RED96e (ForteBio). All target proteins (5–10 μg/mL) were biotinylated and immobilized onto streptavidin-coated biosensors (SA, Sartorius) in a binding buffer containing PBS (pH 7.4) with 0.02% TWEEN. The biosensors were equilibrated in binding buffer for 60 s, dipped into binder solution for 180 s or 300 s (association step), and then dipped into binding buffer (dissociation step). The experiment was carried out at 25 °C. In the cross-reactivity assay, binding of each miniprotein binder (1 μM) to l-Pep-1, l-TrkA and l-IL-6 was tested. All experiments were repeated twice with similar results. The data were analyzed using the global fitting algorithm provided in the Octet data analysis software, DataAnalysisHT.
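For orientation, the association and dissociation steps described above can be sketched with a simple 1:1 binding model. The source does not state which binding model the Octet global fit used, so the 1:1 model, the function names and all rate constants below are assumptions chosen for illustration only.

```python
import math

def association_response(t, conc, kon, koff, rmax):
    """Sensor response at time t (s) after dipping into binder at conc (M)."""
    kd = koff / kon                      # equilibrium dissociation constant (M)
    r_eq = rmax * conc / (conc + kd)     # plateau response at this concentration
    k_obs = kon * conc + koff            # observed association rate (1/s)
    return r_eq * (1.0 - math.exp(-k_obs * t))

def dissociation_response(t, r0, koff):
    """Response at time t (s) after moving the sensor back into buffer."""
    return r0 * math.exp(-koff * t)

# Hypothetical parameters (not taken from the study)
KON = 1.0e5    # 1/(M s)
KOFF = 1.0e-3  # 1/s, giving KD = 10 nM
RMAX = 1.0     # arbitrary response units
CONC = 1.0e-6  # 1 uM binder, as in the cross-reactivity assay

r_end = association_response(180.0, CONC, KON, KOFF, RMAX)  # end of a 180 s association step
r_after = dissociation_response(300.0, r_end, KOFF)         # after 300 s in buffer
```

In a global fit, one set of `kon`, `koff` and `rmax` values is shared across all binder concentrations, which is why several concentrations are usually measured for each target.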
ITC
Binding affinity was determined by ITC on a MicroCal PEAQ-ITC (Malvern) using a 19-injection method following the instrument's instructions. The first injection was 0.4 μL and each subsequent injection was 2 μL, with a spacing of 150 s between injections. Results were analyzed using the MicroCal PEAQ-ITC Analysis Software. The l-19437/d-Pep-1 and d-19437/l-Pep-1 titrations were evaluated using the One Set of Sites model, while the d-25367-evo/l-IL-6 titration was evaluated using the Two Sets of Sites model. The first injection was excluded from the analysis by default. In the ITC titrations, l-19437 (15 μM) and d-Pep-1 (150 μM), along with d-25367-evo (575 μM) and l-IL-6 (57.5 μM), were used in PBS at pH 7.4.
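The injection (drop) schedule above fixes the total titrant volume, and with it the final molar ratio in the cell. The sketch below works through that arithmetic for the l-19437/d-Pep-1 titration. It assumes that the more concentrated partner (150 μM) is in the syringe, that the cell holds a nominal 200 μL, and that volume displacement can be neglected; none of these points is stated in the text, and the function names are hypothetical.

```python
def injection_volumes(n_injections=19, first_ul=0.4, rest_ul=2.0):
    """Volumes (uL) of each injection: one small first injection, then equal ones."""
    return [first_ul] + [rest_ul] * (n_injections - 1)

def molar_ratios(volumes_ul, syringe_um, cell_um, cell_volume_ul):
    """Cumulative titrant/cell molar ratio after each injection (no displacement correction)."""
    ratios = []
    injected_ul = 0.0
    for v in volumes_ul:
        injected_ul += v
        ratios.append(syringe_um * injected_ul / (cell_um * cell_volume_ul))
    return ratios

volumes = injection_volumes()
total_ul = sum(volumes)  # 0.4 + 18 * 2.0 uL

# Assumed layout: 150 uM titrant in the syringe, 15 uM partner in a 200 uL cell
ratios = molar_ratios(volumes, syringe_um=150.0, cell_um=15.0, cell_volume_ul=200.0)
```

Under these assumptions the titration delivers 36.4 μL in total and ends at a molar ratio of about 1.8.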
MST
MST experiments were conducted using a NanoTemper Monolith equipped with NT.115 capillaries. The MST curves were acquired under identical conditions, with 50 nM l-TrkA in PBS containing 0.05% Tween 20 at pH 7.4. A total of 16 ligand concentrations were used for each measurement, ranging from 122 pM to 40 μM for d-57445-evol-TrkA. These concentrations were obtained by twofold serial dilution. The laser power of the MST experiment was set to 40%, and the acquired data were analyzed using MO.Control v1.6.1 acquisition software.
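A twofold serial dilution of the kind described above can be generated as follows. This is a minimal sketch: the function name `serial_dilution` is hypothetical, and the 40 μM starting point is taken from the stated upper limit of the range.

```python
def serial_dilution(top_conc, factor=2.0, n_points=16):
    """Concentrations of an n-point serial dilution, highest first."""
    if factor <= 1.0 or n_points < 1:
        raise ValueError("factor must exceed 1 and n_points must be positive")
    return [top_conc / factor**i for i in range(n_points)]

series_nm = serial_dilution(40_000.0)  # 40 uM expressed in nM
lowest_nm = series_nm[-1]              # 40,000 nM / 2**15
```

With these inputs the lowest point is about 1.22 nM. A 16-point twofold series that ends at 122 pM would start from 4 μM, so the exact series used is best taken from the original data; the sketch shows only the arithmetic.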
Co-migration on SEC
The interactions between the targets (l-Pep-1, l-TrkA and l-IL-6) and their cognate binders (d-19437, d-57445 and d-25367) were examined by SEC. Solutions of targets, binders and target–binder mixtures were each applied to SEC. l-Pep-1, d-19437, l-Pep-1 + d-19437, l-TrkA, d-57445 and l-TrkA + d-57445 were separated on a Superdex 75 Increase 10/300 column. l-IL-6, d-25367 and l-IL-6 + d-25367 were applied to a Superdex 200 Increase 10/300 column. Co-migration of proteins was examined by SDS-PAGE.
Protease stability assay
In the digestion system, the final concentrations of l- and d-proteins were adjusted to 0.2 mg/mL. The final concentration of trypsin (Genom) was 2.2 mg/mL in the Hank’s Balanced Salt Solution with 0.02% EDTA. The final concentration of Pepsin (Aladin) was 0.22 mg/mL in a buffer containing 0.1 M Glycine, pH 2.5. Digestion was performed at 37 °C.
Yeast display
Library preparation
All designer sequences were extended to 65 amino acids using a (GS)n linker after the coding sequences of the designs for better polymerase chain reaction (PCR) amplification. The protein sequences were reverse-translated and codon-optimized using DNAworks 2.056 for expression in Saccharomyces cerevisiae.7
Gene pools
Oligonucleotide pools (Agilent) were constructed following a published protocol.9 Briefly, genes were flanked with a common 18-bp adaptor at 5′ and with a 17-bp adaptor at 3′ for amplification. The oligonucleotide pools were amplified with 2× PCR Master Mix (KOD One) using extension primers to add one pETCON vector homologous recombination segment (40 bp) to each end. Amplified pools were loaded on a 1.5% agarose gel and the bands of the expected size were extracted (ComWin Biotech Gel Extraction Kit). The pETCON backbone was linearized by PCR and gel extracted, producing 3–4 μg DNA. S. cerevisiae EBY100 cells were transformed with the library DNA and linearized pETCON vector using an established protocol,57 but the amount of library DNA was adjusted to 4 μg instead of 16 μg. After transformation (at least 1 × 10⁷ transformants), yeast cells were grown overnight in 60 mL SDCAA medium at 30 °C and were concentrated to OD600 10, followed by mixing with an equal volume of 50% (v/v) glycerol and storage in 2 mL aliquots at –80 °C.
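The expected fragment sizes follow from the numbers given above: 65 residues per design, an 18-bp and a 17-bp adaptor, and a 40-bp homology segment at each end. The short sketch below works through the arithmetic. It assumes that each oligonucleotide carries exactly 65 codons with no stop codon or extra bases, and that the extension primers add the full 40 bp beyond the adaptors; the function names are hypothetical.

```python
CODON_NT = 3  # nucleotides per codon

def oligo_length(n_residues, adaptor_5=18, adaptor_3=17):
    """Length (nt) of one pooled oligonucleotide: coding region plus both adaptors."""
    return n_residues * CODON_NT + adaptor_5 + adaptor_3

def amplicon_length(oligo_nt, homology_arm=40):
    """Length (bp) after extension PCR adds one homology segment to each end."""
    return oligo_nt + 2 * homology_arm

oligo_nt = oligo_length(65)            # 65 * 3 + 18 + 17
insert_bp = amplicon_length(oligo_nt)  # oligo length + 2 * 40
```

Under these assumptions each pooled oligonucleotide is 230 nt long and the gel band to extract is about 310 bp.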
Yeast display and deep sequencing
Yeast display was based on an established protocol.58 1 × 10⁷ yeast cells were inoculated into 10 mL SDCAA medium and were grown at 30 °C until OD600 reached 2–5. Cells were then centrifuged at 3000× g for 3 min, resuspended in 10 mL SGCAA, and induced at 20 °C for 20–28 h. Then, yeast cells were concentrated to 10⁸ cells/mL and incubated with 1:250 diluted anti-c-Myc fluorescein isothiocyanate (FITC, Miltenyi Biotech) and biotinylated targets of various concentrations. Next, the cells were incubated with 1:100 diluted streptavidin-phycoerythrin (SAPE, R&D Systems). The cells were washed with 0.5 mL PBSF (PBS with 0.1% (w/v) bovine serum albumin (BSA)) right before and after each incubation step. The labeled cells were sorted on a sorter (BD Melody). The concentrations of targets were: 100 nM and 10 nM for 2 rounds of sorting of d-Pep-1 binders; 5 μM and 500 nM for 2 rounds of sorting of d-TrkA binders; 1 μM, 100 nM, 10 nM and 1 nM for 4 rounds of sorting of d-IL-6 binders. 2 × 10³–5 × 10⁴ cells with the strongest double fluorescence signal were collected after each round and were grown in 5 mL SDCAA medium at 30 °C until OD600 reached 2–5; at least 1 × 10⁸ cells were spun down at 13,000× g for 1 min and stored as cell pellets at –80 °C for plasmid extraction. Yeast plasmids were extracted (TIANprep Yeast Plasmid DNA Kit) and eluted in 70 μL distilled water, serving as templates for deep sequencing. Deep sequencing libraries with library-specific barcodes were prepared by PCR amplification for 22–30 cycles in a 50 μL KOD One reaction system. The PCR products were separated on a 1.5% agarose gel and extracted (ComWin Biotech Gel Extraction Kit). Purified DNA was mixed and sent for deep sequencing (Novogene). For the d-Pep-1 pool, after sorting against d-Pep-1 at concentrations of 100 nM and 10 nM, we identified 140 and 11 diverse sequences, respectively.
For the d-TrkA pool, sorting was performed at d-TrkA concentrations of 5000 nM and 500 nM, which yielded 85 and 37 distinct sequences, respectively. After sorting at d-IL-6 concentrations of 1000 nM, 100 nM, 10 nM and 1 nM, the enriched populations contained 431, 143, 65 and 61 different sequences, respectively.
Entropy score
The entropy score for each position of the binder was calculated as follows:
\(x_{i,j}\) represents a mutant protein binder with the ith residue mutated to j (one of the 20 amino acids). \(P(x_{i,j})\) is the observed frequency of \(x_{i,j}\) from the NGS data. \(K\) is the length of the protein binder. For the d-TrkA binder l-57445d-TrkA, NGS data from the 1st round of selection in the site saturation mutagenesis experiment were used to calculate the entropy score. Positions with lower entropy scores are more conserved, and vice versa.
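As an illustration of a per-position entropy score, the sketch below computes the Shannon entropy of the amino acid frequencies observed at each position. It assumes that counts are normalized within each position and that the natural logarithm is used. Both choices, the function names and the toy counts are assumptions for illustration and may differ from the normalization used in the study.

```python
import math

def position_entropy(counts):
    """Shannon entropy (in nats) of one position, from {amino_acid: NGS count}."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("position has no observations")
    entropy = 0.0
    for n in counts.values():
        if n > 0:  # a residue that was never observed contributes nothing
            p = n / total
            entropy -= p * math.log(p)
    return entropy

def entropy_profile(counts_by_position):
    """Entropy score for every position of the binder (list index = position)."""
    return [position_entropy(c) for c in counts_by_position]

# Toy counts for a three-residue binder (hypothetical data)
toy_counts = [
    {"L": 100},                                 # only one residue tolerated
    {"A": 50, "S": 50},                         # two residues equally tolerated
    {aa: 5 for aa in "ACDEFGHIKLMNPQRSTVWY"},   # all 20 residues equally tolerated
]
profile = entropy_profile(toy_counts)
```

In the toy data the score rises from 0 at the fully conserved position to ln 20 at the fully permissive one, matching the reading that lower scores mark more conserved positions.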
Combo library
To introduce combinatorial mutations, degenerate codons were used to encode different amino acids at each beneficial position, based on the results of the SSM analysis. The DNA fragments were generated by PCR with library primers carrying the degenerate codons. The primers were mixed at a ratio that ensured each encoded amino acid appeared with equal frequency. DNA insert fragments were then produced by PCR amplification with primers that overlapped the linearized pETCON vector. The insert fragments were purified by gel electrophoresis and combined with the linearized pETCON vector for transformation.
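To see why primer ratios matter, it helps to expand a degenerate codon into the codons it covers and count how often each amino acid is encoded. The sketch below does this with the standard genetic code and IUPAC nucleotide codes. The specific degenerate codons used in the study are not given, so `RST` and `NNK` are illustrative choices, and the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand(degenerate_codon):
    """All concrete codons covered by a three-letter degenerate codon."""
    choices = (IUPAC[base] for base in degenerate_codon.upper())
    return ["".join(codon) for codon in product(*choices)]

def encoded_residues(degenerate_codon):
    """Map each encoded amino acid ('*' = stop) to its number of codons."""
    tally = {}
    for codon in expand(degenerate_codon):
        aa = CODON_TABLE[codon]
        tally[aa] = tally.get(aa, 0) + 1
    return tally

rst = encoded_residues("RST")  # one codon each for T, S, A and G
nnk = encoded_residues("NNK")  # all 20 residues plus one stop, unevenly weighted
```

A codon such as `RST` gives each residue equal weight on its own, whereas `NNK` encodes some residues with three codons and others with one, which is the kind of imbalance that mixing primers at adjusted ratios can correct.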
Error-prone PCR library construction
To create the binder library, error-prone PCR was performed using the GeneMorph II Random Mutagenesis Kit (Agilent). The reaction conditions were chosen to give a mutation rate of 1–3 amino acids per gene. Once the libraries had been constructed, they were transformed into S. cerevisiae EBY100 cells.
Cell-based assay
TF-1 cell proliferation assay7
TF-1 cells were incubated with different concentrations of d-57445-evol-TrkA and NGF in the RPMI-1640 media containing 2% FBS for 48 h at 37 °C with 5% CO2. Cell proliferation was assessed by measuring cellular ATP level using ApoSENSOR™ Cell Viability Assay Kit (BioVision) according to the manufacturer’s protocol. Luminescent signal was measured by using a Thermo Varioskan LUX microplate reader, and the data were plotted and analyzed by Prism 8 (GraphPad).
Detection of the phosphorylation of Akt and Erk kinases
After starvation for 4 h, TF-1 cells were incubated with NGF (100 ng/mL) and treated with various binders at 37 °C for 10 min. The cells were then collected by centrifugation and lysed. Whole-cell lysates were analyzed by western blot to assess protein phosphorylation levels, using the PhosphoPlus Akt (Ser473) Antibody Duet kit and the PhosphoPlus p44/42 MAPK (Erk1/2) (Thr202/Tyr204) Antibody Duet kit, both from Cell Signaling Technology.
IL-6 signaling assay59
The production of human placental secreted alkaline phosphatase (SEAP) in cell culture medium was described previously.59 Briefly, 1 × 10⁴ HEK-293T cells were transfected with PhCMV-hIL-6R-pA (40 ng), PhCMV-hSTAT3-pA (40 ng) and PhSTAT3-SEAP-pA (20 ng) to produce SEAP in response to human IL-6. The transfected cells were incubated with different concentrations of d-25367-evol-IL-6 and human IL-6 in DMEM medium containing 10% FBS for 48 h at 37 °C with 5% CO2. IL-6 signaling was assessed by measuring the production of SEAP in the cell culture medium. The cell culture supernatant was heat-inactivated (65 °C for 30 min), and 80 µL supernatant was mixed with 120 µL of substrate solution (100 µL of 2× SEAP assay buffer containing 20 mM homoarginine, 1 mM MgCl2, 21% (v/v) diethanolamine (pH 9.8), and 20 µL of substrate solution containing 120 mM p-nitrophenylphosphate). Absorbance was recorded at 405 nm (37 °C) using a Synergy H1 hybrid multimode microplate reader (BioTek Instruments Inc.). The data were plotted and analyzed by Prism 8 (GraphPad).
Crystallography
Crystallization
Purified l-19437 (8 mg) was incubated with TEV protease (2 mg) at room temperature for 4 h to remove the His tag. The cleaved product was purified using a gravity flow column containing 1.5 mL Nickel Sepharose 6 Fast Flow resin (Cytiva Life Sciences) in TBS buffer (pH 7.4). l-19437 (3.5 mg) was incubated with d-Pep-1 (1.25 mg) and purified by SEC on a Superdex 75 Increase 10/300 column (Cytiva Life Sciences). The complex was concentrated to 8.1 mg/mL and subjected to crystal screening (hanging-drop method, with a protein/buffer ratio of 1:1) using a Mosquito crystallization robot (SPT Labtech). Prism-like crystals grew to full size after 45 days. The crystallization condition contained 20% (w/v) PEG 8000, 100 mM Tris-HCl (pH 8.5), 100 mM magnesium chloride and 20% (v/v) PEG 400. Crystals of d-19437 and l-Pep-1 were grown in the same crystallization condition.
Data collection and structure determination
Crystals were cryoprotected by addition of 15% glycerol. Diffraction data at 2.2 Å resolution were collected at 100 K using an in-house X-ray generator (XtaLAB Synergy Custom diffractometer, Rigaku). Crystals belonged to space group P2₁2₁2. Data were indexed, integrated and scaled using HKL-2000.60 Further processing was carried out with programs from the CCP4 suite.61 Data collection statistics are summarized in Supplementary information, Table S1. The MR solution was found by Phaser62 using the design model as the search model. Two copies of the designed heterochiral complex were identified in one asymmetric unit (with translational non-crystallographic symmetry).
Data availability
The crystal structure models have been deposited in the PDB under accession codes: 8GQP for d-19437–l-Pep-1; 7YH8 for l-19437–d-Pep-1. All data are available in the main text or the supplementary materials. Design scripts are available in Supplementary information, Dataset S1.
References
Muttenthaler, M., King, G. F., Adams, D. J. & Alewood, P. F. Trends in peptide drug discovery. Nat. Rev. Drug Discov. 20, 309–325 (2021).
Zhao, L. & Lu, W. Mirror image proteins. Curr. Opin. Chem. Biol. 22, 56–61 (2014).
Dong, S. et al. Recent advances in chemical protein synthesis: method developments and biological applications. Sci. China Chem. 67, 1060–1096 (2024).
Schumacher, T. N. et al. Identification of d-peptide ligands through mirror-image phage display. Science 271, 1854–1857 (1996).
Chang, H. N. et al. Blocking of the PD-1/PD-L1 Interaction by a d-peptide antagonist for cancer immunotherapy. Angew. Chem. Int. Ed. Engl. 54, 11760–11764 (2015).
Zhou, X. et al. A novel d-peptide identified by mirror-image phage display blocks TIGIT/PVR for cancer immunotherapy. Angew. Chem. Int. Ed. Engl. 59, 15114–15118 (2020).
Cao, L. et al. Design of protein-binding proteins from the target structure alone. Nature 605, 551–560 (2022).
Fleishman, S. J. et al. Computational design of proteins targeting the conserved stem region of influenza hemagglutinin. Science 332, 816–821 (2011).
Chevalier, A. et al. Massively parallel de novo protein design for targeted therapeutics. Nature 550, 74–79 (2017).
Strauch, E. M. et al. Computational design of trimeric influenza-neutralizing proteins targeting the hemagglutinin receptor binding site. Nat. Biotechnol. 35, 667–671 (2017).
Whitehead, T. A. et al. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat. Biotechnol. 30, 543–548 (2012).
Procko, E. et al. A computationally designed inhibitor of an Epstein-Barr viral Bcl-2 protein induces apoptosis in infected cells. Cell 157, 1644–1656 (2014).
Cao, L. et al. De novo design of picomolar SARS-CoV-2 miniprotein inhibitors. Science 370, 426–431 (2020).
Silva, D. A. et al. De novo design of potent and selective mimics of IL-2 and IL-15. Nature 565, 186–191 (2019).
Marinec, P. S. et al. A Non-immunogenic bivalent d-protein potently inhibits retinal vascularization and tumor growth. ACS Chem. Biol. 16, 548–556 (2021).
Mandal, K. et al. Chemical synthesis and X-ray structure of a heterochiral {d-protein antagonist plus vascular endothelial growth factor} protein complex by racemic crystallography. Proc. Natl. Acad. Sci. USA 109, 14779–14784 (2012).
Uppalapati, M. et al. A Potent d-protein antagonist of VEGF-A is nonimmunogenic, metabolically stable, and longer-circulating in vivo. ACS Chem. Biol. 11, 1058–1065 (2016).
Yang, W. et al. Computational design and optimization of novel d-peptide TNFalpha inhibitors. FEBS Lett. 593, 1292–1302 (2019).
Garton, M. et al. Method to generate highly stable d-amino acid analogs of bioactive helical peptides using a mirror image of the entire PDB. Proc. Natl. Acad. Sci. USA 115, 1505–1510 (2018).
Valiente, P. A. et al. Computational design of potent d-peptide inhibitors of SARS-CoV-2. J. Med. Chem. 64, 14955–14967 (2021).
Kreitler, D. F. et al. A hendecad motif is preferred for heterochiral coiled-coil formation. J. Am. Chem. Soc. 141, 1583–1592 (2019).
Welch, B. D., VanDemark, A. P., Heroux, A., Hill, C. P. & Kay, M. S. Potent d-peptide inhibitors of HIV-1 entry. Proc. Natl. Acad. Sci. USA 104, 16828–16833 (2007).
Lyamichev, V. I. et al. Stepwise evolution improves identification of diverse peptides binding to a protein target. Sci. Rep. 7, 12116 (2017).
Smith, A. R. et al. Characterization of resistance to a potent d-peptide HIV entry inhibitor. Retrovirology 16, 28 (2019).
Alford, R. F. et al. An integrated framework advancing membrane protein modeling and design. PLoS Comput. Biol. 11, e1004398 (2015).
Leaver-Fay, A. et al. ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol. 487, 545–574 (2011).
Dou, J. et al. De novo design of a fluorescence-activating beta-barrel. Nature 561, 485–491 (2018).
Silva, D. A., Correia, B. E. & Procko, E. Motif-driven design of protein-protein interfaces. Methods Mol. Biol. 1414, 285–304 (2016).
Berger, S. et al. Computationally designed high specificity inhibitors delineate the roles of BCL2 family proteins in cancer. Elife 5, e20352 (2016).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Wiesmann, C., Ultsch, M. H., Bass, S. H. & de Vos, A. M. Crystal structure of nerve growth factor in complex with the ligand-binding domain of the TrkA receptor. Nature 401, 184–188 (1999).
Boulanger, M. J., Chow, D. C., Brevnova, E. E. & Garcia, K. C. Hexameric structure and assembly of the interleukin-6/IL-6 alpha-receptor/gp130 complex. Science 300, 2101–2104 (2003).
Huang, P. S. et al. High thermodynamic stability of parametrically designed helical bundles. Science 346, 481–485 (2014).
Crick, F. H. C. The packing of α-helices: Simple coiled-coils. Acta Crystallogr. 6, 689–697 (1953).
Mortenson, D. E. et al. High-resolution structures of a heterochiral coiled coil. Proc. Natl. Acad. Sci. USA 112, 13144–13149 (2015).
Pauling, L. & Corey, R. B. Two rippled-sheet configurations of polypeptide chains, and a note about the pleated sheets. Proc. Natl. Acad. Sci. USA 39, 253–256 (1953).
Chothia, C. & Janin, J. Principles of protein-protein recognition. Nature 256, 705–708 (1975).
Zheng, J.-S. et al. Robust chemical synthesis of membrane proteins through a general method of removable backbone modification. J. Am. Chem. Soc. 138, 3553–3561 (2016).
Hartrampf, N. et al. Synthesis of proteins by automated flow chemistry. Science 368, 980–987 (2020).
Wang, Z., Xu, W., Liu, L. & Zhu, T. F. A synthetic molecular system capable of mirror-image genetic replication and transcription. Nat. Chem. 8, 698–704 (2016).
Xu, Y. & Zhu, T. F. Mirror-image T7 transcription of chirally inverted ribosomal and functional RNAs. Science 378, 405–412 (2022).
DiMaio, F. et al. Improved low-resolution crystallographic refinement with Phenix and Rosetta. Nat. Methods 10, 1102–1104 (2013).
Schneidman-Duhovny, D., Inbar, Y., Nussinov, R. & Wolfson, H. J. PatchDock and SymmDock: servers for rigid and symmetric docking. Nucleic Acids Res. 33, W363–W367 (2005).
Alford, R. F. et al. The Rosetta all-atom energy function for macromolecular modeling and design. J. Chem. Theory Comput. 13, 3031–3048 (2017).
Park, H. et al. Simultaneous optimization of biomolecular energy functions on features from small molecules and macromolecules. J. Chem. Theory Comput. 12, 6201–6212 (2016).
Hosseinzadeh, P. et al. Comprehensive computational design of ordered peptide macrocycles. Science 358, 1461–1466 (2017).
Bhardwaj, G. et al. Accurate de novo design of hyperstable constrained peptides. Nature 538, 329–335 (2016).
Hosseinzadeh, P. et al. Anchor extension: a structure-guided approach to design cyclic peptides targeting enzyme active sites. Nat. Commun. 12, 3384 (2021).
Mulligan, V. K. et al. Computationally designed peptide macrocycle inhibitors of New Delhi metallo-beta-lactamase 1. Proc. Natl. Acad. Sci. USA 118, e2012800118 (2021).
Mulligan, V. K. et al. Computational design of mixed chirality peptide macrocycles with internal symmetry. Protein Sci. 29, 2433–2445 (2020).
Flood, D. T. et al. Leveraging the Knorr Pyrazole Synthesis for the facile generation of thioester surrogates for use in native chemical ligation. Angew. Chem. Int. Ed. Engl. 57, 11634–11639 (2018).
Fang, G. M. et al. Protein chemical synthesis by ligation of peptide hydrazides. Angew. Chem. Int. Ed. Engl. 50, 7645–7649 (2011).
Wan, Q. & Danishefsky, S. J. Free-radical-based, specific desulfurization of cysteine: A powerful advance in the synthesis of polypeptides and glycopolypeptides. Angew. Chem. 119, 9408–9412 (2007).
Maity, S. K., Jbara, M., Laps, S. & Brik, A. Efficient palladium-assisted one-pot deprotection of (acetamidomethyl)cysteine following native chemical ligation and/or desulfurization to expedite chemical protein synthesis. Angew. Chem. Int. Ed. Engl. 55, 8108–8112 (2016).
Jbara, M., Eid, E. & Brik, A. Palladium mediated deallylation in fully aqueous conditions for native chemical ligation at aspartic and glutamic acid sites. Org. Biomol. Chem. 16, 4061–4064 (2018).
Hoover, D. M. & Lubkowski, J. DNAWorks: an automated method for designing oligonucleotides for PCR-based gene synthesis. Nucleic Acids Res. 30, e43 (2002).
Benatuil, L., Perez, J. M., Belk, J. & Hsieh, C. M. An improved yeast transformation method for the generation of very large human antibody libraries. Protein Eng. Des. Sel. 23, 155–159 (2010).
Chao, G. et al. Isolating and engineering human antibodies using yeast surface display. Nat. Protoc. 1, 755–768 (2006).
Bojar, D., Scheller, L., Hamri, G. C., Xie, M. & Fussenegger, M. Caffeine-inducible gene switches controlling experimental diabetes. Nat. Commun. 9, 2318 (2018).
Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326 (1997).
Collaborative Computational Project, Number 4. The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D Biol. Crystallogr. 50, 760–763 (1994).
Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221 (2010).
The PyMOL molecular graphics system, version 1.8 (Schrödinger, LLC., 2015).
Mól, A. R., Castro, M. S. & Fontes, W. NetWheels: A web application to create high quality peptide helical wheel and net projections. bioRxiv https://doi.org/10.1101/416347 (2018).
Acknowledgements
We would like to acknowledge Drs. Ting Zhu and Hongtao Yu for critical discussion; Drs. Zhizhi Wang, Shilong Fan and Jiawei Wang for assistance in structure determination; Dr. Peng Cao for providing the TF-1 cell line; Drs. Jiawei Shao and Mingqi Xie for kindly providing the materials for the IL-6 cell-based assay. We would like to thank the Mass Spectrometry & Metabolomics Core Facility of Westlake University for sample analysis; the Westlake University HPC Center for computation assistance; and the Protein Characterization and Crystallography Facility of Westlake University for help in sample analysis. This work was funded by Ministry of Science and Technology of the People’s Republic of China (2020YFA0909200 to P.L.); Zhejiang Provincial Natural Science Foundation of China (LR23C050001 to P.L.), “Pioneer” and “Leading Goose” R&D Program of Zhejiang (2024SSYS0036), the National Natural Science Foundation of China (22137005 to L.L. and P.L.), China Postdoctoral Science Foundation project (2021M692883 to B.Z.), the Fellowship of Zhejiang Province Postdoctoral Science Foundation (ZJ2022006) and the Research Center for Industries of the Future (RCIF) at Westlake University.
Author information
Contributions
P.L. and L.L. conceived and supervised the project; K.S., S.L., B.Z., Y.Z. and T.W. contributed equally to this work; K.S. developed the computational method and designed the d-protein binders with help from L.C.; S.L., B.Z. and Y.Z. performed the refolding, yeast screening, and binder evaluation experiments for Pep-1, TrkA and IL-6, respectively, with help from K.Z., J. Zhang, and H.L.; T.W. chemically synthesized the d-targets and d-protein binders with help from D.H. and J. Zheng; S.L. prepared protein crystals; M.L. solved the crystal structure; Y.Y. performed cell-based assays; L.C., B.C. and D.B. shared the method for the design of homochiral protein complexes before its publication; P.L. wrote the original draft with all authors participating in manuscript revision.
Ethics declarations
Competing interests
K.S., S.L., Y.Z., B.Z., T.W., L.L. and P.L. are inventors on a provisional patent application submitted by Westlake University for the function of the d-proteins in this study.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sun, K., Li, S., Zheng, B. et al. Accurate de novo design of heterochiral protein–protein interactions. Cell Res (2024). https://doi.org/10.1038/s41422-024-01014-2
DOI: https://doi.org/10.1038/s41422-024-01014-2