Accelerating the prediction and discovery of peptide hydrogels with human-in-the-loop

Xu, Tengyan; Wang, Jiaqi; Zhao, Shuang; Chen, Dinghao; Zhang, Hongyue; Fang, Yu; Kong, Nan; Zhou, Ziao; Li, Wenbin; Wang, Huaimin

doi:10.1038/s41467-023-39648-2

Download PDF

Article
Open access
Published: 30 June 2023

Accelerating the prediction and discovery of peptide hydrogels with human-in-the-loop

Tengyan Xu^1,2^na1,
Jiaqi Wang^3,4,5^na1,
Shuang Zhao⁵^na1,
Dinghao Chen¹^na1,
Hongyue Zhang¹,
Yu Fang ORCID: orcid.org/0000-0002-7536-7099¹,
Nan Kong¹,
Ziao Zhou¹,
Wenbin Li ORCID: orcid.org/0000-0002-1240-2707^3,4,5 &
…
Huaimin Wang ORCID: orcid.org/0000-0002-8796-0367^1,2,3

Nature Communications volume 14, Article number: 3880 (2023) Cite this article

5283 Accesses
20 Citations
2 Altmetric
Metrics details

Subjects

Abstract

The amino acid sequences of peptides determine their self-assembling properties. Accurate prediction of peptidic hydrogel formation, however, remains a challenging task. This work describes an interactive approach involving the mutual information exchange between experiment and machine learning for robust prediction and design of (tetra)peptide hydrogels. We chemically synthesize more than 160 natural tetrapeptides and evaluate their hydrogel-forming ability, and then employ machine learning-experiment iterative loops to improve the accuracy of the gelation prediction. We construct a score function coupling the aggregation propensity, hydrophobicity, and gelation corrector C_g, and generate an 8,000-sequence library, within which the success rate of predicting hydrogel formation reaches 87.1%. Notably, the de novo-designed peptide hydrogel selected from this work boosts the immune response of the receptor binding domain of SARS-CoV-2 in the mice model. Our approach taps into the potential of machine learning for predicting peptide hydrogelator and significantly expands the scope of natural peptide hydrogels.

Machine learning overcomes human bias in the discovery of self-assembling peptides

Article 31 October 2022

Developing a machine learning model for accurate nucleoside hydrogels prediction based on descriptors

Article Open access 23 March 2024

Predicting the stability of homotrimeric and heterotrimeric collagen helices

Article 15 February 2021

Introduction

Hydrogel, an important class of soft materials, is formed from a self-assembled matrix that immobilizes water. Hydrogels have attracted increasing attention in various research fields because they mimic properties in natural systems such as the bodies of jellyfish, the cornea in the eye, and even the condensed chromatins in the cell nucleus^1,2. Inspired by natural self-assembled functional materials (high-order assemblies of proteins), considerable attention has been focused on hydrogels formed by peptides because of their high biocompatibility^3,4,5,6,7, low immunogenicity^8,9,10, and similarity to the extracellular matrix^11,12,13,14. To date, peptidic hydrogels have been widely used in materials science^15,16,17,18, biomedicine^19,20,21,22, and semiconductors^23,24,25. However, the current design capability fails to meet the growing demand for neoteric peptidic hydrogels since the existing inefficient methods still rely on amino acid sequences that derive from natural proteins, professional experience in the peptide field, or laboratory discoveries by serendipity^26,27,28. Therefore, accurate prediction of hydrogel formation and de novo design of peptidic hydrogels emerge as of great significance to broaden the available hydrogel-forming peptide library.

To better understand the self-assembly behaviors of peptides in forming hydrogels and the resulting morphologies, coarse-grained molecular dynamics (CGMD) has been employed to model peptide self-assembly^29,30,31,32. Ulijn and Tuttle’s groups recently developed a useful approach to provide valuable design rules for overcoming the limitation of serendipity in discovering aggregation or self-assembly in dipeptide and tripeptide systems^33,34. However, molecular dynamics (MD) simulations of selected peptides could only give the information (e.g., aggregation propensity, acronymized as AP) for predicting new peptides that derive from the original ones. Importantly, due to the enormous sequence quantities of peptides, brute-force MD is becoming increasingly intractable for investigating the hydrogel formation ability of longer-chain peptides^33,35,36. To the best of our knowledge, systematic studies on peptidic hydrogel prediction and de novo design are less explored and remain challenging^26,37.

This work provides an integrated computational, experimental, and machine learning (ML) approach to build a score function for discovering tetrapeptides for hydrogelation with an improved hit rate. Tetrapeptides have sufficient structural and sequence diversity for developing a peptide hydrogel library with ample candidates, while requiring a moderate workload of simulation for generating training data. This approach proceeds as follows, firstly, the computation adopts CGMD and ML-trained regression model to provide an estimation of AP (Fig. 1a). Based on the original score function AP_H³³, 55 peptides are selected and chemically synthesized (Fig. 1b) for verification of gelation. With the resulting gelation feasibility (i.e., yes or no), a classification model is trained to produce the gelation corrector C_g fed to the original score function. An updated score function is then devised as AP_HC (Fig. 1b). The process above is looped three times with mutual information exchange between ML and experimental results (Fig. 1b) to enhance the performance of C_g from experimental results of 165 peptides, 100 of which could form hydrogels after gelation tests. Finally, tetrapeptide hydrogels obtained by de novo design from our computational model are selected as immune adjuvants to boost humoral immune recognition towards the receptor binding domain (RBD) of SARS-CoV-2 virus (Fig. 1c). The results show that the selected tetrapeptide hydrogel boosts the immune response of a model protein RBD from the spike protein of coronavirus. Overall, an 8,000-peptide library for gelation is built based on AP_HC with a gelation rate reaching 87.1% (Supplementary Data 2), providing great potential for further innovations in peptide-based soft materials.

**Fig. 1: Workflow of coupled experimental and machine learning approach for discovering tetrapeptide hydrogels and their potential biological applications.**

Results

Performance of corrected score function AP_HC

We employed cost-effective ML prediction instead of performing brute-force CGMD for generating the AP values of the entire space of tetrapeptides containing 160,000 sequences. Therefore, accurate prediction of AP values relying on ML regression models should be a prerequisite for obtaining potential hydrogels. We tested various training conditions, including training algorithms, feature representation approaches, and the size of training datasets to obtain an optimal AI model (Supplementary Figs. 2–4, Supplementary Tables 2–4, and Supplementary Data 1). Using the algorithm of support vector machine (SVM)³⁸ with 10,000 training data represented by 80-bit one-hot approach with amino acid sequence (Supplementary Table 3), we obtained a reliable SVM model with training/testing performance of 0.095/0.092 in mean absolute difference (MAE_tr/MAE_te) and 0.928/0.933 in coefficient of determination (R²_tr/R²_te)³⁹ (Fig. 2a, b). Further analysis of the prediction performance of SVM model revealed that the error between the predicted AP (AP_prd) and simulated AP (AP_sim) was less than 2.5% as AP_sim was greater than 1.5 (Fig. 2c), proving the reliability and capability of the selected model on predicting peptide aggregates and further formation of hydrogels.

**Fig. 2: “Human-in-the-loop” for obtaining corrected score AP_HC.**

Distinctive from all available score functions focusing on the prediction of peptide self-assembly³³, we constructed a corrected score function AP_HC within three loops (Fig. 2d–f) for improving the gelation hit rate. Since the final goal was to develop a hydrogel-forming peptide library with the minimum candidate numbers and the highest gelation possibility, we constrained our gelation hit rate assessment within the top 8000 assessing scores (AP_H and AP_HC). We calculated AP_H (Fig. 1b) in the first loop and randomly selected 55 peptides (26 peptides that were among the top 8000 in the AP_H ranking), which were possibly to form hydrogel according to human expertise. It was found that 16 among the 26 peptides (within the top 8000) could form a hydrogel, and a corresponding gelation hit rate of 61.5% could be achieved with the AP_H score, while a similar hit rate of 63% can be achieved with the AP_prd score alone (Fig. 2d, left panel). With the total 55 gelation results, we trained a classification model to generate the gelation corrector C_g,1 with an averaged accuracy of 0.735 (averaged over ten parallel ML experiments, Fig. 2d, right panel). During the second loop, we calculated AP_HC,1 (AP_HC,1 = AP_prd² × logP^0.5 × C_g,1) and selected another 55 peptides, 30 of which were in the top 8000 AP_HC,1, and 23 peptides (of the 30 peptides) formed hydrogels, resulting in a hit rate of 76.7% within the top 8000 AP_HC,1 pool, while the AP_H score yielded only a gelation hit rate of 64.7% (Fig. 2e, left panel). Augmenting the gelation results from the second batch to the first batch (total 110 data), we retrained a classification model to update the gelation corrector from C_g,1 to C_g,2 with an average accuracy of 0.746 (Fig. 2e, right panel). Proceeding to the third loop, we updated the AP_HC,1 to AP_HC,2 (AP_HC2 = AP_prd² × logP^0.5 × C_g,2). Similar to the previous two loops, we selected 55 peptides, and a gelation hit rate of 81.6% (31 out of 38) was generated within the top 8000 AP_HC,2 and a rate of 75.0% was achieved with AP_HC,1 alone (Fig. 2f, left panel). With a total of 165 experimental gelation results, a final classification model was trained and produced gelation corrector C_g,3 with an averaged accuracy of 0.767 (Fig. 2f, right panel) and score AP_HC,3 (AP_HC,3 = AP_HC = AP_prd² × logP^0.5 × C_g,3) for each peptide, and a gelation hit rate of 87.1% was finally achieved with the top 8,000 AP_HC,3 (Fig. 2g), while the AP_prd and AP_H could only produce a gelation hit rate around 66% (Fig. 2h) based on the 165 gelation results. We listed the top 8,000 C_g and AP_HC (Supplementary Data 2 and 3) peptides for the convenience of selection and comparison.

To further differentiate between AP_HC and AP_H in predicting peptide hydrogels, we next compared the relationship between AP_HC and logP’ (Fig. 2i) as well as AP_H and logP’ (Fig. 2j) of experimentally synthesized 165 peptides that were marked with blue (gelation: yes) or red (gelation: no) dots, and those of total tetrapeptides (gray dots). Here, logP’ indicated normalized hydrophilicity between 0 and 1. In addition to the relationship between AP_HC and logP’, the relationship between AP_HC-AP and AP_HC-C_g was also investigated (Supplementary Fig. 6a, b). No linear correlation for AP_HC and logP’ (also AP_HC and AP) can be observed, demonstrating that hydrophobicity and aggregation propensity were not the only two contributors to gelation, for instance, lower isoelectric points (i.e., 4.5 ~ 6 on pH scale) could improve the gelation performance (Supplementary Fig. 6c) due to the Columbic interaction and hydrogen bonds, inducing the formation of water-containing networks between deprotonated peptides and water solvent. These results indicated the significance of cooperating experimental input (i.e., C_g) into a prediction of hydrogel-forming sequences. Furthermore, it was conducive for hydrogelation when logP’ was in the range of 0.05 to 0.4, as evidenced that the logP’ of all gelating peptides were in this range (Fig. 2i). Peptides with too weak hydrophilicity (<0.05) possibly form precipitates while ones with too strong hydrophilicity (>0.4) maintain in solution. The AP_H also assigned high scores to peptides with logP’ in the range of 0.05 to 0.4. However, AP_H cannot efficiently pinpoint peptides with high gelation potential and low AP_prd. As a result, more gelation peptides fall out of top 8000 compared to AP_HC (Fig. 2j). We have also compared the ranks of AP_HC and AP_H of the complete sequence space of tetrapeptides (Fig. 2k). AP_HC can significantly increase the rank of peptides which could potentially form hydrogels (maximum absolute difference in rank between AP_HC and AP_H is 8.5 × 10⁴), such as WVII (by 14311) and IMVV (by 57146), while decreasing the rank of peptides that hardly form the hydrogel, such as WPYY (by 33033) and WWCP (by 50677). These four peptides were synthesized, validating that WVII and IMVV can form hydrogel while WPYY and WWCP cannot (Fig. 2l, Supplementary Data 6 and 7).

Discovery and characterization of peptide hydrogels

After validating the efficiency of AP_HC in predicting tetrapeptide hydrogels, we detailed the phase state of 165 synthesized tetrapeptides and the observed assembly behavior in an aqueous solution. Having demonstrated the identity of each synthetic tetrapeptides by mass spectrometry (MS, Supplementary Data 4) and nuclear magnetic resonance spectroscopy (NMR, Supplementary Data 5), we defined the hydrogel as the formation of a self-supporting, non-flowing mixture of water and hydrogelator through the vial-inverting method. Figure 3a (Insert optical images) showed the 6 representative tetrapeptides (FVIY, WEFF, WKFF, WTIF, WVFY, and IFYT) hydrogels in the glass vial, probably due to the π-π interaction of more than two aromatic amino acids in the tetrapeptide. Transmission electron microscope (TEM) studies (Fig. 3a and Supplementary Data 6) showed that the hydrogel formed by FVIY, WEFF, WKFF, or WTIF contained entangled nanofibers, while the hydrogel formed by WVFY or IFYT contained interlaced nanosheets. MD simulations (1250 ns) confirmed the observation of TEM results, and the front ranking of AP_HC demonstrated the formation of these hydrogels. Mechanical properties of tetrapeptide hydrogels (Fig. 3b and Supplementary Fig. 7) indicated that both the elasticity (G’) and the viscosity (G”) exhibited weak frequency dependence between 0.01 and 100 Hz. The G’ values were higher than G” values, suggesting the formation of a hydrogel. Fourier transforms infrared (FTIR) spectroscopy (Fig. 3c) in the amide I region (1620–1648 cm⁻¹, C = O stretching vibration) revealed the presence of β-sheet conformation in all these six hydrogels, further indicating the presence of highly ordered peptide nanostructures.

**Fig. 3: Experimental investigations on the self-assembly behavior of 165 synthetic tetrapeptides.**

We also paid attention to those non-hydrogel-forming tetrapeptides to obtain rules of sequences of non-gelating peptides. Figure 3d (Insert optical images) showed six representative tetrapeptides (TRFS, PGWW, FGWW, RRRR, LRFH, and RWVF) with low AP_HC ranking, some of which were highly soluble while others were barely soluble in water. TEM images (Fig. 3d and Supplementary Data 6) showed that these six peptides formed aggregates with different sizes in an aqueous solution, which were qualitatively consistent with the morphologies obtained in MD simulations except for RRRR (Fig. 3d, right column), showing different levels of aggregation. Taking the TEM result of RRRR together, we attributed this to the thermodynamic factor of concentration. Finally, we presented a summary of the assembled morphologies of all synthesized tetrapeptides (Fig. 3e, f and Supplementary Data 6), indicating that hydrogel-forming tetrapeptides tended to form fibers, sheets, or hybrid morphology (70%) in an aqueous solution. Non-hydrogel-forming tetrapeptides self-assembled into aggregates, spheres, or particulate supramolecular structures (86%). The results above confirmed the self-assembled nanostructures and hydrogelation results of these synthetic tetrapeptides, as predicted by the corrected score function AP_HC.

Hydrogelation laws from experiment and simulation results

One hundred and sixty-five synthesized peptides were presented with different colors indicating the capabilities of hydrogel formation (Fig. 4a). The average rank of AP_HC (Fig. 4b) for peptides gelation at four certain concentrations was 2664, 2801, 3646, and 4899, respectively, which was consistent with the experimental results (Supplementary Data 7) of the gelation capability, demonstrating the reliability of AP_HC in screening tetrapeptides for the hydrogel formation.

**Fig. 4: Hydrogelation laws from experiments and simulations.**

Hydrogelation laws (i.e., the effect of position and type of amino acids on gelation) deduced from the experimentally synthesized peptides gelators (100 data) and computationally selected candidates (top 8000 data based on AP_HC) exhibited reasonable consistency (Fig. 4c). Aromatic amino acids (F and Y) had the largest contribution to gelation, especially when located at positions 3 and 4 near the C-terminus. The W had a much lower contribution due to the strong hydrophobicity, which may lead to suspension with precipitation instead of forming a hydrogel. The H amino acid with a five-membered ring structure was favored in position 3 in gelation peptides. Second to F, W, and Y, the amino acids I, L, V, and M can also contribute to gelation due to hydrophobicity carried by side chains. The simulation results only slightly increased the percentage of I, L, V, and M at positions 2 and 3, while the experimental results raised the percentage of I, L at positions 1 and 3 and V at position 2. The contribution of the polar amino acids N, Q, S, T, and C to gelation was identical in both the experiment and simulation. N and Q with strong polarity were rarely found in gelation peptides with scarce occupancy at position 1 or 2. S and T with moderate polarity were beneficial for gelation when S was located at positions 1, 2, and 4 and T at 1, 2. Apolar amino acid C contains the -SH group, which may induce the formation of disulfide bonds and stable nanostructures, especially when located at position 1. Amino acid P contributed to the hydrogel formation when located at position 1 because of the potential formation of the “kink” structure^33,40, promoting self-assembly. Meanwhile, G without functional side chains cannot significantly contribute to gelation. Charged amino acids D, E, R, and K had a minimal contribution to gelation. However, peptides with K near the N-terminus were found to form hydrogel due to the attraction of opposite charges driving self-assembly.

To gain an overview of the effects of position-type on gelation, we analyzed the AP_HC scores of the complete space sequence of tetrapeptides with fixed position 4 (fixed F, Fig. 4d; fixed remaining 19 amino acids, Supplementary Figs. 8–26). Different from Fig. 4c, we can discern the effect of doublets and triplets of amino acids on gelation, other than a single amino acid. It can be confirmed that aromatic-aromatic (F, W, Y – F, W, Y) and aromatic-hydrophobic (F, W, Y – I, L, M, V) doublets had positive effects on gelation synergistically. In addition, aromatic amino acids bonded with P and K exhibited similar positive performance. These rules can also be applied to the triplets. In addition, we analyzed the position-type percentage with respect to adjacent amino acids, based on the 100 hydrogel-forming peptides in the experiment and 8000 peptides with the highest AP_HC score in the simulation (Supplementary Fig. 27). It can also be deduced that aromatic-aromatic and aromatic-hydrophobic doublets have the most significant contribution to hydrogelation, and position-specific rules regarding other amino acids are also congruent with those deduced from Fig. 4c, d. For example, amino acid A is barely found in the fourth position, except when F or Y is located in the third position. In summary, we have presented a complete picture of the relationship between the gelation ability and position & type of 20 natural amino acids, providing schematic guidance for experimentalists to design tetrapeptide hydrogels and possibly functional applications associated.

Boosting antibody production of RBD vaccine

The advantage of self-assembling peptide materials is their remarkable multivalency, which contributes to improved immunogenicity. It is generally known that multivalency can repeatedly display ligands or epitopes to increase affinity for specific receptors while enhancing antibody responses^8,9,41,42,43. The RBD of the spike protein covering the surface of SARS-CoV-2 attracted our interest as a promising target antigen for COVID-19 vaccines^44,45,46,47. We hypothesized that tetrapeptide hydrogel could provide a biodegradable platform to encapsulate RBD protein and enhance humoral immune responses against RBD protein. Since YAWF has a AP_HC rank within the top 8000 (1661, Fig. 4a) and can form hydrogels containing nanofibrous network (Supplementary Fig. 28), we selected this tetrapeptide as a vaccine adjuvant candidate. We quantified the production of antigen-specific antibodies in C57BL/6 mice, which was a crucial indicator for evaluating the performance of the SARS-CoV-2 vaccine (Fig. 5a). Compared with the RBD group, the results (Fig. 5b) showed that the FDA (U.S. Food and Drug Administration) approved adjuvant aluminum could enhance the generation of IgG by 20.7-fold. The hydrogel formed by YAWF remarkably increased the generation of IgG by 41.6-fold (the endpoint titres of RBD, aluminum, and YAWF were shown in Supplementary Fig. 29), suggesting that the tetrapeptide hydrogel could boost the immune response in vivo. The results also indicated that the hydrogel group significantly enhanced the production of IgG1, which was similar to the aluminum group. The RBD-specific IgG2b response in the hydrogel group increased around 9.7-fold, compared with the commercial aluminum adjuvant group. As for IgG2c, the hydrogel formed by YAWF maintained high IgG2c titers, surpassing the ones in the aluminum group or control group (Fig. 5b and Supplementary Fig. 29).

**Fig. 5: Response to tetrapeptide-based hydrogel nano vaccination.**

During the infection, the regular pathway to produce IgG antibody is highly related to the proliferation of SARS-CoV-2-specific CD8⁺ or CD4⁺ T cells, which is reflected by the elevated secretion of several typical cytokines, including interleukin-5 (IL-5) and interferon-γ (IFN-γ). Compared with the aluminum adjuvant group, the mice that received YAWF based vaccine showed a higher IL-5 level in their splenocytes culture, and IFN-γ secretion was also obviously evoked (Fig. 5c). Thus, the YAWF stimulated an obvious cell-dependent adaptive immune response. To further confirm the capability of the tetrapeptide vaccine to regulate related cell immunity, the upstream dendritic cells (DCs) activation enhanced by tetrapeptide hydrogel was evaluated. The DCs treated with YAWF vaccine showed promising activation as the percentage of CD83, CD80, and CD86 expressing cells augmented to 72.0%, 71.1%, and 50.5% (Fig. 5e and Supplementary Fig. 30). Such intense activation could also be proved by the clustering of DCs (Fig. 5d) producing raised levels of Th-1 cytokines (Fig. 5f). To sum up, de novo designed tetrapeptide hydrogels as immune adjuvant successfully enhanced the immune response to RBD protein in vivo, providing great inspiration for us to explore natural tetrapeptide hydrogel library for biomedical applications.

Discussion

This work demonstrated an efficient “human-in-the-loop” framework that integrated coarse-grained molecular dynamics, machine learning, and experiments for the prediction and discovery of peptide hydrogels. The framework evolved into an updated score function AP_HC to evaluate the hydrogelation feasibility of 160,000 natural tetrapeptides, and a gelation hit rate of 87.1% with the top 8,000 AP_HC rank was achieved. The simulation and experiment revealed similar hydrogelation laws for short peptide design. Subsequently, a de novo-designed tetrapeptide hydrogel based on our hydrogelation laws was successfully applied in SARS-CoV-2 vaccine adjuvant, proving the potentials of the peptide libraries within the top 8,000 AP_HC rank for developing versatile biological and medical applications. Moving forward, the “human-in-the-loop” framework can be further automated by employing a robotic platform for synthesizing new peptides and performing machine learning for training classification models. The framework described here can also be extended to the efficient design of other functional materials/devices, including the terminal-covered peptide hydrogels, peptide batteries, peptide fluorescence probes, and peptide semiconductors, contributing to modern organic nanotechnology employing short peptide building blocks as key structural and functional elements.

Methods

Ethical approval

All mice were handled in accordance with institutional guidelines, and all animal experiments were approved by the Institutional Animal Care and Use Committee (IACUC) of Westlake University (IACUC Protocol #21-046-WHM).

Material sources

Fmoc-amino acids were obtained from GL Biochem (Shanghai, China). 2-Cl-trityl chloride resin was obtained from Nankai Resin Co. Ltd. (Tianjin, China). Commercially available reagents were used without further purification unless noted otherwise. Deionized water was used for all experiments. All other chemicals were reagent grade or better. Horseradish peroxidase-conjugated goat anti-mouse IgG, IgG1, IgG2b, and IgG2c (1030-05, 1071-05, 1091-05, and 1078-05, 1:5000 dilution) were obtained from Southern Biotech (USA). Mouse IFN-γ, IL-5, TNF-α and IL-6 ELISA kits (430807, 431204, 430907, and 431307) were obtained from Biolegend (USA). Recombinant murine GM-CSF and IL-4 (315-03 and 214-14) were purchased from Peprotech (USA). FITC anti-mouse CD83 Antibody, FITC anti-mouse CD80 Antibody, and PE anti-mouse CD86 Antibody (121505, 104705, and 159204, 0.5 µg per million cells in 100 µL volume for usage) were purchased from Biolegend (USA). Imject Alum Adjuvant was obtained from Thermofisher (USA). RBD-Fc (Z03513, SARS-CoV-2 Spike protein, RBD, mFc Tag, CHO-expressed) and RBD-H (Z03479, SARS-CoV-2 Spike protein, RBD, His Tag,) were purchased from Genscript Biotech (Nanjing, China). This research followed institutional guidelines, and all animal experiments were approved by the Institutional Animal Care and Use Committee of Westlake University.

Coarse-grained molecular dynamics (CGMD)

To speed up the simulation/screening, coarse-grained molecular dynamics (CGMD) simulations were adopted to generate machine learning (ML) training data, which were performed with the open-source GROMACS package^48,49 and Martini2 force field^50,51,52. The all-atom tetrapeptide structures (prepared based on CHARMM36⁵³) were coarse-grained using the python script martinize.py⁵¹. In simulations for screening purposes, total of 300 coarse-grained tetrapeptides (as zwitterions) were solvated randomly in a 13 nm × 13 nm × 13 nm box with water whose density was set as approximately 1 g/cm³ (~18700 water beads). The charge of the tetrapeptide/water system was maintained neutral by adding the proper amount of Na⁺ or Cl^-, and the system was also maintained at neutral pH. The whole system was then energy-minimized using the steepest descent algorithm⁵⁴, until the maximum force on each atom was less than 20 kJ mol^-1 nm^-1. Subsequently, the system was passed to an equilibration run for 5 × 10⁶ steps with a time step of 25 fs, resulting in a total simulation time of 125 ns. The temperature and pressure during the equilibration were controlled through Berendsen algorithm at 300 K and 1 bar, respectively. A total of 15,000 such simulations were performed, and the selection of the initial 15,000 tetrapeptides was based on Latin hypercubic sampling⁵⁵. To obtain more accurate and stable morphology of self-assembled structure, a 1,250 ns duration was employed, and the final morphology results were averaged over 8 identical simulations.

To quantitatively characterize the degree of self-assembly, we adopted the aggregation propensity (AP) value³³, which was calculated by:

$${{{{{\rm{AP}}}}}}=\frac{{{{{{{\rm{SASA}}}}}}}_{{{{{{\rm{initial}}}}}}}}{{{{{{{\rm{SASA}}}}}}}_{{{{{{\rm{final}}}}}}}}$$

(1)

Where the SASA_initial and SASA_final are the solvent (i.e., tetrapeptides) accessible surface area at the beginning and end of a CGMD equilibration run.

Self-assembled peptides cannot guarantee the formation of hydrogels, which was also affected by the hydrophobicity of peptides. Therefore, a hydrogel formation score function AP_HC considering hydrophobicity was utilized to screen out the peptides with the highest possibility of hydrogel formation under current experimental/computational conditions, as shown below:

$${{{{{{\rm{A}}}}}}{{{{{\rm{P}}}}}}}_{{{{{{\rm{HC}}}}}}}={{{{{\rm{AP}}}}}}{{\hbox{'}}}{{\times }}{{\log }}{{{{{\rm{P}}}}}}{{\hbox{'}}}{{\times }}{C}_{g}$$

(2)

$${{\log }}{{{{{\rm{P}}}}}}=\mathop{\sum }\limits_{i=1}^{4}\triangle {G}_{{{{{{\rm{wat}}}}}}-{{{{{\rm{oct}}}}}},i}$$

(3)

Where the AP’ and the logP’ are normalized AP and logP value (normalized to 1), and α (=2) and β (=0.5) are two coefficients determining the significance of AP’ and the logP’. ∆G_{wat-oct, i} (kcal mol^-1) is the Wimley–White whole-residue hydrophobicities for each amino acid⁵⁶. C_g is the gelation corrector output by the ML classification model trained with experimental gelation results.

Machine learning

Four different ML algorithms were deployed: Random Forest (RF)⁵⁷, Linear Regression (LR)⁵⁸, Nearest Neighbor (NN)⁵⁹, and Support Vector Machine (SVM)⁶⁰. Mean absolute error (MAE) and coefficient of determination (R²)³⁹ were calculated to assess the performance of each ML model. Different numbers of training data sets prepared by Excel 2019 (i.e., 1000, 5000, and 10,000) were used to train the ML models. In each training, 80% of the training data was used for training, while the remaining 20% was used for validation (Fig. 1b). After obtaining each model, another 5000 data were employed for independent testing.

Before training the ML model, we converted the amino acid sequence into numerical data with 4-integer and 80 bit one-hot representation approaches (shown in Supplementary Table 3, taking Glu-His-Asn-Thr, i.e., EHNT, as an example), aiming for enhanced model performance with optimal data presentation approach. Moreover, a tetrapeptide can be considered as a “tripeptide” with each “position” represented by one of the 400 possible dipeptide sequences, namely, a tetrapeptide can have 1200 possible bits with 3 of them to be 1. Therefore, we also trained models with a 1200-bit one-hot representation converted from the dipeptide sequence composition.

All the model training and prediction were conducted via ASCENDS code⁶¹, which employs a Python-based open-source data analytic toolkit, scikit-learn⁶². The training was initially performed based on the default hyperparameter settings in ASCENDS (Supplementary Table 5). To investigate the effect of hyperparameters on training performance, we selected three kernels and different parameters and ranges for tuning (Supplementary Table 6)⁶³. The performance of the SVM model with varying hyperparameters were illustrated in Supplementary Fig. 5. The highest training performance of MAE_tr was 0.090 and R²_tr was 0.934, as kernel = rbf, C = 100, and gamma = 0.001. However, it was only slightly increased compared to the generated training performance (MAE_tr = 0.095, R²_tr = 0.928, as shown in Fig. 2b) with the default hyperparameters (kernel = rbf, C = 1, and gamma = auto, equaling to 1/n_features = 1/80 = 0.0125). The slightly increased training performance would have minimal impact on the prediction, and we thus concluded that the default hyperparameters were good enough for achieving reliable prediction results.

Synthesis, purification, and characterization of tetrapeptides

The selected tetrapeptides were synthesized by solid-phase peptide synthesis (SPPS) using 2-chlorotrityl chloride resin, and the side chains of the corresponding N-Fmoc protected amino acids were properly protected by different chemical groups (Supplementary Fig. 1). First, the C-terminal of the first amino acid was conjugated to the resin. Anhydrous N, N’-dimethyl formamide (DMF) containing 20% piperidine was used to remove Fmoc group. To couple the next amino acid to the free amino group, HBTU (O-(Benzotriazol-1-yl)-N, N, N’, N’-tetramethyluronium hexafluorophosphate) was used as coupling reagent and the organic base N, N-diisopropylethylamine (DIPEA) was added. The growth of the peptide chain was performed according to the established Fmoc SPPS protocol. After the final coupling step, the excess reagent was rinsed with DMF, followed by five washing steps using dichloromethane (DCM) for 1 min (5 mL per gram of resin). The peptide was cleaved using cleavage reagent (trifluoroacetic acid (TFA): triisopropylsilane (TIS): H₂O = 95%: 2.5%: 2.5) for 45 minutes. 20 mL per gram of resin of ice-cold diethyl ether was then added to the concentrated cleavage reagent. The resulting precipitate was centrifuged at 1500 g for 10 minutes at 4 °C. The supernatant was then decanted, and the resulting solid was dissolved in H₂O/CH₃CN (1:1) for HPLC separation. HPLC was conducted at Agilent 1260 Infinity II Manual Preparative Liquid Chromatography system using a C18 RP column with CH₃CN (0.1% of trifluoroacetic acid) and water (0.1% of trifluoroacetic acid) as the eluents (Supplementary Table 1). The purity of each tetrapeptide was verified by HPLC, and the purified tetrapeptide was dissolved in 200 μL of 0.1 mg/mL methanol to prepare Mass spectrometry (MS) samples. MS was conducted at the Agilent InfinityLab LC/MSD system with MSD signal set as positive ion mode. NMR samples were prepared by dissolving purified tetrapeptides in 600 μL of 8 mg/mL DMSO-d6. ¹H NMR spectra were obtained on a Bruker BioSpin AVANCE NEO spectrometer (500 MHz, Switzerland), using tetramethyl silane as an internal standard. The structure of tetrapeptide was verified by MS and ¹H NMR.

Transmission electron microscope

We used the negative staining technique to observe the morphologies formed by peptides. A micropipette was used to load 10 μL of sample solution to a carbon-coated copper grid, and we used a piece of filter paper to remove the excess solution. After rinsing the grid with the deionized water, we used uranyl acetate to stain the sample for 1 minute and then rinsed the grid with deionized water again. The excess liquid was drained with filter paper and conducted on a Talos L120C system, operating at 120 kV.

Peptide hydrogel formation

This work defines hydrogel formation as a self-supporting, non-flowing mixture of water and hydrogelator by the vial-inverting method. All purified tetrapeptides were dissolved in ultrapure water (to form a 30 mM solution initially), followed by the stepwise addition of 1 N NaOH solution to adjust the overall aqueous pH to 6.5–8.5. Meanwhile, a short-term ultrasonication treatment was applied after each pH adjustment to facilitate peptide solubilization and speed up peptide self-assembly. These operations could be repeated several times until a viscous, translucent colloid was formed, suggesting the initial stage of the gelation process. The mixture was allowed to stand in for another 48 h for complete hydrogelation. Upon the absence of gelation phenomenon during the aforementioned loops, we increased the peptide concentration (60 mM, 90 mM, and 120 mM were selected) and re-started such gelation operation loop to explore their gelation feasibility.

Rheology

During the experiment, a rheology test was carried out on an ARES-G2 (TA instrument) system with a 25 mm parallel plate at a 500 μm gap during the experiment. In the process of dynamic frequency (strain = 0.5%) scanning, the obtained hydrogel was transferred to the test platform with a pipette, and the changes of elastic modulus (G’) and viscous modulus (G”) of the hydrogel during scanning (frequency from 100–0.01 Hz) was tested. The hydrogels were then characterized by dynamic strain sweep at a fixed frequency of 1 Hz, and the changes in elastic modulus (G’) and viscous modulus (G”) of the hydrogel were recorded (strain: 0.01–100%).

Fourier transform infrared spectroscopy

Samples of tetrapeptide hydrogel were first lyophilized to powder, then we placed the sample on a diamond single reflection attenuated total reflectance (ATR) module. Spectra were recorded on a FTIR micro spectrometer (ThermoFisher Nicolet iS50) by averaging 32 scans at a spectral resolution of 1 cm^-1.

Preparation of tetrapeptide hydrogel vaccines

YAWF was dissolved in 600 μL of endotoxin-free PBS buffer (≤0.5 EU/mL, Cellcook Biotech Co. Ltd., Guangzhou, China). A homogeneous hydrogel was formed by adjusting the final pH to 7.5 by 1 N NaOH. Then RBD protein (RBD-Fc) was added to the hydrogel, followed by vortexing and standing at room temperature for one hour to obtain a tetrapeptide hydrogel protein vaccine. For in vivo immune evaluation, C57BL/6 J mice were randomly divided into three groups (n = 6): (1) 15 μg RBD protein (RBD group); (2)12.5 μL aluminum adjuvant and 15 μg RBD protein (Alum + RBD group); (3) 60 mM tetrapeptide hydrogel and 15 μg RBD protein (YAWF + RBD group). Each mouse was injected subcutaneously in the groin with 100 μL of the prepared vaccine.

Mice

C57BL/6 J mice (female, 6-8 weeks old) were obtained from the laboratory animal resources center (LARC) at Westlake University. They were housed in specific pathogen-free (SPF) conditions, and a 12-h light/12-h dark cycle was used. The housing temperature for mice is between 20–26 °C with 40–70% humidity.

Tetrapeptide hydrogel in promoting dendritic cells maturation

Bone marrow cells were isolated from the femur and tibia of C57BL/6 J mice and then cultured in 1640 medium containing GM-CSF (5 ng/mL) and IL-4 (5 ng/mL) at 37 °C for 6 days⁶⁴. The collected immature DCs were plated in a 24-well plate at a density of 1 × 10⁶ cells per well. After 24 h, 50 μL of the blank medium, vaccine, and LPS were added to each well, respectively, then the medium volume was supplemented to 1 mL. The cells were cultured for another 24 h and centrifuged to collect the cells and supernatant. The acquired cells were labeled with FITC-tagged anti-CD80, FITC-tagged anti-CD83, and PE-tagged anti-CD86 for flow cytometry (CytExpert Software for CytoFLEX 2.4.0.28). The production of IL-6 and TNF-α in the cell culture supernatants was also analyzed by ELISA kit.

ELISA for antibody titer

The production of anti-RBD IgG, IgG1, IgG2b, and IgG2c antibodies in mice serum was analyzed by ELISA. RBD proteins (RBD-H) were plated on 96-well uncoated ELISA plates (Biolegend) at 3 μg/mL in PBS buffer overnight at 4 °C. The plate was blocked with 1% BSA for 2 h at 37 °C. By washing the plate three times with PBST (Phosphate buffer saline (PBS) with 5‰ Tween 20), 100 μL diluted mice serum was added into per well and incubated at 37 °C for 2 h. The plate was washed with PBST four times. Each well of 96-well plate was added with 100 μL HRP-labeled goat anti-mouse IgG, IgG1, IgG2b, and IgG2c binding antibody (1: 5000 diluted in blocking buffer) at 37 °C for 1 h. After washing the plate four times with PBST, 50 μL 3,3′,5,5′-tetramethylbenzidine (TMB) was added into per well. The reaction was stopped with 50 μL 2 M H₂SO₄. The absorbance value at 450 nm and 570 nm wavelength was determined by a Microplate reader (Thermo Fisher Scientific, Varioskan LUX). Titers were analyzed with log 10 serum dilution plotted against absorbance at 450 nm minus absorbance at 570 nm. Antibody titer values were defined as the highest serum dilution that gave an optical density above 0.1.

Cytokine production

On the 7th day after the last immunization, fresh splenocytes were collected by grinding the mouse spleen⁶⁵. The splenocytes (5 × 10⁶ cells/mL) from each group of mice were plated in 24-well plates, and stimulated with soluble RBD protein (50 μg/mL) for 96 h. The production of IFN-γ and IL-5 in cell culture supernatants was analyzed by ELISA kit.

Statistics and reproducibility

Statistical significance was determined using a one-way ANOVA test. Statistical analyses were performed using GraphPad Prism 8. For all representative TEM or optical images, experiments were performed three times independently with similar results.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All relevant data supporting the key findings of this study are available within the article and its Supplementary information files. Any additional requests for information can be directed to, and will be fulfilled by, the lead contact. Source data are provided with this paper.

References

Whitesides, G. M. & Wong, A. P. The intersection of biology and materials science. MRS Bull. 31, 19–27 (2006).
Article CAS Google Scholar
Strickfaden, H. et al. Condensed chromatin behaves like a solid on the mesoscale in vitro and in living cells. Cell 183, 1772–1784. e1713 (2020).
Article CAS PubMed Google Scholar
Matson, J. B., Zha, R. H. & Stupp, S. I. Peptide self-assembly for crafting functional biological materials. Curr. Opin. Solid State Mater. Sci. 15, 225–235 (2011).
Article ADS CAS PubMed PubMed Central Google Scholar
Levin, A. et al. Biomimetic peptide self-assembly for functional materials. Nat. Rev. Chem. 4, 615–634 (2020).
Article CAS Google Scholar
Shigemitsu, H. & Hamachi, I. Design strategies of stimuli-responsive supramolecular hydrogels relying on structural analyses and cell-mimicking approaches. Acc. Chem. Res. 50, 740–750 (2017).
Article CAS PubMed Google Scholar
Lampel, A. et al. Polymeric peptide pigments with sequence-encoded properties. Science 356, 1064–1068 (2017).
Article ADS CAS PubMed Google Scholar
Boekhoven, J., Hendriksen, W. E., Koper, G. J. M., Eelkema, R. & van Esch, J. H. Transient assembly of active materials fueled by a chemical reaction. Science 349, 1075–1079 (2015).
Article ADS CAS PubMed Google Scholar
Rudra, J. S., Tian, Y. F., Jung, J. P. & Collier, J. H. A self-assembling peptide acting as an immune adjuvant. Proc. Natl Acad. Sci. USA 107, 622–627 (2010).
Article ADS CAS PubMed Google Scholar
Rudra, J. S. et al. Modulating adaptive immune responses to peptide self-assemblies. Acs Nano 6, 1557–1564 (2012).
Article CAS PubMed PubMed Central Google Scholar
Fries, C. N. et al. Advances in nanomaterial vaccine strategies to address infectious diseases impacting global health. Nat. Nanotechnol. 16, 1–14 (2021).
Article ADS CAS PubMed Google Scholar
Collier, J. H., Rudra, J. S., Gasiorowski, J. Z. & Jung, J. P. Multi-component extracellular matrices based on peptide self-assembly. Chem. Soc. Rev. 39, 3413–3424 (2010).
Article CAS PubMed PubMed Central Google Scholar
Mart, R. J., Osborne, R. D., Stevens, M. M. & Ulijn, R. V. Peptide-based stimuli-responsive biomaterials. Soft Matter 2, 822 (2006).
Article ADS CAS PubMed Google Scholar
Smith, D. J. et al. A multiphase transitioning peptide hydrogel for suturing ultrasmall vessels. Nat. Nanotechnol. 11, 95–102 (2016).
Article ADS CAS PubMed Google Scholar
Majumder, P. et al. Surface-fill hydrogel attenuates the oncogenic signature of complex anatomical surface cancer in a single application. Nat. Nanotechnol. 16, 1251–1259 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Li, X. et al. Self-assembled dipeptide aerogels with tunable wettability. Angew. Chem. Int. Ed. 59, 11932–11936 (2020).
Article CAS Google Scholar
Silva, G. A. et al. Selective differentiation of neural progenitor cells by high-epitope density nanofibers. Science 303, 1352–1355 (2004).
Article ADS CAS PubMed Google Scholar
Du, X., Zhou, J., Shi, J. & Xu, B. Supramolecular hydrogelators and hydrogels: from soft matter to molecular biomaterials. Chem. Rev. 115, 13165–13307 (2015).
Article CAS PubMed PubMed Central Google Scholar
Gačanin, J. et al. Autonomous ultrafast self-healing hydrogels by ph-responsive functional nanofiber gelators as cell matrices. Adv. Mater. 31, 1805044 (2019).
Article Google Scholar
Zhao, F., Ma, M. L. & Xu, B. Molecular hydrogels of therapeutic agents. Chem. Soc. Rev. 38, 883–891 (2009).
Article CAS PubMed Google Scholar
Wang, F. et al. Tumour sensitization via the extended intratumoural release of a STING agonist and camptothecin from a self-assembled hydrogel. Nat. Biomed. Eng. 4, 1090–1101 (2020).
Article CAS PubMed PubMed Central Google Scholar
Chakroun, R. W. et al. Supramolecular design of unsymmetric reverse bolaamphiphiles for cell-sensitive hydrogel degradation and drug release. Angew. Chem. Int. Ed. 59, 4434–4442 (2020).
Article CAS Google Scholar
Liang, G. L., Ren, H. J. & Rao, J. H. A biocompatible condensation reaction for controlled assembly of nanostructures in living cells. Nat. Chem. 2, 54–60 (2010).
Article CAS PubMed PubMed Central Google Scholar
Zhang, J. et al. Microfabrication of peptide self-assemblies: inspired by nature towards applications. Chem. Soc. Rev. 51, 6936–6947 (2022).
Article CAS PubMed Google Scholar
Tao, K., Makam, P., Aizen, R. & Gazit, E. Self-assembling peptide semiconductors. Science 358, eaam9756 (2017).
Article PubMed PubMed Central Google Scholar
Fores, J. R. et al. Localized supramolecular peptide self-assembly directed by enzyme-induced proton gradients. Angew. Chem. Int. Ed. 56, 15984–15988 (2017).
Article Google Scholar
Li, F. et al. Design of self-assembly dipeptide hydrogels and machine learning via their chemical features. Proc. Natl Acad. Sci. USA 116, 11259–11264 (2019).
Article ADS CAS PubMed PubMed Central Google Scholar
Wei, G. et al. Self-assembling peptide and protein amyloids: from structure to tailored function in nanotechnology. Chem. Soc. Rev. 46, 4661–4708 (2017).
Article CAS PubMed PubMed Central Google Scholar
Palmer, L. C. & Stupp, S. I. Molecular self-assembly into one-dimensional nanostructures. Acc. Chem. Res. 41, 1674–1684 (2008).
Article CAS PubMed PubMed Central Google Scholar
Lee, O. S., Cho, V. & Schatz, G. C. Modeling the self-assembly of peptide amphiphiles into fibers using coarse-grained molecular dynamics. Nano Lett. 12, 4907–4913 (2012).
Article ADS CAS PubMed Google Scholar
Wu, C. & Shea, J.-E. Coarse-grained models for protein aggregation. Curr. Opin. Struct. Biol. 21, 209–220 (2011).
Article CAS PubMed Google Scholar
McCullagh, M., Prytkova, T., Tonzani, S., Winter, N. D. & Schatz, G. C. Modeling self-assembly processes driven by nonbonded interactions in soft materials. J. Phys. Chem. B 112, 10388–10398 (2008).
Article CAS PubMed Google Scholar
Velichko, Y. S., Stupp, S. I. & de la Cruz, M. O. Molecular simulation study of peptide amphiphile self-assembly. J. Phys. Chem. B 112, 2326–2334 (2008).
Article CAS PubMed Google Scholar
Frederix, P. W. J. M. et al. Exploring the sequence space for (tri-)peptide self-assembly to design and discover new hydrogels. Nat. Chem. 7, 30–37 (2015).
Article CAS PubMed Google Scholar
Frederix, P. W. J. M., Ulijn, R. V., Hunt, N. T. & Tuttle, T. Virtual screening for dipeptide aggregation: toward predictive tools for peptide self-assembly. J. Phys. Chem. Lett. 2, 2380–2384 (2011).
Article CAS PubMed PubMed Central Google Scholar
Gupta, J. K., Adams, D. J. & Berry, N. G. Will it gel? Successful computational prediction of peptide gelators using physicochemical properties and molecular fingerprints. Chem. Sci. 7, 4713–4719 (2016).
Article CAS PubMed PubMed Central Google Scholar
Van Lommel, R., Zhao, J., De Borggraeve, W. M., De Proft, F. & Alonso, M. Molecular dynamics based descriptors for predicting supramolecular gelation. Chem. Sci. 11, 4226–4238 (2020).
Article Google Scholar
Weiss, R. G. The past, present, and future of molecular gels. what is the status of the field, and where is it going? J. Am. Chem. Soc. 136, 7519–7530 (2014).
Article CAS PubMed Google Scholar
Mueller, T., Kusne, A. G. & Ramprasad, R. Machine learning in materials science: Recent progress and emerging applications. Rev. Comput. Chem. 29, 186–273 (2016).
CAS Google Scholar
Nagelkerke, N. J. D. A note on a general definition of the coefficient of determination. Biometrika 78, 691–692 (1991).
Article MathSciNet MATH Google Scholar
Nagy-Smith, K., Moore, E., Schneider, J. & Tycko, R. Molecular structure of monomorphic peptide fibrils within a kinetically trapped hydrogel network. Proc. Natl Acad. Sci. USA 112, 9816–9821 (2015).
Article ADS CAS PubMed PubMed Central Google Scholar
Si, Y., Wen, Y., Kelly, S. H., Chong, A. S. & Collier, J. H. Intranasal delivery of adjuvant-free peptide nanofibers elicits resident CD8(+) T cell responses. J. Control. Release 282, 120–130 (2018).
Article CAS PubMed PubMed Central Google Scholar
Rudra, J. S. et al. Self-assembled peptide nanofibers raising durable antibody responses against a malaria epitope. Biomaterials 33, 6476–6484 (2012).
Article CAS PubMed PubMed Central Google Scholar
Hudalla, G. A. et al. Gradated assembly of multiple proteins into supramolecular nanomaterials. Nat. Mater. 13, 829–836 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Wang, Z. Z. et al. Exosomes decorated with a recombinant SARS-CoV-2 receptor-binding domain as an inhalable COVID-19 vaccine. Nat. Biomed. Eng. 6, 791–805 (2022).
Article CAS PubMed Google Scholar
Gale, E. C. et al. Hydrogel-based slow release of a receptor-binding domain subunit vaccine elicits neutralizing antibody responses against SARS-CoV-2. Adv. Mater. 33, e2104362 (2021).
Article PubMed Google Scholar
Dai, L. et al. Efficacy and safety of the rbd-dimer–based covid-19 vaccine zf2001 in adults. N. Engl. J. Med. 386, 2097–2111 (2022).
Article CAS PubMed Google Scholar
Liu, Y. et al. A D-peptide-based HIV gelatinous combination vaccine improves therapy in ART-delayed macaques of chronic infection. Nano Today 42, 101353 (2022).
Article CAS Google Scholar
Abraham, M. J. et al. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1, 19–25 (2015).
Article ADS Google Scholar
Berendsen, H. J., van der Spoel, D. & van Drunen, R. GROMACS: A message-passing parallel molecular dynamics implementation. Comput. Phys. Commun. 91, 43–56 (1995).
Article ADS CAS Google Scholar
de Jong, D. H. et al. Improved parameters for the martini coarse-grained protein force field. J. Chem. Theory Comput. 9, 687–697 (2013).
Article PubMed Google Scholar
Marrink, S. J., Risselada, H. J., Yefimov, S., Tieleman, D. P. & De Vries, A. H. The MARTINI force field: coarse grained model for biomolecular simulations. J. Phys. Chem. B 111, 7812–7824 (2007).
Article CAS PubMed Google Scholar
Monticelli, L. et al. The MARTINI coarse-grained force field: extension to proteins. J. Chem. Theory Comput. 4, 819–834 (2008).
Article CAS PubMed Google Scholar
Huang, J. & MacKerell, A. D. Jr CHARMM36 all‐atom additive protein force field: Validation based on comparison to NMR data. J. Comput. Chem. 34, 2135–2145 (2013).
Article CAS PubMed PubMed Central Google Scholar
Meza, J. C. Steepest descent. WIREs Comp. Stat. 2, 719–722 (2010).
Article Google Scholar
Helton, J. C. & Davis, F. J. Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliab. Eng. Syst. Saf. 81, 23–69 (2003).
Article Google Scholar
Singh, G. & Tieleman, D. P. Using the Wimley–White hydrophobicity scale as a direct quantitative test of force fields: the Martini coarse-grained model. J. Chem. Theory Comput. 7, 2316–2324 (2011).
Article CAS PubMed Google Scholar
Svetnik, V. et al. Random forest: a classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comp. Sci. 43, 1947–1958 (2003).
Article CAS Google Scholar
Seber, G. A. & Lee, A. J. Linear regression analysis. (John Wiley & Sons, 2012).
Altman, N. S. An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46, 175–185 (1992).
MathSciNet Google Scholar
Noble, W. S. What is a support vector machine? Nat. Biotechnol. 24, 1565–1567 (2006).
Article CAS PubMed Google Scholar
Lee, S., Peng, J., Williams, A. & Shin, D. ASCENDS: advanced data SCiENce toolkit for non-data scientists. J. Stat. Softw. 5 1656 (2020).
Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
MathSciNet MATH Google Scholar
Hsu, C-W., Chang, C-C., Lin, C-J. A practical guide to support vector classification. Technical report, Department of Computer Science, National Taiwan University. URL: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf (2003).
Luo, Z. C. et al. A Powerful CD8(+) T-cell stimulating d-tetra-peptide hydrogel as a very promising vaccine adjuvant. Adv. Mater. 29 1601776 (2017).
Wang, H. M. et al. Enzyme-catalyzed formation of supramolecular hydrogels as promising vaccine adjuvants. Adv. Funct. Mater. 26, 1822–1829 (2016).
Article ADS CAS Google Scholar

Download references

Acknowledgements

T.X., D.C., and H.W. are supported by the National Natural Science Foundation of China (82022038), and the Research Center for Industries of the Future (RCIF) at Westlake University. J.W., S.Z., and W.L. are supported by RCIF under Award No. WU2022C041. T.X. acknowledges the support of the Zhejiang Postdoctoral Science Foundation (No. 102216582101). J.W. also acknowledges the support of the Zhejiang Postdoctoral Science Foundation (No. 103346582102) and the National Natural Science Foundation of China (No. 52101023). This research was supported by Instrumentation and Service Centers for Molecular Science and for Physical Science, respectively, as well as by Biomedical Research Core Facilities at Westlake University.

Author information

These authors contributed equally: Tengyan Xu, Jiaqi Wang, Shuang Zhao, Dinghao Chen.

Authors and Affiliations

Department of Chemistry, School of Science, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang Province, China
Tengyan Xu, Dinghao Chen, Hongyue Zhang, Yu Fang, Nan Kong, Ziao Zhou & Huaimin Wang
Institute of Natural Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang Province, China
Tengyan Xu & Huaimin Wang
Research Center for the Industries of the Future, Westlake University, No. 600 Dunyu Road, Sandun Town, Xihu District, Hangzhou, 310030, Zhejiang Province, China
Jiaqi Wang, Wenbin Li & Huaimin Wang
Institute of Advanced Technology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang Province, China
Jiaqi Wang & Wenbin Li
School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang Province, China
Jiaqi Wang, Shuang Zhao & Wenbin Li

Authors

Tengyan Xu
View author publications
You can also search for this author in PubMed Google Scholar
Jiaqi Wang
View author publications
You can also search for this author in PubMed Google Scholar
Shuang Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Dinghao Chen
View author publications
You can also search for this author in PubMed Google Scholar
Hongyue Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Yu Fang
View author publications
You can also search for this author in PubMed Google Scholar
Nan Kong
View author publications
You can also search for this author in PubMed Google Scholar
Ziao Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Wenbin Li
View author publications
You can also search for this author in PubMed Google Scholar
Huaimin Wang
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

T.X. and J.W.: Conceptualization, Visualization, Methodology, Formal analysis, Computational simulation, Writing-original draft. S.Z. and D.C.: Formal analysis, Computational simulation, Writing-review & editing. H.Z., Y.F., N.K., and Z.Z.: Methodology, Resources, Writing-review & editing. W.L. and H.W.: Conceptualization, Supervision, Funding acquisition, Writing-review & editing. All the authors discussed and commented on the manuscript.

Corresponding authors

Correspondence to Wenbin Li or Huaimin Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Arne Elofsson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer review file

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Xu, T., Wang, J., Zhao, S. et al. Accelerating the prediction and discovery of peptide hydrogels with human-in-the-loop. Nat Commun 14, 3880 (2023). https://doi.org/10.1038/s41467-023-39648-2

Download citation

Received: 09 November 2022
Accepted: 22 June 2023
Published: 30 June 2023
DOI: https://doi.org/10.1038/s41467-023-39648-2

This article is cited by

Advanced construction strategies to obtain nanocomposite hydrogels for bone repair and regeneration
- Wang Ding
- Yuxiang Ge
- Xiaofan Yin
NPG Asia Materials (2024)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.