Introduction

A novel coronavirus has been linked to a cluster of acute respiratory illnesses, named COVID-19 by the World Health Organization (WHO), and has spread worldwide, resulting in a pandemic. Genetic analysis has revealed that the virus is closely related to SARS-CoV, with 79.6% sequence identity [1, 2]. Hence, it was named SARS-CoV-2 by the International Committee on the Taxonomy of Viruses. As of February 27, 2021, 112,902,746 cases and 2,508,679 resulting deaths had been confirmed worldwide (https://covid19.who.int/). Unfortunately, the number of patients continues to increase, and there is no effective drug treatment so far.

SARS-CoV-2 and SARS-CoV are beta coronaviruses, which are enveloped, positive-sense, single-stranded RNA viruses [3]. Their genome RNA encodes a non-structural replicase polyprotein and structural proteins, which include Spike (S), Envelope (E), Membrane (M), and Nucleocapsid (N) proteins [4]. Recent studies have systematically constructed the protein-protein interaction (PPI) map between SARS-CoV-2 and human cells, which has suggested several potential PPI-interfering strategies to inhibit the virus [5].

The S protein is a major structural protein of SARS-CoV-2 that is essential for the interaction of the virions with host cell receptors, i.e., ACE2, and for the subsequent fusion of the viral envelope with the host cell membrane to allow infection [1, 6]. It consists of two subunits: S1 and S2. The S1 subunit contains a receptor-binding domain (RBD), which triggers infection by binding to a host cell receptor [7, 8], and an N-terminal domain. The S2 subunit is responsible for mediating fusion between the viral and host cell membranes [9, 10]. Given the delineated virus entry mechanisms [11], designing molecular agents that bind to the ACE2 binding site of the S protein to interfere with the protein-protein interaction (PPI) of ACE2 and S protein may be a promising strategy to prevent viral infection. In addition, the determination of the cryo-EM structure of the S protein has provided the structural basis to develop molecular binders [6].

ACE2 is an angiotensin-converting enzyme whose Apo state and Holo complex with the small molecular inhibitor MLN-4760 have been elucidated [12]. Recently, many studies have reported that some FDA-approved drugs exhibit potential activity to inhibit the entry of SARS-CoV-2 into host cells [13, 14]. Some of these inhibitors can directly block the binding of ACE2 to S protein, such as clemastine [15] and heparan sulfate [16], whereas others act by inhibiting Transmembrane Serine Protease 2 (TMPRSS2) [13], which is responsible for S protein priming. The latter have attracted much attention in drug development to target SARS-CoV-2, whereas the former, i.e., drugs that interfere with the interface of ACE2 and RBD [17], have received little attention. A large number of antibodies that potently neutralize the ACE2-RBD binding interface have been reported [18,19,20,21]. However, small molecular blockers, which have lower costs and more flexible clinical use, have seldom been reported.

Interestingly, the inhibitor-bound ACE2 structure exhibits a large hinge-bending motion relative to the Apo state of ACE2 [12], which suggests that the binding between ACE2 and S protein may be affected allosterically. Hence, it was expected that ACE2 inhibitors could be used to treat SARS-CoV-2. To verify this hypothesis, we used a Bio-Layer Interferometry (BLI) Octet® RED 96 platform (Sartorius, Fremont, California, USA) to characterize the binding interactions between RBD and ACE2 with or without the ACE2 inhibitor MLN-4760 (Fig. 1a, b). MLN-4760 is a very potent ACE2 inhibitor with an IC50 of 0.44 nM [22]. However, our results indicated that the ACE2 inhibitor had a negligible effect on the binding affinity between RBD and ACE2. The dissociation constant KD between the two proteins was 52.83 nM without MLN-4760 and 59.11 nM with MLN-4760 treatment. Moreover, although ACE2 exhibits some conformational distortion while bound to MLN-4760, there were no obvious changes in its binding interface to RBD (Fig. 1c). Overall, designing small molecular blockers to directly disrupt the binding of ACE2-RBD may be the most feasible approach to prevent infection.
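To put the two dissociation constants on a common scale, they can be converted into a binding free-energy difference. The worked equation below is a sketch that assumes the standard relation \(\Delta G = RT \ln K_{D}\) and the assay temperature of 30 °C (303.15 K) stated in the methods; the labels \(K_{D}^{\mathrm{apo}}\) and \(K_{D}^{\mathrm{MLN}}\) are introduced here for illustration only.

```latex
\[
\Delta\Delta G
  = RT \ln\frac{K_{D}^{\mathrm{MLN}}}{K_{D}^{\mathrm{apo}}}
  = \left(1.987\times10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}\right)
    \left(303.15\ \mathrm{K}\right)
    \ln\frac{59.11}{52.83}
  \approx 0.07\ \mathrm{kcal\,mol^{-1}}
\]
```

Under these assumptions the shift is roughly a tenth of the thermal energy \(RT \approx 0.60\) kcal/mol, which is in line with the negligible effect reported above.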

Fig. 1: Comparison of RBD binding affinities with Apo ACE2 and ACE2 treated with MLN-4760 by BLI.

Association and dissociation curves of RBD with Apo ACE2 (a) or ACE2 treated with MLN-4760 (b) in a concentration range between 4.68 nM and 150 nM. The number that follows the ± sign is the standard deviation (SD). (c) Comparison of binding conformations between the binding complex of RBD with Apo ACE2 (PDB code: 1R42) and Holo ACE2 (PDB code: 1R4L). The conformational alignment was performed on the first 100 residues at the N-terminus of ACE2.

Designing small molecular modulators to interfere with PPIs is a challenging task in drug design and development, which is hampered by the large, flat, and featureless areas of interfaces [23, 24]. The same difficulty arises when developing inhibitors to prevent binding of the RBD of the S protein to ACE2 [25]. Hotspot residues are groups of essential residues that dominantly contribute to binding affinities for PPIs [26, 27]. Detecting and targeting such important residues can greatly improve the success rate of identifying lead compounds. Thus, our hotspot identification module in Fd-DCA [28] was used to search over the cryo-EM structure of the RBD of S protein (PDB code: 6M17) [7]. As shown in Fig. 2, multiple residues that may contribute to the binding affinity between the RBD of the spike protein and ACE2 were identified, and our molecular docking simulations focused on them to search for candidate RBD binders.

Fig. 2: Identification of hotspot residues (highlighted in red and listed in the text box) on the ACE2-binding site of the RBD of S protein by Fd-DCA.

The orange stick-ball model represents the predicted low-energetic bound conformations of fragment-sized molecular probes introduced in Fd-DCA. The dark green surface model is the RBD domain of S protein.

On the basis of the previously reported structural information of binding complexes of ACE2–RBD [7, 8], some critical interactions between the two proteins can be found, i.e., Q498, K417, and Y453 of RBD, which form hydrogen bonds with Q24, Q42, K31, D30, and H34 of ACE2, and Y489 and F486 of RBD, which interact with F28, Y83, M82, and L79 of ACE2 via a cluster of hydrophobic contacts. Some residues were also considered as potential hotspot residues on the RBD for protein–protein or protein-ligand binding (Fig. 2). Therefore, they were regarded as the druggable binding pocket to search for candidate small molecular binders of RBD.

Materials and methods

Protein expression and purification, and bio-layer interferometry assay

The codon-optimized wild-type cDNA of SARS-CoV-2 receptor-binding domain (RBD) (residues 333–530) was synthesized by GENEWIZ. The SARS-CoV-2 RBD was expressed using the Bac-to-Bac baculovirus system. The supernatant of cell culture containing the secreted removal of glycosylated RBD was harvested 72 h after infection and concentrated, and RBD was captured by Ni-NTA resin (GE Healthcare). The resin was washed 5–6 times with 30 mL of wash buffer (25 mM Tris, 150 mM NaCl, 40 mM imidazole, pH 7.5), and the target protein was eluted with elution buffer (25 mM Tris, 150 mM NaCl, 500 mM imidazole, pH 7.5). The protein was further purified on a Superdex S75 (GE Healthcare) column equilibrated with 25 mM Tris, 150 mM NaCl, pH 7.5. The RBD protein was biotinylated using a Biotin-Protein Ligase kit (GeneCopoeia™), further purified on the Superdex S75 column, and concentrated to 15 mg/mL.

The Bio-Layer Interferometry assay was conducted at 30 °C in PBS, 0.02% Tween 20, 0.1% BSA, pH 7.4, using an Octet RED 96 instrument (Sartorius, Fremont, California, USA). Sensors were loaded with 10 μg/mL ligand (Fc-RBD). The dissociation wells were used only once to ensure the potency of the buffer. To characterize whether MLN-4760 could inhibit ACE2 binding to immobilized Fc-RBD, the proA sensors, which were coated with Fc-RBD, were exposed to 150 nM, 75 nM, 37.5 nM, 18.75 nM, 9.375 nM, and 4.68 nM ACE2 with 1.5 μM MLN-4760. In parallel, the proA sensors coated with Fc alone were exposed to the same ACE2 concentrations with 1.5 μM MLN-4760. Systematic baseline drift was corrected by subtracting the shift recorded for sensors loaded with ligand but no analyte.
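The ACE2 concentrations listed above follow a two-fold dilution series starting at 150 nM, and the drift correction is a point-by-point subtraction of a reference trace. The following Python sketch illustrates both steps; the function names are hypothetical and are not part of the Octet software.

```python
def twofold_series(top_nm, n_points):
    """Return a two-fold serial dilution (in nM) starting at top_nm."""
    return [top_nm / (2 ** i) for i in range(n_points)]


def subtract_reference(sample_shift, reference_shift):
    """Subtract a reference-sensor trace (ligand, no analyte) point by point."""
    if len(sample_shift) != len(reference_shift):
        raise ValueError("traces must have the same number of points")
    return [s - r for s, r in zip(sample_shift, reference_shift)]


# Six ACE2 concentrations starting at 150 nM, as in the assay description.
concentrations = twofold_series(150.0, 6)
```

The exact sixth value of this series is 4.6875 nM, which corresponds to the 4.68 nM quoted in the text.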

More details are shown in S1 File.

Ligand structure preparation and virtual screening

Natural products in the database were first prepared for docking using the LigPrep ligand preparation module (Schrödinger, Inc), which generates multiple minimized conformations and protonation/tautomerization states with default settings. Then, virtual screening was performed using the molecular docking module Glide 8.5 [29]. The same protein structure (PDB code: 6M17 [7]) used in hotspot searching was applied to carry out the molecular docking simulations. The structure was first prepared and refined by the Protein Preparation Wizard [30] of Schrödinger 2019-4 with default parameter settings, which included assigning bond orders, adding hydrogens, and assigning partial charges. A cubic grid of 20 × 20 × 20 Å³ for docking simulations was generated, centered on the basis of the coordinates of Q498 and Y489 on the RBD surface to cover the hotspot residues identified by Fd-DCA. During the virtual screening step, two docking methods, high throughput virtual screening (HTVS) and standard precision (SP), were used for preliminary rough docking and more accurate evaluation, respectively. The number of ligands retained from HTVS was set to 30%, and the top 50% of SP docking ligands were eventually exported for further visual inspection.
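The geometry of the docking box can be illustrated with a short Python sketch. It assumes that the box center is taken as the midpoint between one representative coordinate for Q498 and one for Y489; the coordinates below are hypothetical placeholders, not values from 6M17, and the actual grid is generated inside Glide rather than by code like this.

```python
def midpoint(a, b):
    """Midpoint of two 3D points given as (x, y, z) tuples in Å."""
    return tuple((x + y) / 2.0 for x, y in zip(a, b))


def in_cubic_box(point, center, edge=20.0):
    """True if point lies inside a cube of the given edge length (Å)."""
    half = edge / 2.0
    return all(abs(p - c) <= half for p, c in zip(point, center))


# Hypothetical residue coordinates (Å), for illustration only.
q498 = (10.0, 4.0, -2.0)
y489 = (22.0, 8.0, 6.0)
center = midpoint(q498, y489)
```

A check such as `in_cubic_box(residue_xyz, center)` could then be used to confirm that each hotspot residue falls within the 20 Å cube.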

Mass spectrometry

MS was performed on a Thermo Fisher Orbitrap Fusion with 20 µM protein and 1 mM compound in ammonium acetate buffer. Our protein sample of RBD (aa. 319–591) for MS analysis was provided by Dr. Guo from Nankai University (the detailed protein expression method is shown in supplementary materials).

Surface plasmon resonance

RBD was diluted in sodium acetate solution (pH 4.5) to a final concentration of 50 μg/mL and was then immobilized covalently on a CM5 sensor chip. The final immobilization level was 4430.3 resonance units. The measurement was run using PBS with 0.005% (v/v) surfactant P20 (pH 7.4) and 1% DMSO as the running buffer. Each compound was diluted in the running buffer, starting from the highest concentration. All compound measurements were performed at a flow rate of 30 μL/min. Data processing and analysis were performed using the BIAevaluation 1.1 software.
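The immobilization level sets the maximum response that a small-molecule analyte can produce. The sketch below uses the commonly applied relation Rmax = (MW analyte / MW ligand) × immobilized RU × stoichiometry; the molecular weights in the example are hypothetical round numbers chosen for illustration, not values reported in this study.

```python
def theoretical_rmax(mw_analyte_da, mw_ligand_da, immobilized_ru, stoichiometry=1):
    """Theoretical maximum SPR response (RU) for an analyte on an immobilized ligand."""
    if mw_analyte_da <= 0 or mw_ligand_da <= 0:
        raise ValueError("molecular weights must be positive")
    return (mw_analyte_da / mw_ligand_da) * immobilized_ru * stoichiometry


# Immobilization level from the text; both masses are hypothetical placeholders.
rmax_example = theoretical_rmax(300.0, 30000.0, 4430.3)
```

With these placeholder masses the expected maximum is on the order of tens of RU, which illustrates why a high immobilization level is useful when the analytes are small molecules.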

Anti-viral activity assay using native SARS-CoV-2 virus

To evaluate the anti-viral efficacy of these compounds, Vero cells were cultured overnight in 48-well plates at a density of 5 × 10⁴ cells/well. The cells were pretreated with various doses of the indicated compounds for 1 h, and then the virus (MOI of 0.01) was added. At 48 h p.i., the culture supernatant was collected and treated with lysis buffer (Takara, Cat. no. 9766) for quantification as described in the previous study [31].
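The multiplicity of infection (MOI) relates the viral inoculum to the number of cells per well. As a minimal sketch, assuming that the seeding density approximates the cell number at the time of infection (cells may have divided overnight), the inoculum per well can be estimated as follows; the function name is hypothetical.

```python
def infectious_units_per_well(moi, cells_per_well):
    """Infectious units needed per well for a given MOI and cell number."""
    if moi < 0 or cells_per_well < 0:
        raise ValueError("MOI and cell number must be non-negative")
    return moi * cells_per_well


# MOI and seeding density taken from the assay description.
inoculum = infectious_units_per_well(0.01, 5e4)
```

Under this assumption, an MOI of 0.01 corresponds to about 500 infectious units per well.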

For cytotoxicity assays, Vero cells were suspended in a growth medium in 96-well plates at a density of 1 × 10⁴ cells/well. The next day, appropriate concentrations of compounds were added to the medium. After 24 h, the relative numbers of surviving cells were measured by the CCK8 assay (Beyotime, China) in accordance with the manufacturer’s instructions.

Results

Virtual screening of the natural product database

Natural products are rich sources for drug development [32, 33]. Approximately 60% of drugs in the market have originated from natural sources [34]. In this study, we searched for lead compounds in an in-house natural product database from Dr. Ye Yang, which contained 2467 compounds. As a result, five candidate compounds were selected for experimental validation, as shown in Fig. 3.
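The screening funnel described in the methods (30% of ligands retained after HTVS, then the top 50% after SP docking) can be sketched in Python to estimate how many entries reach visual inspection. The sketch counts one entry per compound and rounds to the nearest integer; both are assumptions, since LigPrep can generate several states per compound and the exact rounding is not stated in the text.

```python
def screening_funnel(n_start, fractions):
    """Return the number of entries remaining after each retention step."""
    counts = [n_start]
    for fraction in fractions:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("retention fractions must lie between 0 and 1")
        counts.append(round(counts[-1] * fraction))
    return counts


# 2467 compounds, 30% kept after HTVS, top 50% kept after SP docking.
funnel = screening_funnel(2467, [0.30, 0.50])
```

Under these assumptions, roughly 15% of the library, a few hundred compounds, would be passed on for visual inspection before the five candidates were chosen.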

Fig. 3: Compound structures of the selected five candidate molecular binders of RBD.

The carbasugar groups are boxed by dashed red lines.

Among the five compounds, four contained carbasugar groups. Gentiopicrin (GTCP) is a precursor of gentiogenal [35, 36]. Cordycepin (CDCP) is an excellent anti-cancer lead compound because of its various bioactivities, i.e., AMP-activated protein kinase (AMPK) agonist activity [37], inhibition of mTORC1 function [38], downregulation of HIF-1α expression in tumor cells [38], and activation of autophagy [39]. To our knowledge, potential bioactivities of the other three compounds, MCCS-B, H69D1, and H69C2, have not been reported.

Mass spectrometry of candidate molecular binders

To determine whether the compounds shown in Fig. 3 bound to the S protein as predicted, MS and SPR were employed for the binding evaluation. As a result, four of the five candidates presented signals of binding to RBD, as shown in Fig. 4. H69C2 was not examined because of its poor solubility in ammonium acetate buffer.

Fig. 4: Mass spectrometric determination of the interaction between RBD and our compounds.

Differences in molecular weights between RBD protein and complexes are indicated by red arrows.
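A mass difference between the free protein and a complex can be related to the number of bound ligands by dividing the shift by the ligand mass. The following Python sketch illustrates this reasoning; all masses in the example are hypothetical placeholders and not measured values from Fig. 4, and the tolerance parameter is an assumption introduced here.

```python
def estimate_stoichiometry(complex_da, protein_da, ligand_da, tolerance=0.1):
    """Estimate the number of bound ligands from a mass shift.

    Returns None when the shift is not close to a whole multiple of the
    ligand mass (within the given tolerance, in ligand equivalents).
    """
    if ligand_da <= 0:
        raise ValueError("ligand mass must be positive")
    ratio = (complex_da - protein_da) / ligand_da
    nearest = round(ratio)
    if nearest < 0 or abs(ratio - nearest) > tolerance:
        return None
    return nearest


# Hypothetical masses (Da): a 30 kDa protein and a ligand of about 356 Da.
n_bound = estimate_stoichiometry(30712.6, 30000.0, 356.3)
```

In this illustrative case the shift equals two ligand masses, which would correspond to a 2:1 ligand-to-protein complex.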

Surface plasmon resonance of candidate molecular binders

Next, we performed SPR experiments with a Biacore T200 (GE Healthcare) to determine the binding affinities and kinetics of the five compounds for RBD. The protein sample of RBD (aa. 319–591) used in the experiments was obtained from the National Protein Center of China (the method of recombinant protein preparation is described in the supplementary materials).

As shown in Fig. 5, all compounds exhibited concentration-dependent binding to the RBD of the S protein. We next obtained the dissociation constant (KD) values. CDCP and GTCP showed the weakest binding affinities (KD values of 20.460 μM and 29.559 μM, respectively). H69D1, MCCS-B, and H69C2 showed clearly stronger binding affinities than CDCP and GTCP, with KD values of 2.154, 3.560, and 0.0947 μM, respectively. Binding kinetics data of these compounds are shown in Fig. 5.
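The relative strengths of the five binders can be compared by ranking the reported KD values and expressing each one as a multiple of the strongest binder. The Python sketch below uses only the KD values given in the text; the function name is hypothetical.

```python
# Dissociation constants from the SPR experiments, in µM.
kd_um = {
    "CDCP": 20.460,
    "GTCP": 29.559,
    "H69D1": 2.154,
    "MCCS-B": 3.560,
    "H69C2": 0.0947,
}


def fold_weaker_than_best(kd):
    """Express each KD as a multiple of the smallest (strongest) KD."""
    best = min(kd.values())
    return {name: value / best for name, value in kd.items()}


ranking = sorted(kd_um, key=kd_um.get)  # strongest binder first
fold = fold_weaker_than_best(kd_um)
```

A lower KD means tighter binding, so the first entry of `ranking` is the strongest binder and the values in `fold` show how much weaker each of the other compounds binds.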

Fig. 5: Binding kinetics and affinity analysis of the five compounds for binding to the RBD domain of S protein of SARS-CoV-2 using a Biacore T200.

Multiple concentrations of the compounds were injected into the equipment to fit the binding data, which are represented by different colors in each subgraph.

Binding model analysis

Interestingly, in accordance with the MS results shown in Fig. 4, GTCP bound to RBD in a 2:1 model, i.e., one RBD molecule accommodated two GTCP molecules. To further determine how the compound interacted with RBD, we compared all predicted binding poses from the docking simulations. As a result, there may be two potential binding sites in the ACE2-binding region of RBD that can accommodate two GTCP molecules simultaneously (Fig. 6). Because the rest of the compounds bound to RBD in a 1:1 model, the lowest-energy binding conformation of each was retained. To refine and relax the predicted binding complexes from the molecular docking simulations, the Molecular Mechanics/Generalized Born Surface Area (MM-GBSA) module of Schrödinger 2019-4 with the OPLS_2005 force field was used for energy minimization.

Fig. 6: Binding model of the five compounds.

Global overview of compounds on RBD and details of the five compound binding models are linked by the dashed line. The hydrogen bonds between the RBD sidechain and small molecular blockers are shown as a dashed line.

By analyzing the refined binding conformations, as shown in Fig. 6, we found that the binding sites of these five small molecular blockers may be distributed over three subregions of the large binding site of RBD. CDCP, H69D1, and one of the binding poses of GTCP bound in the same subregion of RBD. A common hydrogen bond was observed between E484 of RBD and CDCP, H69D1, and GTCP. In addition, a series of hydrophobic contacts formed with aromatic residues, such as Y351, Y449, and F490, or with alkyl chain residues, i.e., L492 and L452. Our hotspot residue identification results showed that Y449, F490, and L452, which strongly interacted with the compounds, were also potential hotspot residues in the protein–protein interaction between S protein and ACE2. This suggests that these compounds competitively interact with three hotspot residues and interfere with the interaction between them and ACE2. Beyond these common binding features, an additional hydrogen bond network was identified between H69D1 and residues such as T470 and Q493, which may make H69D1 bind much more strongly to RBD (around 10 times more strongly in accordance with SPR experiments) than CDCP or GTCP.

For GTCP, its second potential binding site was located around Q498 and Y453, as shown in Fig. 6. Multiple hydrogen bonds formed with the side chain of Y453 and the backbone of G502 and G496 were identified. Although two possible binding conformations of GTCP were found, its binding affinity measured by SPR was still relatively lower than that of the other small molecular blockers.

In accordance with the predictions, the binding site of H69C2 (Fig. 6) was adjacent to the binding sites of H69D1 and the other two compounds. Several hydrogen bonds formed between the molecule and the side chains of R403, D405, E406, Q493, and S494. Moreover, its flavonoid skeleton played a role in stabilizing the binding via hydrophobic interactions with residues such as L455, Y495, and Y505 (Fig. 6). Among these critical residues, several, i.e., R403, Q493, L455, and Y505, were predicted as hotspot residues. These important molecular interactions conferred on H69C2 the best binding affinity and binding kinetics (around 20 to 300 times stronger in accordance with SPR experiments) compared with the other four small molecular blockers. This further supports the importance of designing molecules to interact with hotspot residues and interfere with protein–protein interactions.

Compound MCCS-B, which was predicted to bind to the area adjacent to H69D1 (Fig. 6), also had a moderate binding affinity for RBD with a KD value of 3.560 μM. This binding affinity may be attributed to molecular interactions such as hydrogen bonds with the hotspot residues Q493, N501, and Q498, as well as hydrophobic interactions with hotspot residues such as L452.

H69C2 for SARS-CoV-2 treatment as an entry inhibitor

By measuring viral RNA in culture supernatant by quantitative RT-PCR, we found that H69C2 inhibited SARS-CoV-2 replication in a dose-dependent manner (Fig. 7a), with an IC50 value of 85.75 μM. We also performed cytotoxicity assays using H69C2 and found that the CC50 value of H69C2 was above 250 μM (Fig. 7b). The anti-SARS-CoV-2 activity of H69C2 was also evaluated by monitoring the intracellular SARS-CoV-2 NP level using immunofluorescence. We found that the intracellular SARS-CoV-2 NP level was decreased after treatment with H69C2 (Fig. 7c), which indicates that H69C2 inhibited SARS-CoV-2 replication in vitro.
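The IC50 and the lower bound on CC50 can be combined into a lower bound on the selectivity index (CC50/IC50). The Python sketch below also includes a simple Hill-type dose-response curve; the Hill slope of 1 is an assumption, since the fitting model used for Fig. 7a is not specified in the text, and the function names are hypothetical.

```python
def selectivity_index_lower_bound(cc50_lower_um, ic50_um):
    """Lower bound of CC50/IC50 when only a lower bound on CC50 is known."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return cc50_lower_um / ic50_um


def hill_inhibition(conc_um, ic50_um, hill=1.0):
    """Percent inhibition from a Hill curve (hill=1.0 is an assumed slope)."""
    if conc_um <= 0 or ic50_um <= 0:
        raise ValueError("concentrations must be positive")
    return 100.0 / (1.0 + (ic50_um / conc_um) ** hill)


# Values from the text: IC50 = 85.75 µM, CC50 above 250 µM.
si_min = selectivity_index_lower_bound(250.0, 85.75)
```

With the reported values the selectivity index is at least about 2.9, which indicates a narrow window between anti-viral activity and cytotoxicity for the unmodified compound.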

Fig. 7: Inhibition of SARS-CoV-2 by H69C2 in Vero E6 cells.

Vero E6 cells infected with SARS-CoV-2 at an MOI of 0.01 were treated with various concentrations of H69C2. (a) Quantitative RT-PCR assays were performed to measure the viral copy number in the cellular supernatant. The y-axis indicates percentage inhibition of the virus relative to the sample treated with DMSO (vehicle). (b) Cell viability assay. The y-axis represents the percentage of cell viability relative to the sample treated with DMSO (vehicle). (c) Immunofluorescence images of intracellular NP. At 24 h post-infection, cells were fixed, and intracellular NP levels were monitored by immunofluorescence. Scale bars, 400 μm. Data are shown as the mean ± s.e.m., n = 3.

Discussion

As shown in Fig. 8, the small molecular blockers derived from natural products bound to the surface of RBD, which hindered the binding of RBD to ACE2 and may thus protect healthy cells and prevent further deterioration in the early stage of infection. By combining computational methods and experiments, we identified five small molecular binders of the RBD of S protein, which bind on the RBD to block binding of ACE2 and prevent viral infection. These compounds may competitively interact with hotspot residues that were identified to contribute most of the binding affinity of RBD to ACE2. Two of them showed low micromolar binding, and H69C2 showed the strongest binding affinity, with a KD of 0.0947 µM. In addition, H69C2 was validated to block viral infection in vitro with an IC50 of 85.75 µM.

Fig. 8: Competitive binding of the identified small molecules against ACE2 for the RBD of S protein.

Small molecule blockers bind to the RBD and interfere with RBD binding to ACE2.

Apart from anti-viral agents that target S protein, there are several highly potent anti-viral inhibitors that target Mpro and RdRp. Inhibitors of Mpro and RdRp have been proposed for combined use in the treatment of COVID-19. Multi-target drug combination therapy has become a promising treatment strategy. The strategy is applicable to RBD molecular blockers, such as H69C2, which have different anti-viral effects from Mpro and RdRp inhibitors. There are two phases in the cell infection of SARS-CoV-2, namely cell entry and proliferation. Inhibitors of Mpro, RdRp, and host proteases function in the stage of proliferation, whereas our molecular blockers can inhibit the stage of cell entry. Hence, it is expected that combinations of RBD molecular blockers and other inhibitors will enhance the anti-viral effect. Cytokine release syndrome is the major cause of fatal outcomes in severe COVID-19 patients [40, 41]. Therefore, in addition to targets of SARS-CoV-2, host-directed agents, such as melatonin [42] and other anti-inflammatory agents, can also be considered in multi-target combination therapy.

In addition to designing RBD blockers that directly disrupt the binding between ACE2 and S protein, another approach is to identify allosteric binding sites and discover allosteric modulators, such as toremifene, that indirectly destabilize the binding between the two proteins [43]. Recently, many advances have been made in the design of vaccine epitopes [44, 45], monoclonal antibodies [46, 47], and peptides [48] that target the S protein. Small molecular blockers that also bind to the epitopes of the spike are promising for COVID-19 treatment.

So far, 930 naturally occurring missense mutations in the SARS-CoV-2 S protein have been reported in the GISAID database. Four common mutation sites among them are located on the interface between S protein and ACE2: K417N, L452R, E484K/Q, and N501Y (https://bigd.big.ac.cn/ncov/). Several studies have shown that these mutations allow escape from vaccines, monoclonal antibodies, and convalescent plasma [49,50,51,52]. In accordance with our predicted binding mode of H69C2 to S protein, L452 and E484 were not involved in the interaction with H69C2, and thus mutations at these positions may not affect the binding of H69C2. We performed residue scanning simulations using Schrödinger 2021-1 with the OPLS4 force field to evaluate the effect of K417N and N501Y on the binding affinities of ACE2 and H69C2 to the RBD of the S protein. The K417N mutation may minimally affect the binding between ACE2 and RBD (\(\Delta\Delta G = \Delta G_{\mathrm{RBD_{K417N}-ACE2}} - \Delta G_{\mathrm{RBD_{WT}-ACE2}} = 0.16\)), whereas N501Y increased their binding to a certain extent (\(\Delta\Delta G = \Delta G_{\mathrm{RBD_{N501Y}-ACE2}} - \Delta G_{\mathrm{RBD_{WT}-ACE2}} = -1.42\)). These computationally obtained observations agreed well with an experimental study [53]. Similar simulations were performed to evaluate the binding affinities between the RBD mutants and H69C2. Interestingly, both mutations increased the predicted binding between H69C2 and RBD (\(\Delta\Delta G = \Delta G_{\mathrm{RBD_{K417N}-H69C2}} - \Delta G_{\mathrm{RBD_{WT}-H69C2}} = -1.60\) for K417N and \(\Delta\Delta G = \Delta G_{\mathrm{RBD_{N501Y}-H69C2}} - \Delta G_{\mathrm{RBD_{WT}-H69C2}} = -5.67\) for N501Y). Hence, H69C2 is predicted to remain effective against these mutants.
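A free-energy change can be translated into an approximate fold change in affinity through the Boltzmann factor exp(-ΔΔG/RT). The Python sketch below assumes that the ΔΔG values are in kcal/mol and uses a temperature of 298.15 K; neither is stated in the text, so both are assumptions, and computed ΔΔG values from residue scanning are only indicative of the magnitude of the effect.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def affinity_fold_change(ddg_kcal_per_mol, temperature_k=298.15):
    """Approximate fold change in affinity; values above 1 mean tighter binding."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return math.exp(-ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


# ΔΔG values from the text (assumed kcal/mol) for ACE2 binding to the RBD mutants.
fold_k417n_ace2 = affinity_fold_change(0.16)
fold_n501y_ace2 = affinity_fold_change(-1.42)
```

Under these assumptions, a ΔΔG of 0.16 corresponds to a small loss of affinity, whereas a ΔΔG of -1.42 corresponds to roughly a ten-fold gain.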

This study also predicted the ADMET properties of the identified natural products using the admetSAR webserver [54]. As shown in Table 1, H69C2, H69D1, and CDCP were predicted to be absorbed by the intestines. MCCS-B was predicted to pass the blood–brain barrier. In terms of metabolic properties, CDCP was predicted to be metabolized more readily than the other products. H69C2 was predicted to inhibit P-glycoprotein and CYP3A4. Unfortunately, H69C2 and H69D1 might show Ames mutagenicity and hepatotoxicity, which indicates that they should be chemically modified.

Table 1 Predicted ADMET properties of identified natural products.

Overall, this study computationally identified critical druggable binding sites on the interface between ACE2 and S protein and identified five potential small molecular blockers that target this site to disrupt the binding of these two proteins, which provides a type of chemical scaffold for further optimization and evaluation.