Conserved interactions required for inhibition of the main protease of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)

The COVID-19 pandemic caused by SARS-CoV-2 requires the rapid development of antiviral drugs. The SARS-CoV-2 main protease (Mpro, also called 3C-like protease, 3CLpro) is a potential target for drug design. Crystal and co-crystal structures of the SARS-CoV-2 Mpro have been solved, enabling the rational design of inhibitory compounds. In this study we analyzed the available SARS-CoV-2 crystal structures and those of the highly similar SARS-CoV-1. Within the active site of the Mpro we identified, in addition to the inhibitory ligands' interaction with the catalytic C145, two key H-bond interactions with the conserved H163 and E166 residues. Both H-bond interactions are present in almost all co-crystals and are likely to occur also during cleavage of the viral polypeptide, as suggested by docking of the Mpro cleavage recognition sequence. We screened in silico a library of 6900 FDA-approved drugs (ChEMBL), filtered the docking poses using these key interactions, and selected 29 non-covalent compounds predicted to bind to the protease. An additional screen using DOCKovalent was carried out on the DrugBank library (11,414 experimental and approved drugs) and yielded 6 covalent compounds. The selected compounds from both screens were tested in vitro in a protease activity inhibition assay. Two compounds showed activity in the 50 µM concentration range. Our analysis and findings can facilitate and focus the development of highly potent inhibitors against SARS-CoV-2 infection.

Figure 1. Structural overview of the main protease homodimer of SARS-CoV-2 and its binding site. (A) Surface topology of the SARS-CoV-2 Mpro homodimer in complex with the covalent α-ketoamide inhibitor (PDB structure 6Y2F). The two monomers are colored in blue and purple and the inhibitors are represented in gray. (B) Superimposition of SARS-CoV-2 Mpro (6W63, shown as ribbon and colored in green) and SARS-CoV-1 Mpro (4MDS, shown as ribbon and colored in gray) in complex with their non-covalent inhibitors X77 (shown as sticks and colored in cyan) and ML300 (shown as sticks and colored in black), respectively. The catalytic residues H41 and C145 are shown in sticks. The residue that differs between the two proteins, S46 in SARS-CoV-2 and A46 in SARS-CoV-1, is shown in sticks. (C) Magnified view of the binding site in (B). (D) Superimposition of the most diverse structures of SARS-CoV-2 and SARS-CoV-1 (available at that time), shown as ribbons: SARS-CoV-1, 2ZU5 (gray); SARS-CoV-2, 5R80 (purple); SARS-CoV-2, 6LU7 (pink); SARS-CoV-2, 6M03 (red); SARS-CoV-2, 6Y2F (orange). The residues Q189, M49 and N142 within this site and the catalytic residues H41 and C145 are represented in sticks. (E) Magnified view of (D). All images were drawn using the Maestro software (https://www.schrodinger.com/maestro).
In this study, we analyzed the available SARS-CoV-1 and SARS-CoV-2 Mpro co-crystal structures and the inhibitory compounds developed for SARS-CoV-1, and identified key interactions required to identify and develop an active inhibitor of the main protease. We conducted a virtual screen of a library containing only FDA-approved drugs against the SARS-CoV-2 Mpro structure from the Protein Data Bank (PDB) [6W63] 13 using three docking software tools (GOLD 36, Glide [37][38][39] and DOCKovalent 40). Several compounds were selected and tested in vitro using a protease inhibition assay.

Results
Analysis of co-crystals flexibility. To assess the flexibility of the Mpro binding site, we superimposed the SARS-CoV-1 and SARS-CoV-2 apo and co-crystal structures available in the PDB at the time of our study (Table 1). We selected the five structures that differed most from one another, in terms of root-mean-square deviation (RMSD), within the 3D space of the binding site. The selected structures were 2ZU5, 5R80, 6LU7, 6M03 and 6Y2F. Three flexible residues within the binding site varied in position between the different structures: Glutamine 189, Methionine 49 and Asparagine 142 (Fig. 1D,E).
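The RMSD-based selection described above can be illustrated with a short sketch. It assumes that the binding-site atoms have already been superimposed and are listed in the same order in every structure. The greedy max-min selection and the toy coordinates are assumptions made for illustration only; the text does not state which selection procedure was used.

```python
import math
from itertools import combinations

def rmsd(a, b):
    """RMSD between two equally long lists of (x, y, z) points.

    Assumes the two coordinate sets are already superimposed and that
    atom i in `a` corresponds to atom i in `b`.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and equally long")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))

def most_diverse(structures, k):
    """Greedy max-min pick of k names from {name: coords} (hypothetical helper)."""
    names = sorted(structures)
    if k < 2:
        raise ValueError("k must be at least 2")
    if k >= len(names):
        return names
    dist = {frozenset(pair): rmsd(structures[pair[0]], structures[pair[1]])
            for pair in combinations(names, 2)}
    # Seed with the pair of structures that are farthest apart.
    chosen = list(max(combinations(names, 2), key=lambda pair: dist[frozenset(pair)]))
    while len(chosen) < k:
        rest = [n for n in names if n not in chosen]
        # Add the structure whose nearest already-chosen neighbour is farthest away.
        chosen.append(max(rest, key=lambda n: min(dist[frozenset((n, c))] for c in chosen)))
    return chosen

# Toy coordinates (two atoms per structure), not taken from any PDB entry.
toy = {
    "A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "B": [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0)],
    "C": [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
}
print(most_diverse(toy, 2))
```

With real data, the coordinate lists would hold the binding-site atoms of each superimposed PDB entry and `k` would be 5.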

Covalent and non-covalent co-crystal interactions.
To find the essential interactions required for inhibition of the SARS-CoV-2 Mpro, we analyzed the interactions observed with both covalent and non-covalent inhibitors (Table 1). Most of the co-crystals for SARS-CoV-1 and SARS-CoV-2 contain covalent inhibitors (32 structures). To date, only 6 co-crystals contain non-covalent inhibitors.
Analyzing the co-crystal interactions of the non-covalent inhibitors revealed that the two SARS- […]
All covalent compounds interact with the catalytic C145 in the co-crystals. Interestingly, most (31 out of 32) also form a non-covalent interaction, a H-bond with H163, similarly to the non-covalent compounds. All of the covalent and non-covalent inhibitors present a H-bond acceptor to the side chain imidazole ring of H163 (see for example Fig. 2A).
Docking of the cleavage recognition sequence. Mpro catalyzes cleavage between Glutamine and Serine within the viral polypeptides. To characterize the interactions required for cleavage, we analyzed two co-crystals with peptidomimetic inhibitors [2A5I, 3VB5]. In these structures, the side chain of the catalytic C145 binds the backbone of the peptide Serine. The catalytic C145 side chain is rigid in all co-crystals except for [2A5I], in which the side chain adopts a unique conformation. Interestingly, the Glutamine side chain of the peptide cleavage site is anchored by a H-bond interaction with the H163 imidazole (Fig. 2E).
To characterize the interactions of the cleavage recognition sequence peptide we chose, based on the peptidomimetic inhibitors, the following sequence: Ala, Val, Leu, Gln, Ser, Ala, Gly. We docked the recognition sequence peptide (using Glide) to the SARS-CoV-2 6LU7 crystal structure and superimposed the two co-crystals containing the peptides [2A5I, 3VB5]. The Glutamine within the recognition sequence adopted the same conformation as in the two co-crystals, presenting the same H-bond interaction with the H163 imidazole (Fig. 2F). In addition, the backbones of the peptide's Valine and Alanine interact with the E166 backbone (through water molecules).
[…] were present in all compounds tested (see for example a few known inhibitors in Fig. 3).
In summary, the two hydrophilic interactions with H163 and the E166 backbone exist in most of the covalent and non-covalent co-crystal ligands, and all of these co-crystals show at least one of these interactions. The known inhibitors show the same pattern of interactions, and these interactions seem to play a role in the binding of the recognition sequence, which highlights them as biologically significant. Therefore, these interactions were chosen as filtering criteria in the screening process, so that only poses satisfying at least one of the two interactions passed on to further analysis.
Table 1. SARS-CoV-1/2 Mpro covalent and non-covalent co-crystal PDB structures. All PDB structures used in this work are listed, together with a summary of their interactions with the two key residues H163 and the E166 backbone (bb). The known activity from the literature is given when available, as either the half-maximal inhibitory concentration (IC50) or the inhibitory constant (Ki). * represents an interaction through a water molecule. ** introduces a donor to the imidazole. th represents a theoretical interaction through a water molecule, although the water molecule is not present in the structure.
www.nature.com/scientificreports/
In addition, the Schrödinger SiteMap tool 42,43 identified two hydrophobic regions in the vicinity of the binding site, and we found that most of the covalent and non-covalent co-crystal ligands and the known inhibitors introduced hydrophobic moieties within those regions.
We filtered the poses based on the two significant interactions identified in our analysis: the H-bond with the H163 imidazole and the H-bond with the E166 backbone amine (see the "Materials and Methods" section for details). We chose the best docking poses that satisfied either one or both of these interactions, resulting in at most three poses for each compound. This stage resulted in 2993 unique compounds poses in GOLD and 1969 unique compounds in Glide.
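The interaction-based filter can be sketched as follows. This is a minimal illustration, not the pipeline used in the study. It assumes that each docking pose has already been annotated with two Boolean flags (H-bond to the H163 imidazole, H-bond to the E166 backbone), and it interprets "at most three poses for each compound" as the best-scoring pose for each of the three possible interaction patterns (H163 only, E166 only, both). The `Pose` class, the field names and the toy scores are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pose:
    compound: str
    score: float         # docking score as reported by the docking tool
    hbond_h163: bool     # H-bond to the H163 imidazole detected
    hbond_e166_bb: bool  # H-bond to the E166 backbone detected

def filter_poses(poses, higher_is_better=True):
    """Keep, per compound, the best pose for each satisfied interaction pattern.

    Poses with neither key interaction are discarded, so each compound
    retains at most three poses. Set higher_is_better=False for scoring
    schemes in which more negative scores are better.
    """
    best = {}
    for pose in poses:
        pattern = (pose.hbond_h163, pose.hbond_e166_bb)
        if not any(pattern):
            continue  # neither key interaction present
        key = (pose.compound, pattern)
        current = best.get(key)
        if current is None:
            best[key] = pose
        elif higher_is_better and pose.score > current.score:
            best[key] = pose
        elif not higher_is_better and pose.score < current.score:
            best[key] = pose
    result = {}
    for (compound, _pattern), pose in best.items():
        result.setdefault(compound, []).append(pose)
    return result

# Toy poses with made-up scores.
poses = [
    Pose("cmpd1", 60.0, True, False),
    Pose("cmpd1", 72.5, True, False),
    Pose("cmpd1", 55.0, True, True),
    Pose("cmpd2", 80.0, False, False),
]
kept = filter_poses(poses)
print({name: [p.score for p in ps] for name, ps in kept.items()})
```

In this toy run `cmpd2` is dropped despite its high score because its only pose forms neither key interaction, which is the behaviour the filtering criteria are meant to enforce.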
We manually selected compounds from the filtered poses, resulting in 21 compounds from GOLD and 13 from Glide. Altogether, 29 unique compounds (4 of which were selected by both methods) were sent for assessment using the protease inhibition assay. One compound selected by the GOLD software, GSK-256066, showed 37% inhibition at a concentration of 50 µM (Supplementary Table 1).
Covalent docking using DOCKovalent. Several covalent docking tools were developed at Nir London's lab at the Weizmann Institute 40. As very few known drugs can bind covalently, we used preclinical and clinical compounds from the DrugBank database 45. This database was filtered to contain only compounds with covalent warheads that can be docked using DOCKovalent (see "Materials and Methods"), and these compounds were docked to the following PDB structures: [6M03, 5R7Y, 5R7Z, 6Y2F, 6W63, 4MDS, 2GX4, 6LU7]. The docked compounds were visually inspected and we selected those that showed interactions in addition to the covalent interaction with C145. We tested 5 nitriles and one Michael acceptor; two of the nitriles (bicalutamide and ruxolitinib) showed 36% and 20% inhibition at 50 μM, respectively (Supplementary Table 2).
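The inhibition percentages quoted in this section express the relative loss of protease activity in the presence of a compound. A minimal sketch of that arithmetic is given below, assuming that activity is read out as a reaction rate with and without compound and that an optional background rate is subtracted. The function name and the numbers are illustrative and are not taken from the assay data.

```python
def percent_inhibition(rate_with_compound, rate_control, rate_background=0.0):
    """Percent inhibition relative to an uninhibited control.

    All three rates must be in the same units. The background rate
    (for example, a no-enzyme well) is subtracted from both readings.
    """
    span = rate_control - rate_background
    if span <= 0:
        raise ValueError("control rate must exceed background rate")
    return 100.0 * (1.0 - (rate_with_compound - rate_background) / span)

# Illustrative numbers only: a compound that leaves 63% of the control activity.
print(round(percent_inhibition(63.0, 100.0), 1))
```

A single-concentration value of this kind ranks compounds at that concentration only; it does not by itself give an IC50.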

Discussion
Antiviral drugs targeting the Mpro of SARS-CoV-2 could support the fight against the global COVID-19 pandemic. Here, to identify possible inhibitors of the SARS-CoV-2 Mpro, we explored the co-crystal structures of the Mpro proteins of SARS-CoV-2 and SARS-CoV-1. We identified two common interactions, involving H163 and E166, that appeared in most co-crystals. We screened drug databases in silico for covalent and non-covalent compounds. Candidate compounds were then tested in a protease inhibition assay, and we found several compounds that reduce protease activity by more than 30%.
The Mpro protein sequence of SARS-CoV-2 is highly similar (99%) to SARS-CoV-1. In the region of the binding site only one residue is different. Some studies suggested that the differences between the two proteins affected the ability to bind inhibitors 46,47 . On the other hand, several studies and our protease inhibition assay show that inhibitors identified for SARS-CoV-1 Mpro also inhibit SARS-CoV-2 Mpro (see Supplementary Table 3).

Furthermore, co-crystals of SARS-CoV-1 [2MAQ] and SARS-CoV-2 [6LU7] Mpro with the identical inhibitor (N3) show similar interactions with the protease binding site 12. Thus, we inferred that ligand binding to the binding sites of both viral proteases is comparable, and we were therefore able to analyze the key interactions based on co-crystals obtained from both viruses.
We found that all co-crystals have at least one of two key interactions, with H163 and E166. Docking of the recognition sequence peptide into the binding site revealed that H163 and E166 form H-bonds with the peptide. Specifically, the imidazole ring of H163 interacts with the conserved Glutamine of the cleavage site 11, while E166 interacts with the Alanine and Valine of the recognition sequence. Interestingly, the E166 side chain interacts with the NH2-terminal Serine 1 of the second monomer 11,48. This salt bridge interaction minimizes the conformational flexibility of the E166 backbone and assists in generating the correct orientation of the substrate binding site, which explains the importance of dimerization for the catalytic activity 48. H163 and E166 are conserved among all human coronaviruses (2 alpha- and 5 beta-coronaviruses; Fig. 4), unlike H164 and Q189, which were previously identified as important interaction partners of several inhibitors 12,23. Thus, drugs developed to interact with these amino acids may be effective against all human coronaviruses and could potentially prevent the emergence of viral resistance.
Riva et al. 5 identified 100 clinical compounds that reduced viral replication by at least 40%. Most of these compounds are preclinical or below phase 2 and thus were not included in our focused screen; only seven compounds passed our filtering criteria. In our in silico screening process, Beclabuvir did not pass the docking. The other six compounds (Chloroquine, Tamibarotene, Mardepodect, Tretinoin, Apilimod, JNJ-42165279) were shown to interact with H163, E166 or both. These compounds further support the importance of the H163 and E166 interactions for inhibition of SARS-CoV-2 Mpro.
In addition, several attempts to identify inhibitors of SARS-CoV-2 Mpro in silico have already been published [49][50][51][52][53] […] in the identified co-crystals flexibility residues M49, Q189 and N142 (Fig. 1D,E). For the covalent docking we used seven different crystal structures (see Results) to allow more flexibility in the binding site.
Our two screening analyses resulted in two clinically approved drugs that inhibit the Mpro by over 30% at 50 μM. The first is GSK-256066, a phosphodiesterase (PDE) 4 inhibitor 54 that was under development in phase 2 for the treatment of chronic obstructive pulmonary disease (COPD), asthma and seasonal allergic rhinitis. It is administered as an inhalation formulation (powder) and as an intranasal formulation (nasal spray suspension). Our model suggests that GSK-256066 forms a H-bond with H163 and two additional H-bonds with the amine and carbonyl of the E166 backbone (Fig. 5). It inhibits the Mpro by 37% at a concentration of 50 μM. The second drug that showed inhibition of the Mpro is bicalutamide, which was selected from the covalent screening. It contains an aryl nitrile that can covalently bind to the protein. Bicalutamide is an oral non-steroidal anti-androgen used for prostate cancer that binds to the androgen receptor. It is a 50:50 racemic mixture of the (R)- and (S)-enantiomers. Our model suggests that its nitrile group covalently binds to C145 and that it forms two H-bonds with the amine and carbonyl of the E166 backbone (Fig. 5). Bicalutamide was tested in two experiments and inhibited Mpro by 37% and 33% at a concentration of 50 μM.
Several compounds that were previously identified as inhibitors with sub-micromolar potency were active in our protease inhibition assay (Supplementary Table 3). Two of these inhibitors with known sub-micromolar activity showed limited inhibition (39% and 9%) at a concentration of 50 μM in our protease activity assay (Supplementary Table 3). Thus, GSK-256066 and bicalutamide, which were identified in our protease inhibition assay, have a similar inhibitory activity at the same concentration. These results suggest that more assays should be conducted to test the repurposing of these drugs as anti-SARS therapeutics.
In conclusion, our analysis of the structural constraints required for the inhibition of SARS-CoV-2 Mpro has suggested key interactions with several amino acids in the active pocket of the protein. We were able to identify several approved drugs with the potential to inhibit Mpro activity; however, the inhibition observed in our experimental assay suggests that these compounds need to be chemically modified before they can be considered as potential treatments. In addition, our analysis could be used for further virtual screening of larger compound databases or for rational drug development.

Materials and methods
Protein data bank (PDB) search. The Protein Data Bank was searched for SARS-CoV/SARS-CoV-1/SARS-CoV-2 Mpro. Non-SARS-CoV structures and non-human SARS-CoV-like structures were omitted. Co-crystals with bound fragments were not included in this analysis because fragments are not drug-like. We acknowledge that a few of the available structures may have been overlooked using these search criteria. All PDB structures found and analyzed are listed in Table 1.
Selection. In our manual selection we preferred ligands that, in addition to one or both of the important interactions (H163 and E166), also formed interactions with additional residues that were found in the co-crystal structures (for example the Gly143 backbone). In addition, we favored compounds that did not violate the two hydrophobic regions within the binding site as calculated by Maestro's SiteMap tool (Schrödinger Release 2020-1: SiteMap, Schrödinger, LLC, New York, NY, 2020 42,43).
The Mpro inhibition assay 56 was carried out in the Mantoux Bioinformatics institute of the Nancy and Stephen Grand Israel National Center for Personalized Medicine (INCPM), Weizmann Institute of Science (as part of the COVID Moonshot initiative, https://covid.postera.ai/covid/activity_data).