In December 2019, numerous cases of pneumonia were reported in Wuhan, Hubei Province1,2,3, among which 19 confirmed cases and 39 imported cases were identified. The cause was identified as a novel coronavirus (SARS-CoV-2), the agent of coronavirus disease 2019 (COVID-19), which is closely related to severe acute respiratory syndrome coronavirus (SARS-CoV)4. By early March 2020, 88,913 cases of COVID-19 had been reported worldwide, 90% of them in China5. Outside China, 8,739 cases had been reported to WHO from 61 countries, resulting in 127 deaths5. The Republic of Korea had reported more than 4,200 cases and 22 deaths, accounting for more than half of the cases of COVID-19 reported outside China5. To contain this outbreak, effective therapeutic drugs need to be identified as quickly as possible6.

The main protease (Mpro) of SARS-CoV-2 is emerging as a promising therapeutic target. This non-structural coronavirus protein is responsible for processing the polyprotein translated from viral RNA7. Mpro inhibitors have been confirmed to inhibit viral replication in SARS-CoV8. The SARS-CoV-2 Mpro sequence is highly conserved relative to SARS-CoV Mpro (Fig. 1): when aligned, the two show a sequence identity of 96%, and only the A46S mutation is located at the inhibitor binding site. Although no effective antivirals or vaccines against COVID-19 have been reported so far, peptide-like HIV-1 protease inhibitors such as lopinavir and ritonavir have been reported to be effective against SARS-CoV Mpro8,9. Clinical trials of these repurposed HIV protease inhibitors for COVID-19 have already been launched (e.g. ChiCTR2000029603, 2/6/20)10. However, the mechanism by which such inhibitors act on SARS-CoV-2 Mpro remains unknown at the atomic level. Understanding this mechanism at atomic resolution may provide insights for more rational drug design11 and may decrease the risk of future drug resistance12.

Figure 1

Alignment of the main protease sequences and X-ray structures of SARS-CoV and SARS-CoV-2. Pairwise alignment gave a sequence identity of 96%. The green stick model in (B) indicates the inhibitor binding site, and the sphere models indicate residues that are not conserved between the two sequences. (A) Pairwise alignment of SARS-CoV Mpro (upper sequence) and SARS-CoV-2 Mpro (lower sequence). (B) Structural alignment of SARS-CoV Mpro (PDB ID: 2A5I, red ribbon) and SARS-CoV-2 Mpro (PDB ID: 6LU7, orange ribbon).
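As an illustration of how a sequence identity such as the 96% quoted above can be computed from a pairwise alignment, the following is a minimal sketch. It assumes the two sequences are already aligned and gap-padded to equal length, and it counts identity over columns where neither sequence has a gap; other conventions for the denominator exist, and the text does not state which one was used. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not the Mpro sequences): 9 of 10 columns match.
toy_a = "ACDEFGHIKL"
toy_b = "ACDEFGHIKV"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")  # 90.0% identity
```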

Computational methods are commonly used for structure-based drug discovery (SBDD) and ligand-based drug discovery (LBDD)13,14,15,16,17,18. LBDD is a technique for searching for and designing new drugs based on experimental and structural information about known compounds19,20. SBDD, on the other hand, is based on the tertiary structural information of the target protein21. This study focused on SBDD to gain three-dimensional insight into target binding. Pharmacophore modeling is an LBDD technique used to discover the common features that ligands need in order to bind to the target protein17. Molecular dynamics (MD) simulation, in which the dynamics of biopolymers in solution can be analyzed at the atomic level, is a typical SBDD method used to predict the interaction between proteins and inhibitors22,23,24,25,26. MD simulation is based on Newton's equation of motion and has been applied to biomolecules such as proteins, nucleic acids, and lipid membranes27,28,29,30. Recent studies have shown that MD simulations can clarify the binding mechanism between proteins and compounds at the molecular level, which is highly useful for rational drug design22,23,24,31,32,33,34. Fortunately, many complex structures of SARS-CoV Mpro with inhibitors have already been determined and are available in the Protein Data Bank35. Therefore, by modeling the complex structures of SARS-CoV-2 Mpro and inhibitors using the known structures of SARS-CoV Mpro and peptide-like inhibitors, it is possible to analyze the characteristics of the functional groups required for molecular recognition of ligands by SARS-CoV-2 Mpro.
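To make concrete what integrating Newton's equation of motion means, the sketch below applies the velocity Verlet scheme, a standard MD integrator, to a single particle on a harmonic spring. This is a toy illustration in arbitrary units; the text does not state which integrator the MD engine uses, and the function name and parameters are hypothetical.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * x'' = force(x) for one particle in 1D; return the final (x, v)."""
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
    return x, v


# Toy system: harmonic spring F = -k x, started at rest at x = 1 (arbitrary units).
k, m = 1.0, 1.0
x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, lambda x: -k * x, m, dt=0.01, n_steps=1000)

energy0 = 0.5 * m * v0 ** 2 + 0.5 * k * x0 ** 2
energy1 = 0.5 * m * v1 ** 2 + 0.5 * k * x1 ** 2
print(f"energy drift: {abs(energy1 - energy0):.2e}")
print(f"position error vs analytic cos(t): {abs(x1 - math.cos(10.0)):.2e}")
```

In a biomolecular simulation the same update is applied to every atom in three dimensions, with forces derived from the force field rather than from a single spring.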

In the present study, we used pharmacophore modeling and MD simulations to identify the interactions that are important for potential anti-coronavirus drugs to bind to SARS-CoV-2 Mpro. Based on pharmacophore modeling, three SARS-CoV-2 Mpro inhibitor candidates were selected, and SARS-CoV-2 Mpro-inhibitor complex models were built. Subsequently, we conducted MD simulations of these complex models and used interaction analysis to predict the key characteristics of the functional groups required for molecular recognition by SARS-CoV-2 Mpro.


Protein preparation and pharmacophore modeling

X-ray structures (2A5I, 2OP9, 6LU7) were downloaded from the Protein Data Bank (PDB). Bond orders were assigned and hydrogens were added using Maestro36. Suitable ionization states of each ligand were generated by Epik37 at pH 7.0 ± 2.0. Hydrogen bond optimization was performed using PROPKA38, and energy minimization calculations were conducted with Maestro using the OPLS3e force field39. Using the “protein structure alignment” tool in Maestro, all SARS-CoV Mpro structures were aligned to the SARS-CoV-2 Mpro structure (PDB ID: 6LU7) so as to minimize the RMSD over alpha carbons. The pharmacophore was extracted by Phase40,41 using the conformation of the inhibitor in the structure of SARS-CoV Mpro. After the pharmacophore model was constructed, the protein of each SARS-CoV Mpro-inhibitor complex superimposed on SARS-CoV-2 Mpro was deleted, and the structures of the inhibitor and SARS-CoV-2 Mpro were merged. Indinavir was aligned to the pharmacophore model, and the aligned indinavir and SARS-CoV-2 Mpro structures were merged. Each merged structure was subjected to hydrogen bond optimization and energy minimization. These structures were used as the initial structures for MD simulation.
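The superposition step above, aligning one structure to another so that the RMSD over paired alpha carbons is minimized, is commonly solved with the Kabsch algorithm. The sketch below is a generic NumPy version run on synthetic coordinates; it is not the Maestro implementation, it assumes the residue pairing is already known, and the function and variable names are hypothetical.

```python
import numpy as np


def kabsch_superpose(mobile: np.ndarray, target: np.ndarray):
    """Return (R, t, rmsd) such that R @ mobile_i + t best matches target_i.

    Both arrays have shape (N, 3), and rows must be paired atom by atom.
    """
    mob_center = mobile.mean(axis=0)
    tgt_center = target.mean(axis=0)
    P = mobile - mob_center
    Q = target - tgt_center
    H = P.T @ Q                                  # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_center - R @ mob_center
    fitted = (R @ mobile.T).T + t
    rmsd = float(np.sqrt(np.mean(np.sum((fitted - target) ** 2, axis=1))))
    return R, t, rmsd


# Synthetic check: rotate and shift random "C-alpha" points, then recover the fit.
rng = np.random.default_rng(0)
target = rng.normal(size=(50, 3))
theta = np.deg2rad(30.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
mobile = (rot_z @ target.T).T + np.array([5.0, -2.0, 1.0])
R, t, rmsd = kabsch_superpose(mobile, target)
print(f"RMSD after superposition: {rmsd:.2e} Å")
```

Applying the same R and t to the ligand coordinates of the mobile complex would place the inhibitor in the frame of the target protein, which is the idea behind merging the inhibitor with SARS-CoV-2 Mpro.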

MD simulation

MD simulations for interaction analysis were performed using Desmond42. Each inhibitor-SARS-CoV-2 Mpro complex model was placed in an orthorhombic box with a buffer distance of 10 Å to create a hydration model, using the TIP3P water model43. The cut-off radius for van der Waals and electrostatic interactions, the time step, the initial temperature, and the pressure of the system were set to 9 Å, 2.0 fs, 300 K, and 1.01325 bar, respectively. The sampling interval during the simulation was set to 50 ps. Finally, we performed MD simulations under the NPT ensemble for 1 μs using the OPLS3e force field. Following the MD simulations, the “Simulation Interactions Diagram” tool in Maestro was used to perform an interaction analysis between Mpro and each inhibitor. Images of simulated proteins and ligands were generated using Maestro36.
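The settings above imply a certain number of integration steps and saved frames per trajectory. The sketch below collects the stated values in a plain Python record and derives those counts. It is only bookkeeping, not Desmond configuration syntax; the class and field names are hypothetical, and whether the initial frame is counted in addition is not specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDSettings:
    """Simulation parameters as stated in the text (field names are illustrative)."""
    total_time_ns: float = 1000.0       # 1 μs production run
    time_step_fs: float = 2.0
    sampling_interval_ps: float = 50.0
    cutoff_angstrom: float = 9.0
    box_buffer_angstrom: float = 10.0
    temperature_k: float = 300.0
    pressure_bar: float = 1.01325
    water_model: str = "TIP3P"
    force_field: str = "OPLS3e"
    ensemble: str = "NPT"

    def n_steps(self) -> int:
        """Number of integration steps (1 ns = 1e6 fs)."""
        return round(self.total_time_ns * 1e6 / self.time_step_fs)

    def n_frames(self) -> int:
        """Number of sampled frames, excluding any initial frame (1 ns = 1e3 ps)."""
        return round(self.total_time_ns * 1e3 / self.sampling_interval_ps)


settings = MDSettings()
print(settings.n_steps())   # 500000000
print(settings.n_frames())  # 20000
```

With these values, an interaction probability of 30% corresponds to a contact being present in roughly 6,000 of the 20,000 sampled frames.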


Structure alignment and pharmacophore modeling

To construct a SARS-CoV-2 Mpro-inhibitor model, we performed structure alignment between the SARS-CoV Mpro-inhibitor complex structures and the SARS-CoV-2 Mpro structure. Figure 2A shows the SARS-CoV Mpro inhibitors, namely the 2A5I ligand and the 2OP9 ligand, aligned with the pharmacophore model, which indicates the functional group features common to these inhibitors. Figure 2B shows the positional relationship of the pharmacophore features.

Figure 2

Pharmacophore model constructed from the SARS-CoV Mpro-inhibitor complex structures. Four features of inhibitors that bind to SARS-CoV Mpro were extracted. Blue spheres indicate H-bond donors (HBD), and red spheres indicate H-bond acceptors (HBA). (A) Alignment of the pharmacophore model with each peptide-like inhibitor (gray stick model: 2A5I ligand, green stick model: 2OP9 ligand, blue stick model: indinavir). (B) Details of the positional relationship of the pharmacophore features (purple numbers: distances between pharmacophore features (Å), green numbers: angles between pharmacophore features). (C) Amino acid residues of SARS-CoV Mpro (PDB ID: 2A5I) around the pharmacophore model (His41-Donor sphere: 3.58 Å, Gly143-Acceptor sphere: 3.16 Å, Cys145-Acceptor sphere: 3.12 Å, Glu166-Acceptor sphere: 3.37 Å, Gln189-Donor sphere: 1.72 Å).

Using the Phase software, two pharmacophore candidates that were common among the three ligands and had four pharmacophore points were obtained (Fig. S1). These candidates had the same interactions but slightly different 3D coordinates. This is because the Phase algorithm initially develops a pharmacophore from a single reference ligand, and the two candidates were developed from different reference ligands. The pharmacophore that better fits the other active ligands was chosen using (1) the root-mean-square deviation (RMSD) of the pharmacophore point positions and (2) the cosine of the angles formed by corresponding pairs of donor/acceptor features. The total “screen scores” (higher is better) over the three active ligands were 5.34 and 5.07 for the two candidates, respectively. The structural alignment and the pharmacophore model revealed that these inhibitors have two H-bond donor (HBD) functional groups and two H-bond acceptor (HBA) functional groups as common features. These features are located on the carbonyl oxygen atoms and the amine groups that form the peptide bonds in the backbone of the peptide-like inhibitors. The blue stick molecule in Fig. 2A indicates the predicted conformation of indinavir fitted to the pharmacophore model. Indinavir fits all four pharmacophore features built from the SARS-CoV Mpro inhibitors.
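The two geometric quantities named above, the RMSD of matched pharmacophore point positions and the cosine between corresponding donor/acceptor direction vectors, can be sketched as follows. This shows only the two ingredients; how Phase weights and combines them into its screen score is not reproduced here, and the function names and coordinates are hypothetical.

```python
import math


def site_rmsd(ref_points, fit_points):
    """RMSD (Å) between matched pharmacophore point positions."""
    if len(ref_points) != len(fit_points) or not ref_points:
        raise ValueError("point lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(ref_points, fit_points))
    return math.sqrt(squared / len(ref_points))


def vector_cosine(u, v):
    """Cosine of the angle between two donor/acceptor direction vectors."""
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0.0:
        raise ValueError("direction vectors must be non-zero")
    return sum(a * b for a, b in zip(u, v)) / norm


# Toy four-point pharmacophore and a ligand whose features sit 0.3 Å off along x.
reference = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
ligand = [(x + 0.3, y, z) for x, y, z in reference]
print(f"site RMSD: {site_rmsd(reference, ligand):.2f} Å")
print(f"vector cosine: {vector_cosine((1.0, 0.0, 0.0), (1.0, 1.0, 0.0)):.3f}")
```

A lower site RMSD and a cosine closer to 1 both indicate a better match between a ligand and a candidate pharmacophore.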

Figure 2C shows the amino acid residues around the chemical groups defined as the pharmacophore. His41 and Gln189 are adjacent to the HBD spheres, and Gly143, Ser144, Cys145, and Glu166 are adjacent to the HBA spheres. The side chain of His41 is located where the lone pair of a nitrogen atom on the imidazole ring can contact the donor sphere. The carbonyl oxygen in the side chain of Gln189 is also located near a donor sphere. These residues may form hydrogen bonds with the HBD groups located on the donor spheres. On the other hand, one HBA sphere is located near the main chain of Gly143, Ser144, and Cys145, where it has a high affinity for the backbone NH groups. The backbone of Glu166 is also located near an HBA sphere, which enables the NH group of the Glu166 backbone to interact with that sphere. In Fig. 2C, the distances between His41, Gly143, Cys145, Glu166, and Gln189 and their respective pharmacophore spheres are 3.58 Å, 3.16 Å, 3.12 Å, 3.37 Å, and 1.72 Å, respectively.

Interaction analysis by MD simulation

To clarify the key interactions between SARS-CoV-2 Mpro and drug candidates, we performed 1 μs MD simulations for each of six SARS-CoV-2 Mpro-inhibitor complex models. The complex models were created by superimposing SARS-CoV Mpro onto SARS-CoV-2 Mpro. Protein and ligand RMSD information is presented in Figures S2 and S3, and the root-mean-square fluctuation (RMSF) of each amino acid residue is presented in Figure S4. Except for the amino acid residues at both termini, the maximum RMSF of the complex models is 2.0–2.4 Å (Figure S4A–C). In contrast, the maximum RMSF of the apo form is 3.2 Å (Figure S4D). In the apo form, the fluctuations of the amino acid residues around positions 50, 150, and 270 are large (Figure S4D), and the RMSF values in these regions decrease upon inhibitor binding (Figure S4A–C). Figure 3 shows a 2D summary of the interaction analysis results for three SARS-CoV-2 Mpro-inhibitor complex models. Timeline representations of the interactions and contacts are presented in Figure S5.
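For reference, the RMSF reported above is the root of the time-averaged squared deviation of each atom (or residue) from its mean position. The sketch below computes it with NumPy for a trajectory array that is assumed to be already superposed on a common reference frame; the array layout and function name are assumptions, and this is not the Maestro analysis tool.

```python
import numpy as np


def rmsf_per_atom(traj: np.ndarray) -> np.ndarray:
    """RMSF for each atom.

    traj has shape (n_frames, n_atoms, 3) and must already be superposed
    on a reference so that overall rotation and translation are removed.
    """
    mean_pos = traj.mean(axis=0)                      # (n_atoms, 3) average structure
    sq_dev = np.sum((traj - mean_pos) ** 2, axis=2)   # (n_frames, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))               # (n_atoms,)


# Toy trajectory: atom 0 never moves, atom 1 alternates between x = +1 and x = -1 Å.
traj = np.zeros((4, 2, 3))
traj[0::2, 1, 0] = 1.0
traj[1::2, 1, 0] = -1.0
print(rmsf_per_atom(traj))  # atom 0 -> 0.0 Å, atom 1 -> 1.0 Å
```

Comparing such per-residue profiles between the apo and inhibitor-bound trajectories is what reveals the reduced fluctuation upon binding.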

Figure 3

2D summary of the interaction analysis by MD simulation for each ligand. The figure shows the SARS-CoV-2 Mpro amino acid residues with an interaction probability of over 30% during the MD simulation. Dotted lines indicate interactions between side chains and inhibitors, and solid lines indicate interactions between main chains and inhibitors. (A) Interaction results for the 2A5I ligand. (B) Interaction results for the 2OP9 ligand. (C) Interaction results for indinavir.

In all MD simulations, the interaction with Glu166 had the highest interaction rate. This residue interacted with every ligand for most of each simulation (Figure S5). The 2A5I ligand and indinavir each formed two hydrogen bonds with Glu166. The interaction with His41 was also maintained with a high probability in all MD results (78%, 92%, and 94%), and this residue continued to interact with the inhibitors throughout each simulation (Figure S5). Interactions with His41 were classified into two types: hydrogen bonding and Pi-stacking. Most of the hydrogen bonds formed with His41 were strong. With the 2OP9 ligand and indinavir, hydrogen bonds to Gly143 and Cys145 were observed with a probability of over 50% during the simulation. These interactions form with the main chains of Gly143 and Cys145. Two interactions were observed between the 2OP9 ligand and Cys145, involving the amine group of the Cys145 main chain and the thiol group of the Cys145 side chain. With the 2A5I ligand and the 2OP9 ligand, an interaction between Gln189 and the inhibitor was confirmed with a probability of over 30% during the simulation. The 2A5I ligand, the 2OP9 ligand, and indinavir each have one or two water bridge interactions with a probability of over 30% during the simulation. In particular, water between the 2OP9 ligand and Glu166 forms a water bridge with a probability of 60%. For indinavir, water bridges are formed with Thr190 and Gln192 with a probability of over 70%. Table 1 shows the amino acid residues with an interaction probability of over 30% in each simulation. Interactions with His41, Gly143, Met165, and Glu166 were observed in all MD simulations. The side chain of His41 and the main chains of Gly143 and Glu166 were involved in these interactions, and Met165 forms a van der Waals (vdW) interaction with the inhibitors.
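An interaction probability of the kind reported above can be understood as the fraction of trajectory frames in which a given residue is in contact with the ligand. The sketch below computes this from a per-frame list of contacting residues and keeps those above a threshold; the input format, function name, and toy data are hypothetical and do not reflect the output format of the Maestro tool.

```python
from collections import Counter


def interaction_probabilities(contacts_per_frame, threshold=0.30):
    """Fraction of frames in which each residue contacts the ligand.

    contacts_per_frame: one collection of residue labels per frame.
    Returns only residues whose fraction exceeds the threshold.
    """
    n_frames = len(contacts_per_frame)
    if n_frames == 0:
        raise ValueError("at least one frame is required")
    counts = Counter()
    for frame in contacts_per_frame:
        counts.update(set(frame))  # count each residue at most once per frame
    return {res: n / n_frames for res, n in counts.items() if n / n_frames > threshold}


# Toy data for 10 frames: Glu166 in all frames, His41 in 8, Thr25 in only 2.
frames = [{"Glu166", "His41"}] * 8 + [{"Glu166", "Thr25"}] * 2
print(interaction_probabilities(frames))  # Glu166: 1.0, His41: 0.8; Thr25 is filtered out
```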

Table 1 Amino acid residues with interaction probability of over 30%.

Table 2 shows the interaction probabilities related to the pharmacophore during the 1 μs MD simulations. Gly143-Acceptor and Cys145-Acceptor involve the same pharmacophore point. Among the pharmacophore interactions, His41-Donor and Glu166-Acceptor are highly stable during the MD simulations for all compounds. The other interactions are also relatively stable, except for Cys145-Acceptor of the 2A5I ligand and Gln189-Donor of indinavir.

Table 2 Interaction probabilities related to pharmacophore during 1 μs MD simulation.


In this study, we first modeled a pharmacophore based on the structures of SARS-CoV Mpro bound to peptide-like inhibitors. The main chains of these peptide-like inhibitors shared common features. In Fig. 2C, the SARS-CoV Mpro residues His41, Gly143, Ser144, Cys145, Glu166, and Gln189 are located near the pharmacophore spheres. Since these residues are conserved in SARS-CoV-2 Mpro, the features observed in SARS-CoV Mpro inhibitors will be located at similar positions in SARS-CoV-2 Mpro and thus have the potential to inhibit SARS-CoV-2 Mpro. Moreover, the three-dimensional structures of SARS-CoV Mpro and SARS-CoV-2 Mpro are almost identical (Fig. 1B), and the amino acid sequence identity is 96%. The pharmacophore features do not contact the residues that differ between SARS-CoV Mpro and SARS-CoV-2 Mpro. Thus, inhibitors that match this pharmacophore may have the potential to inhibit both Mpro enzymes.

To investigate the potential of these compounds to bind SARS-CoV-2 Mpro, we performed MD simulations for the SARS-CoV-2 Mpro-inhibitor complex models. We observed strong hydrogen bonding with the Glu166 main chain. In addition, although the thiol group of Cys145 interacts with the 2OP9 ligand, the main chains of Gly143, Ser144, and Cys145 were also confirmed to interact with each inhibitor. This suggests that the interactions with these amino acid residues may not be affected by side chain mutations unless the shape of the binding site or the dynamics of each chain change. Interactions with His41 were confirmed as hydrogen bonding and Pi-stacking. In the hydrogen bond, the NH in the imidazole ring of His41 acts as an HBD. The imidazole ring of His41 also forms Pi-stacking with each inhibitor. According to the pharmacophore model, an HBD pharmacophore sphere is located near His41. In contrast, the MD simulations suggested that His41 itself acts as an HBD. Therefore, an HBA functional group also has the potential to contact His41. The MD simulations also suggested that aromatic functional groups have a high affinity for His41. In each MD simulation, Gly143, Ser144, Cys145, Glu166, and Gln189 interacted with the functional groups defined as the pharmacophore of the peptide-like inhibitors. Therefore, interactions with these amino acid residues are important for binding to SARS-CoV-2 Mpro. In these MD simulation results, all ligands have one or two water bridges. This suggests that water bridges help stabilize the Mpro-inhibitor complex structure and that functional groups of ligands could be extended into the space occupied by these waters. Figure 4 shows SARS-CoV-2 Mpro with α-ketoamide inhibitors (PDB ID: 6Y2G)44 aligned to 6LU7. One hydroxyl group and two carbonyl groups of the α-ketoamide match the pharmacophore model. However, one donor sphere is located at the nitrogen atom of the pyrimidine ring. Since this nitrogen atom has no hydrogen atom, it cannot function as a hydrogen bond donor. Comparing the structures of Gln189 in Figs. 2C and 4, the conformations of the side chains are different. Although the MD simulations suggested that the 2A5I ligand and the 2OP9 ligand interact with Gln189, this structure suggests that the side chain conformation of Gln189 changes flexibly depending on the bound inhibitor. Irreversible inhibitors that form covalent bonds with the Cys residue of SARS-CoV-2 Mpro have already been reported44. Irreversible inhibitors that selectively inhibit Mpro may have a higher binding affinity than competitive inhibitors, and the inhibitors analyzed in this study are competitive inhibitors. However, drug repositioning is effective for highly urgent diseases such as COVID-19, and the pharmacophore proposed in this study can evaluate compounds that do not include a functional group capable of forming a covalent bond with Cys. Therefore, the pharmacophore can be applied to drug repositioning strategies.

Figure 4

Alignment of α-ketoamide inhibitors and the pharmacophore model. SARS-CoV-2 Mpro with α-ketoamide inhibitors (PDB ID: 6Y2G) was aligned to 6LU7 and the pharmacophore model using the “protein structure alignment” tool in Maestro.

In summary, this study suggests that compounds matching the pharmacophore model have potential as coronavirus inhibitors. Although these results were obtained from peptide-like inhibitors, the interactions identified here can also guide the design of, and search for, non-peptide-like compounds. The pharmacophore features that are important for binding to SARS-CoV-2 Mpro might help in developing new, effective anti-coronavirus drugs.