Targeting the coronavirus SARS-CoV-2: computational insights into the mechanism of action of the protease inhibitors lopinavir, ritonavir and nelfinavir

Coronavirus SARS-CoV-2 is a recently discovered single-stranded RNA betacoronavirus, responsible for a severe respiratory disease known as coronavirus disease 2019, which is rapidly spreading. Chinese health authorities, as a response to the lack of an effective therapeutic strategy, started to investigate the use of lopinavir and ritonavir, previously optimized for the treatment and prevention of HIV/AIDS viral infection. Despite the clinical use of these two drugs, no information regarding their possible mechanism of action at the molecular level is still known for SARS-CoV-2. Very recently, the crystallographic structure of the SARS-CoV-2 main protease (Mpro), also known as C30 Endopeptidase, was published. Starting from this essential structural information, in the present work we have exploited supervised molecular dynamics, an emerging computational technique that allows investigating at an atomic level the recognition process of a ligand from its unbound to the final bound state. In this research, we provided molecular insight on the whole recognition pathway of Lopinavir, Ritonavir, and Nelfinavir, three potential C30 Endopeptidase inhibitors, with the last one taken into consideration due to the promising in-vitro activity shown against the structurally related SARS-CoV protease.


Scientific Reports
| (2020) 10:20927 | https://doi.org/10.1038/s41598-020-77700-z www.nature.com/scientificreports/ A possible target is for example represented by structural viral proteins, therefore interfering with the assembly and the internalization of the pathogen into the host, which was shown to occur also in this case through the Angiotensin-converting enzyme II (ACE2) receptor. From this perspective, the development of a vaccine is desirable, and it is foreseen that the first candidates will be advanced to clinical phase I around mid-2020 [7][8][9] .
In the meantime, however, a great effort involves the targeting of non-structural viral proteins which are instead essential for the viral replication and the maturation processes, thus representing a crucial and specific target for anti-COVID drug development 3,10 . In this regard, the crystallographic structure of the SARS-CoV-2 main protease (M pro ), also known as C30 Endopeptidase, was elucidated and made available to the scientific community with impressive timing, just a few weeks after the first COVID-19 outbreak (PDB ID: 6LU7). The structural characterization of the protease, which shares 96.1% of its sequence with those of SARS-CoV, has revealed a highly conserved architecture of the catalytic binding site.
As a result, structure-based drug discovery techniques (SBDD) can now be applied to efficiently speed up the rational identification of putative M pro inhibitors or to drive the repurposing process of known therapy. This latter route is particularly attractive, as it allows to significantly shrink the time required to access the first phases of clinical trials, without compromising patient safety. A multitude of research groups has begun to apply computational approaches, such as molecular docking based virtual screening (VS), to evaluate already approved FDA approved drugs against the aforementioned viral protease [11][12][13][14] . Many of these studies have found convergence in suggesting compounds inhibitors of the human immunodeficiency viruses (HIV) as possible anti-COVID candidates; this is surprising considering the important structural differences exiting among these two homologous enzymes. The repositioning of HIV antiviral drugs for the treatment of coronavirus infections found, however, a foundation in the scientific literature of the past 20 years. Some of these compounds have therefore been experimentally investigated, showing promising activity, both in the case of SARS-CoV and MERS-CoV outbreak 15,16 .
Moreover, at least three randomized clinical trials are currently been held in China in order to evaluate the therapeutic efficacy of Lopinavir and Ritonavir, a combination of HIV protease inhibitors, in COVID-19 treatment 7 . In this perspective and preliminary computational research, we took advantage of the recently solved crystallographic structure of SARS-CoV-2 M pro to perform a cutting edge in-silico investigation.
Supervised molecular dynamics (SuMD), an emerging technique allowing to investigate at an atomic level of detail the molecular recognition process, was exploited to characterize the putative binding mechanism of three HIV protease inhibitors [17][18][19] . In detail, along with the aforementioned combination of Lopinavir and Ritonavir, also Nelfinavir was taken into consideration, due to the promising in-vitro activity shown by this compound against the structurally related SARS-CoV protease 20 . SuMD protocol implements a tabu-like algorithm that controls the sampling of short unbiased MD trajectories, each of which hundreds of picoseconds (ps) long. In detail, simulation steps are accepted only when describing a ligand approaching a known binding site, otherwise, the simulation is discharged and restarted from the previous coordinate set. The combination of all productive SuMD simulation steps represents, therefore, a putative molecular recognition trajectory collected, differently from brute force MD, in a very competitive computational time not exceeding the nanoseconds (ns) timescale (Fig. 1). Contrary to molecular docking, SuMD simulations fully consider both the flexibility characterizing the protein target during the binding event and the contribution played by water molecules during the recognition. Moreover, the study is not limited to a possible final state but allows peeking dynamically at the whole process of recognition, also identifying putative metastable binding sites.

Results
The combination of the structurally related antiviral protease inhibitor Lopinavir and Ritonavir, commercially known with the name Kaletra, represent an effective therapeutic weapon ensuring an adequate and durable suppression of viral load in HIV positive patients. The synergistic coadministration of these two compounds exploits low-dosage concentration of Ritonavir which, inhibiting the metabolic inactivation of Lopinavir, acts as a pharmacokinetic enhancer 21 . Following a preliminary favorable clinical response in SARS-CoV related diseases, the combination of the drug is currently under investigation also against SARS-CoV-2, with at least three randomized clinical trials undergoing with Chinese infected patients 15 . In our computational study, we considered is known to be crucial in many SARS-CoV complexes and moreover, was also found to stabilize the covalent peptidomimetic compound crystallized in the recently published SARS-CoV-2 M pro structure. In addition, the cyclic urea moiety of Lopinavir mediates a hydrogen bond interaction with the side chain of Gln189, another residue whose importance has been elucidated by means of several SARS-CoV threedimensional complexes.
Despite the modest pharmacodynamic contribution made by Ritonavir in the combined formulation under investigation by the Chinese scientific community, in which the drugs act as a pharmacokinetic enhancer rather than a protease inhibitor, we still tried to elucidate its putative molecular recognition pathway. Also, in this case, 20 ns of SuMD simulation time were sufficient to sample a binding trajectory ( Fig. 3-Panel B). Although some key interactions-i.e. hydrogen bond network with residue Glu166 and Gln189-are appreciable also in this final state ( Fig. 3-Panel A, D and Video 2), a comparative analysis of the Interaction Energy Landscape graphs (Panel C of Figs. 2 and 3) suggests lower energy stability of the SuMD predicted binding mode, when compared with that characterizing Lopinavir. A reason could be seeking on the non-optimal accommodation of Ritonavir urea moiety, which floats outside the binding site exposed to the bulk solvent during all the simulation (Video 2).
In light of the promising experimental results shown by Nelfinavir, which milded the cytopathic effect induced by SARS-CoV infection strongly inhibiting the virus replication, we decided to computationally evaluate its possible molecular recognition mechanism also against SARS-CoV-2 protease. As reported in Fig. 4 (Panel B), a slightly longer SuMD simulation was necessary to fully describe a putative Nelfinavir binding trajectory. Once it has approached the vestibular region of the protease catalytic site, the ligand spends the first 20 ns negotiating the accommodation with a series of polar residues with which it mediates intermittent interactions, as  Figure S3-Panel A and B), from which it is possible to notice a highly populated region presenting ligand-protein interaction energy comparable to the final states previously described for the other two inhibitors. The last 10 ns of the simulation were characterized by a series of conformational rearrangements, which resulted in an optimal Nelfinavir accommodation within the protease binding cleft stabilized through a dense hydrogen bond network, tightly anchoring the inhibitor to the protease. As shown in Fig. 4 (Panel A), SuMD predicted binding mode of Nelfinavir is characterized by great analogies with that of the originally crystallized covalent peptidomimetic compound. Residues His164, Glu166, Gln189, Thr190, and Gln196 mediate a series of directed or water-bridged hydrogen bonds interactions. Moreover, as highlighted in Fig. 4

Discussion
In the last two decades, three major outbreaks of coronavirus-related diseasesSARS-CoV, MERS-CoV and ultimatelySARS-CoV-2 have been responsible for significant public health issues, along with dramatic socialeconomic consequences. The process of drug discovery often undergoes timelines which are difficult to reconcile with the urgency and the need to provide an effective therapeutic response to such an emergency health situation. Drug repurposing could represent a viable possibility, and this is the case for some anti-HIV compounds targeting SARS-CoV-2 C30 Endopeptidase. The molecular basis underneath their therapeutic action remains however often obscure. In this preliminary computational investigation, we have taken advantage of the recently published crystallographic structure of SARS-CoV-2 M pro to investigate the putative binding mechanism of three antiviral compounds, previously designed as selective HIV protease inhibitors and now under investigation as anti-COVID-19 emergency treatments. SuMD protocol was in detail exploited to collect, for each of the three inhibitors, MD simulation describing the possible mechanism of molecular recognition, thus providing an www.nature.com/scientificreports/ atomistic insight to interpret their data of therapeutic efficacy. An interesting aspect is represented by the speed of this approach: a few days of calculation in a modest GPU cluster allowed to collect a multitude of simulations, from which it was possible to hypothesize the recognition mechanism of Lopinavir, Ritonavir, and Nelfinavir. An approach of this type, therefore, becomes crucial in all emergencies, making it possible to overcome the lack of structural data to guide and understand the possible repositioning of already approved drugs. In this particular case study, the SuMD protocol not only allowed to hypothesize a possible recognition method for each antiviral but also to advance some preliminary comparative considerations. Nelfinavir, in particular, showed the best fitting for the catalytic site of SARS-CoV-2 M pro , establishing an interactions network similar to those elucidated in the crystallographic complex for the covalent peptidomimetic compound N3. More specifically, the phenyl sulfanyl moiety of the protease inhibitor at the end of the simulation was completely buried within the hydrophobic subpocket S2, which is delimited by residues His41, Cys44, Met49 and Met165. The stabilizing vdW contribution mediated by these residues has been dynamically mapped during the entire simulation and it is appreciable in Figure S3. Encouragingly, a recent fragment crystallographic screening has highlighted how this site, precisely renamed "aromatic wheel", consistently accommodates aromatic fragments mediating hydrophobic interactions with the surrounding residues 24 . Furthermore, Nelfinavir hydroxyl group engages a hydrogen bond interaction with the carbonyl backbone of Glu166, a key residue found to stabilize most of the aforementioned non-covalent fragments as well as many covalent peptidomimetic inhibitors. The optimal interactive network differentiating Nelfinavir from the other two protease inhibitors is probably responsible for its total interaction energy which, as reported in Fig. 4 (Panel C), is quantitatively greater than that computed for Lopinavir and Ritonavir (Figs. 2, 3-Panel C). Intriguingly, this in-silico hypothesis has recently found two independent experimental validations, which have highlighted a mild inhibitory activity of Nelfinavir against the SARS-CoV-2 M pro (estimated between 250 and 600 μM) 25,26 .  www.nature.com/scientificreports/ lar dynamics (MD) simulations were performed with an ACEMD3 engine on an Nvidia GPU cluster composed of 20 NVIDIA drivers, whose models go from GTX 1080 to Titan V 28 . For all the simulations, the ff14SB force field was adopted to describe C30 Endopeptidase protein while general Amber force field (GAFF) was adopted to parameterize small organic molecules 29-31 . Structures preparation. The three-dimensional coordinates of C30 Endopeptidase protein in complex with a covalent peptidomimetic inhibitor (N3) were retrieved from the RCSB PDB database and prepared for SuMD simulations as herein described 32 . Considering the perfect symmetry that characterizes this homodimeric protein, and therefore its two catalytic binding sites, only one of the two monomers was used in this computational investigation. Once the covalent ligand was removed, residue Cys145 was restored to its reduced form. Protein was then processed by means of MOE protein structure preparation tool: residues missing atoms were built according to AMBER14 force field topology. Missing hydrogen atoms were added to X-ray derived complexes and appropriate ionization states were assigned by the Protonate-3D tool 33 . The coordinates of three antiviral compounds were prepared through MOE builder tool and subsequently moved at least 30 Å away from the catalytic protease binding cleft, a distance bigger than the electrostatic cut-off term used in the simulation (9 Å with Amber force field) to avoid premature interaction at the initial phases of the SuMD simulations.

Methods
Solvated system setup and equilibration. Each system investigated by means of SuMD contains a C30 Endopeptidase target macromolecule and respectively one of the three HIV antiviral compounds taken into consideration in this study, moved far away from the protein binding site as previously described. The systems were explicitly solvated by a cubic water box with cell borders placed at least 15 Å away from any protein/ ligand atom, using TIP3P as a water model. To neutralize the total charge of each system, Na + /Cl − counterions were added to a final salt concentration of 0.154 M. The systems were energy minimized by 500 steps with the conjugate-gradient method, then 500,000 steps (1 ns) of NVE followed by 500,000 steps (1 ns) of NPT simulations were carried out, both using 2 fs as time step and applying harmonic positional constraints on protease and ligands heavy atoms by a force constant of 1 kcal mol −1 Å −2 , gradually reduced with a scaling factor of 0.1. During this step, the temperature was maintained at 310 K by a Langevin thermostat with low dumping of 1 ps −1 and the pressure at 1 atm by a Monte Carlo barostat 34 . The M-SHAKE algorithm was applied to constrain the bond lengths involving hydrogen atoms. The particle-mesh Ewald (PME) method was exploited to calculate electrostatic interactions with a cubic spline interpolation and 1 Å grid spacing, and a 9.0 Å cutoff was applied for Lennard-Jones interactions 35 .
Supervised molecular dynamics (SuMD) simulations. SuMD code, in this implementation, is written in Python and exploits the ProDy python package to perform the geometrical ligand-target supervision process 36 . SuMD protocol reduces the timescale, and consequently the computational effort, required to sample a binding event in the range of nanoseconds, instead of hundreds of nanoseconds or microseconds usually necessary with unbiased MD. Sampling is improved by applying a tabu-like algorithm that monitors the distance between the ligand center of mass with respect to the protein binding site, during short unbiased MD simulations of 600 ps. Once a SuMD step has been collected, the distance points calculated at regular time intervals are fitted into a linear function. Only productive MD steps are maintained, those in which the computed slope is negative, indicating a ligand approach toward the protease catalytic binding site. Otherwise, the simulation is restarted by randomly assigning the atomic velocities. Supervision algorithm controlled the sampling of short simulations until the distance between the ligand and the protein binding site dropped below 5 Å, then was disabled, and a classical MD simulation was performed. For each case study up to a maximum of ten SuMD binding simulations were collected, of which only the best was thoroughly analyzed and discussed in the manuscript.

SuMD trajectories analysis.
All the SuMD trajectories collected were analyzed by an in-house tool written in tcl and python languages, as described in the original publication 19 . Briefly, the dimension of each trajectory was reduced saving MD frames at a 20 ps interval, each trajectory was then superposed and aligned on the protease C α atoms of the first frames and wrapped into an image of the system simulated under periodic boundary condition. The molecular recognition was monitored by calculating for each simulation step the distance between the catalytic binding site and the center of mass of the ligand taken into consideration (Figs. 2, 3, 4-Panel B). A ligand-protein interaction energy estimation during the recognition process was calculated using an NAMD engine, plotting the ligand-receptor interaction energy values over time 37 . These values were also arranged according to the distances between ligand and protease binding site mass centers in the Interaction Energy Landscape plots (Figs. 2, 3, 4-Panel C). Here, the distances between mass centers are reported on the x-axis, while the ligand-receptor interaction energy values on the y-axis, and are rendered by a colorimetric scale going from blue to red for negative to positive energetic values. These graphs allow evaluating the variation of the interaction energy profile at different ligand-protein distances, helping to individuate meta-stable binding states during the binding process. Furthermore, for each target investigated in this work, the residues within a distance of 4 Å from the respective ligand atoms were dynamically selected, to qualitatively and quantitatively evaluated the number of contacts during the entire binding process. The most contacted residues were thus selected, to compute a per-residues electrostatic and vdW interaction energy contribution with the protease target. NAMD was used for post-processing computation of electrostatic interactions, using AMBER ff14SB force field. The total per-residue interaction energy, defined as the sum of vdW and electrostatic per-residue contribution to binding were reported on the Dynamic Total Interaction Energy plots (Figs. 2, 3, 4-Panel D). The cumulative electrostatic interactions were computed for the same target residues by summing the energy values frame by frame along the trajectory, and the resulting graphs were reported at the lower-right of movies SuMD videos. Each video, as accurately described on the work of Salmaso et al., is composed of four synchronized and animated panels that depict the molecular trajectory obtained by the SuMD simulation considering different aspects of the simulation 19 . The time evolution is reported on an ns scale. In the first panel (upper-left), the molecular representation of the SARS-CoV-2 main protease is shown. The protein backbone is represented by the ribbon style (pink color) and the residues within 4 Å of each ligand investigated are shown in green, orange and cyan colors respectively for Lopinavir, Ritonavir, and Nelfinavir. In the second panel (upperright), the dynamic distance of each ligand center of mass (CM) from the respective protein catalytic binding site during the trajectory is reported. In the third panel (lower-left), the ligand-protein interaction energy profile is reported. The animated red circle highlights the value of the corresponding frame. The trend is depicted by a continuous black line obtained by smoothing the raw data (grey circles) using a Bezier curve procedure. In the fourth panel (lower-right) cumulative electrostatic interactions are reported for the 15 protein residues most contacted by each ligand during the whole simulation.