ABBV-744 as a potential inhibitor of SARS-CoV-2 main protease enzyme against COVID-19

A new pathogen severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has spread worldwide and become pandemic with thousands new deaths and infected cases globally. To address coronavirus disease (COVID-19), currently no effective drug or vaccine is available. This necessity motivated us to explore potential lead compounds by considering drug repurposing approach targeting main protease (Mpro) enzyme of SARS-CoV-2. This enzyme considered to be an attractive drug target as it contributes significantly in mediating viral replication and transcription. Herein, comprehensive computational investigations were performed to identify potential inhibitors of SARS-CoV-2 Mpro enzyme. The structure-based pharmacophore modeling was developed based on the co-crystallized structure of the enzyme with its biological active inhibitor. The generated hypotheses were applied for virtual screening based PhaseScore. Docking based virtual screening workflow was used to generate hit compounds using HTVS, SP and XP based Glide GScore. The pharmacological and physicochemical properties of the selected lead compounds were characterized using ADMET. Molecular dynamics simulations were performed to explore the binding affinities of the considered lead compounds. Binding energies revealed that compound ABBV-744 binds to the Mpro with strong affinity (ΔGbind −45.43 kcal/mol), and the complex is more stable in comparison with other protein–ligand complexes. Our study classified three best compounds which could be considered as promising inhibitors against main protease SARS-CoV-2 virus.

www.nature.com/scientificreports/ According to the aforementioned experimental information, we have chosen SARS-CoV-2 M pro as a target enzyme to accelerate the prompt hunt of antiviral drug repurposing with the potential of gaining an effective short-term solution to treat COVID-19 patients.
To address this challenge, an specific library of anti and pro-viral agents including FDA approved drugs, compounds in clinical trials and preclinical compounds having inhibitory activity between 10 and 100 nM range against SARS-CoV-2 was considered for drug repurposing to attain immediate and precise results 13 . In this study, we developed an integrated approach of drug discovery integrating 3D structure-based pharmacophore modeling, virtual screening of 75 compounds library, molecular docking workflow, ADMET pharmacological analysis and molecular dynamics (MD) simulations. This scheme will provide an informative insight into the exploration of potent antiviral drugs, which could help in progressive attempts in the therapeutics of COVID-19.

Methodology
System preparation. The 1.95 Å crystal structure of SARS-CoV-2 main protease (M pro ) in complex with α-ketoamide (αk-13b) inhibitor was extracted from the Protein Data Bank (PDB ID: 6Y2F) 10 . The structure of the enzyme was pre-processed, minimized and refined using the Protein Preparation Wizard 14 implemented in Schrödinger suite. This involved eliminating of crystallographic waters, adding missing hydrogens/side chain atoms, and assigning the appropriate charge and protonation state for the acidic as well as basic amino acid residues at pH 7.0. The enzyme structure was subjected to an energy minimization step using the OPLS-2005 force-field 15,16 with a root mean square deviation (RMSD) cut-off value of 0.30 Å to relieve the steric clashes among the residues due to the addition of hydrogen atoms 17 .
The preparation of the crystalized inhibitor, αk-13b, and the 75 candidate compounds were performed using LigPrep module of Schrodinger Suite which undertakes hydrogens atom addition, amending realistic bond lengths and angles, accurate chiralities, ionization states, tautomers, stereo chemistries, and ring conformations. Partial charges were assigned to the structures using the OPLS-2005 force-field 15 and the subsequent structures were imperiled to energy minimization until their average RMSD reached to 0.001 Å. The ionization state was set at the neutral pH = 7 using Epik ionization tool 18 . Preparation of inhibitor-like ligand library. The 75 candidate compounds were retrieved from the experimental work proposed by Gordon et al. 13 based on their experimental anti-viral activities against SARS-CoV-2. All candidate inhibitors were considered for further virtual screening-based 3D-pharmacophore features analysis. The library of the 75 compounds are presented in Table S1.
Identification of 3D-pharmacophore hypotheses. For the structure-based pharmacophore modeling Schrodinger PHASE module 19 was used with the default set of seven chemical features-hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic contacts (H), negative ionizable (N), positive ionizable (P), and aromatic ring (R) to create the utmost illustrative features based on the crucial interactions with the key residues of the enzyme accommodated the inhibitor. The seven 3D-features were generated using Hypothesis Generation for Energy-Optimized Structure Based Pharmacophores considering the omitted volumes within 5 Å of the refined ligand for the enzyme 20 . The extracted pharmacophore hypothesis comprise the functional groups included in their bioactivity of targeted enzyme.
Screening of M pro inhibitors. All acquired seven 3D-pharmacophore features were exported and used for PHASE-based virtual screening of the 75 compounds library retrieved from recent experimental work by Gordon et al. 13 . Out of 75 candidates, 43 hit compounds were generated based on the highest PHASE screen score and matched ligand sites (Table S2). Both the quantity and quality of feature matching is taken into account in the Phase-Screen-Score.
Docking-based virtual screening. Molecular-docking-based virtual screening was performed using Glide workflow of Maestro 11.6 to prioritize the lead compounds that strongly bind to M pro enzyme 21 . Receptor grid was created as center coordinates (X = 9.81 Y = − 1.47 Z = 20.51) using two cubic boxes having a mutual centroid to systematize the calculations: a larger enclosing and a smaller binding box with dimensions of 24 × 24 × 24 Å and 18 × 18 × 18 Å, correspondingly. The grid box was centered on the centroid of the ligands in the complex, which was adequately large enough to search a superior region of the enzyme structure. All the chosen ligands were docked by using a three docking protocols of Glide 21 which starts with "High throughput Virtual Screening" (HTVS) followed by "Standard Precision" (SP) and then by "Extra-Precision" mode (XP). Finally, the 43 input compounds were assessed using Docking-Based Virtual Screening and filtered to final three optimized lead compounds based on XP-GScores.
ADMET properties assessment. Schrodinger QikProp 5.6 module was used to calculate absorption, distribution, metabolism, excretion and toxicity (ADMET) properties of the considered compounds to produce the ADMET associated descriptors. This protocol predicts noteworthy physicochemical and pharmacokinetic-based descriptors based on Lipinski's rule of five 22,23 . ADMET properties of the top three compounds and crystalized control inhibitor were analyzed using QikProp 5.6 module and the best three compounds were considered for final analysis step through molecular dynamics (MD) simulations.

MD simulations.
MD simulation considered to be the most essential approach in understanding the fundamental structure and function of biological macromolecules. This method helps in finding the underlying www.nature.com/scientificreports/ dynamics and how it is connected to enzyme's biomolecular function 24,25,26 . AMBER 18 27 simulation package was used to execute 200 ns MD simulations on all the prepared complexes using (Graphics Processing Unit) GPU accelerated version of Partial Mesh Ewald Molecular Dynamics (PMEMD) simulations 28 . The ff99SB 29 and the general AMBER force fields (GAFF) 30,31 were employed to parametrize the enzyme and the considered ligands using LEaP implemented in Amber 18. The ANTECHAMBER module was used to assign atomic partial charges for the ligands employed in General Amber Force-Field (GAFF). The system was solvated using the TIP3P 32 explicit water in a cubic box with 8 Å box edge. The Na + counter ions were added to randomly to neutralize the complex. The partial Mesh Ewald (PME) 33 method was used to account the long-range electrostatic forces using cutoff of 12 Å, and the SHAKE algorithm 34 was used to constrain all the hydrogen atoms bonds.
Energy minimizations were performed in two stages with 2500 steps of steepest decent minimization followed by 2500 of conjugated gradient to remove the bad contacts. The first stage was followed with a harmonic restraint of 500 kcalmol −1 A −2 on the solute molecule whereas, ions and water molecules were relaxed. On the second stage of minimization the restraints were removed and the whole system was relaxed. Each minimized complex was then gradually heated up from 0 to 300 K for 200 ps to keep the solute using a weak harmonic restraint of 10 kcalmol −1 A −2 . The 50 ps density equilibration with weak restraints followed by the 500 ps constant pressure equilibration at 300 K were performed at constant pressure using Berendsen barostat 35 . Ultimately, the production phase of 200 ns MD simulation was performed on all the complexes at a constant temperature of 300 K and constant pressure of 1 atm 36 .
Post-dynamic trajectories analyses. The 200 ns MD trajectories were analyzed to calculate the RMSD of C α atoms, root mean square fluctuation (RMSF) of each residue in the complex, radius of gyration (R g ), solvent accessible surface area (SASA), and intramolecular/intermolecular hydrogen bond interactions using CPPTRAJ module 37 implemented in AMBER 18. Molecular visualizations and plotting were conducted using Maestro 11.6 and OriginPro 2018 software 38 .
Principal component analysis (PCA). PCA as an important tool for identifying the conformational changes of proteins was carried to describe the residual motions upon inhibitor binding of biomolecular complex 39 . PCA generates highly correlated and anti-correlated fluctuations derived from MD trajectories by applying dimensional reduction 40,41 . The collective motions were studied using the positional covariance matrix C constructed based on the atomic coordinates and their corresponding eigenvectors. The eigenvalues and eigenvectors are defined as the extent and the direction of motions, respectively 42,43 . By the following equation, the matric elements of the positional covariance matrix C were determined: where q i and q j are the cartesian coordinates for the i, jth of Cα atom, and N is the number of Cα atoms. To remove all translational and rotational movements, the average is calculated after superimposition with a reference structure using a least-square fit procedure to excerpt the important motion from MD trajectories 44,45,46 . To derive the eigenvalues and eigenvectors, the symmetric matrix C is transformed into a diagonal matrix Λ of eigenvalues by an orthogonal coordinate transformation matrix T: in which the eigenvectors correspond to the direction of motions relative to q i and each eigenvector associate with an eigenvalue that represents the total mean-square fluctuation of the system along the corresponding eigenvector. CPPTRAJ module from the Amber 18 suite was used to perform the PC analysis and the porcupine plot of protein collective motions was created by NMWiz implemented in VMD 47 .
Binding free energy calculations. The relative binding free energies were computed using Molecular Mechanics/ Generalized Born Surface Area (MM/GBSA) binding free energy method 48 . Water molecules and counter ions were stripped using the CPPTRAJ module. The binding free energies (ΔG bind ) were calculated with the MM-GBSA method for each complex as below: (1) www.nature.com/scientificreports/ The gas phase energy (ΔE gas ) is the sum of the internal (E int ), van der Waals (E vdW ) and Coulombic (E elec ) energies, (Eq. 6). The solvation free energy is the pattern of polar (G GB ) and nonpolar (Gnonpolar, solvation) contributions (Eq. 7). The polar solvation G GB contribution is estimated using the Generalized Born (GB) solvation model with the dielectric constant 1 for solute and 80.0 for the solvent. Conversely, the nonpolar free energy contribution was assessed using Eq. 8, where the surface tension proportionality constant, γ, and the free energy of nonpolar solvation of a point solute, β, were set to 0.00542 kcal mol −1 Å −2 and 0 kcal mol −1 , respectively 49 . The SASA is calculated by the linear combination of pairwise overlap (LCPO) model 50 .

Result and discussion
Selection of compounds. An specific drug repurposing library of 75 anti and pro-viral agents including FDA approved drugs, clinical trials compounds and preclinical compounds with enzyme inhibitory activity between 10 and 100 nM range 13 against SARS-CoV-2 was considered as the input library for this in silico study. The library including the compounds name as well as their corresponding smile structures are presented in Table S1.

Structure-based pharmacophore modeling. A comprehensive and accurate information of ligand
interacting features can be obtained from structure-based pharmacophores based on three-dimensional structure of a target protein 51 . The most common descriptors in pharmacophore modeling are H-bond donors, H-bond acceptors, positive and negative ionizable groups, lipophilic regions and aromatic rings. The most effective 3D structure-based e-pharmacophores were produced using the receptor-ligand pharmacophore generation protocol implemented in PHASE, which was executed for a co-crystal αk-13b inhibitor inside the active pocket in order to determine possibly critical amino acids that are involved in ligand binding ( Fig. 2A). The generated e-pharmacophore for the considered enzyme showed seven main 3D-features including, H-bond acceptor, H-bond donor and π-π stacking of aromatic ring. In each pharmacophore feature, the red arrows represent www.nature.com/scientificreports/ hydrogen bond acceptor, blue arrow represents hydrogen bond donor and orange spheres represent π-π stacking of aromatic ring, Fig. 2B. Numerous excluded volumes were also produced in the models to demonstrate the space balancing. The seven 3D pharmacophore features and 2D-chemical structure of αk-13b inhibitor are presented in Fig. 2B,C showing three donor hydrogen bonds, three acceptor hydrogen bonds and one aromatic ring sphere.
Virtual screening of the candidate compounds. The obtained structure-based pharmacophore hypotheses of αk-13b inhibitor in complex with M pro were used to screen the 75 candidate anti-viral compounds retrieved from recent experimental work by Gordon et al. 13 (Table S1). These compounds were screened based on PHASE screen score, matched ligand sites indices. A total of 43 compounds subsequently passed this filter based on the created pharmacophore hypothesis. Molecules which have satisfied all the features of the pharmacophore hypothesis were considered as potential hits. The output of virtual screening analysis of 43 compounds consist of their PHASE screen score and matched ligand sites are presented in Table S2.
Docking-based virtual screening analysis. The 43 screened compounds obtained from virtual screening were considered for docking analysis using Glide workflow 52 of Schrödinger package. Three step wise filtering protocol were used for docking using HTVS where a total of 23 compounds (Table S3) were obtained followed by Glide SP where a total of 12 hits were generated (Table S4). Finally, the best lead compounds were obtained using Glide XP lead optimization protocol, while among 10 generated docking poses per ligand, only one pose was retained (Table S5). The Glide GScore and the interacting binding residues of the five lead compounds presented in Table 1. The αk-13b inhibitor as well as the best optimized lead generated from XP docking were selected to map their potential interactions within the active pocket of SARS-CoV-2 M pro enzyme using molecular docking approach. This approach aids in understanding the optimized orientation of a ligand and its target protein by minimizing inclusive energies of the corresponding complexes. The estimated docking binding energy values of all three compounds Daunorubicin, Onalespib and ABBV-744 with their experimentally viral inhibition activity (pIC50) 6.67, 6.81, 2.46 against SARS-CoV-2 13 as well as αk-13b inhibitor are shown in Figs. 3 and 4 and Table 1.
The Daunorubicin-M pro , Onalespib-M pro and ABBV-744-M pro docked complexes presented considerable binding affinities with the energy values of − 9.33, − 8.21 and − 7.79 kcal mol −1 , respectively (Table 1). These three lead compounds contributed into the binding site of M pro enzyme through the hydrogen bonding, π-π stacking and π-Sulphur interactions (Fig. 4).
Thus, it could be contemplated that these three compounds bound favorably to the binding site of M pro through hydrogen bond, π-π stacking and π-alkyl interactions mainly generated by CYS145, HIS41, MET165, HIS163, GLU166, GLN 189, Arg188, Thr190 and Gln192 as key contributing active residues into the docked complexes.
ADMET analysis. Pharmacokinetic and toxicity features were predicted using QikProp module of Schrodinger for Daunorubicin, Onalespib, ABBV-744 and αk-13b inhibitor. Outcomes of pharmacokinetic and toxicity study are illustrated in Table 2. The selected properties of the compounds are representatives of influence metabolism, cell permeation, bioavailability and toxicity.
The predicted central nervous system activity (CNS) of Daunorubicin, ABBV-744 and αk-13b inhibitor depicted as inactive whereas, Onalespib was presented as an active compound. The predicted human binding Table 1. The best three compounds generated using XP docking with their corresponding docking scores and the contributing binding residues are presented. Catalytic dyad residues are shown in bold.

Post-dynamics MD trajectories analysis.
The structural variations within the enzymes structure is correlated with their biological activities. Any alterations or interference on enzymes structural integrity might have a substantial impact on its activity 41 . The binding of inhibitors influence the mode of action of enzymes that are comprised in disease pathways, thus there is a requirement to estimate the structural dynamics and conformational changes associated with the inhibitory activity of these inhibitors 53 .
In this section, 200 ns MD trajectories regarding the four complexes, namely, Daunorubicin-M pro , Onalespib-M pro , ABBV-744-Mpro and αk-13b inhibitor-M pro as control model were analyzed. Different metrics and analysis were applied to investigate the stability and flexibility of the complexes as well as the contribution of the studied compounds upon binding in terms of binding free energies. The 2D chemical structure of all the ligands considered for MD simulations are presented in Fig. 1 and Scheme 1.
The computation of a time variable with reference to an RMSD of C α atoms from generated trajectories was accomplished to investigate the consistency and efficiency of M pro in complex with αk-13b inhibitor and along with the three lead compounds, Fig. 1 and Scheme 1.
The perturbations in the RMSD values as denoted in plot (Fig. 5A) throughout the simulation time disclosed the possible conformational deviances in the enzyme structure upon ligand binding. As Fig. 5A revealed, all the complexes were stabilized and attained convergence after almost 50 ns of simulation run. ABBV-744-M pro   Table 2. In-silico ADMET screening of the selected compounds. a Predicted central nervous system activity from − 2 (inactive) to + 2 (active). b Prediction of binding to human serum albumin (acceptable range: − 1.5-1.5). c Total Solvent Accessible Surface Area: SASA (acceptable range: 300-1000). d Predicted octanol/ water partition coefficient (acceptable range: − 2-6.5). e Predicted aqueous solubility, S in mol/dm −3 (acceptable range: − 6.5-0.5). f Predicted brain/blood partition coefficient (acceptable range: − 3.0-1.2). g Predicted percentage human oral absorption (< 25% is poor and > 80% is high). h Number of violations of Lipinski's rule of five, Compounds that satisfy these rules are considered druglike (maximum 4).  Solvent Access Surface Area (SASA) analysis was performed to define the activity of hydrophobic and hydrophilic amino acid residues and forces exposed to the solvent over 200 ns MD trajectories. The constant and accurate scheming of SASA is highly useful in the energetic evaluation of biological macromolecules 55 . The interfaces among the hydrophobic native contacts inside enzyme structure is a noteworthy intermolecular interaction that effect enzyme inhibition. Hydrophobic interaction constructed between the non-polar residues corroborate the stability of the enzyme structure in solution by sheltering the non-polar residues inside the hydrophobic core distant from an aqueous solution 56 . As shown in Fig. 5D, standard SASA values for all selected compounds have been measured during 200 ns MD trajectories. Average value of SASA for the compound ABBV-744-M pro complex is 14,230 Å 2 which was showed to the solvent system. Overall SASA values of 14,303 Å 2 and 14,426 Å 2 were prominent by Onalespib-M pro and Daunorubicin-M pro complexes, individually. The differences in SASA values for all the complexes during the simulation period corresponds with the folding and unfolding of enzyme structure. The overall SASA values in the control complex was 14,001 Å 2 , slightly less than ABBV-744-M pro complex. The SASA assessment perceived in compound ABBV-744 bound complex additionally validated that ABBV-744 compound has better exposure to solvent and consequently favored the improved inhibitory activity of compound ABBV-744 over other complexes.
Hydrogen bond analysis. For overall conformation and stability of enzyme structure, we have measured the intramolecular and intermolecular hydrogen bond analysis (Fig. 6). This analysis gives extreme understanding into binding mechanism of enzyme-ligand with detailed consideration 57 . An average number of intramolecular hydrogen bonds in ABBV-744-M pro complex was noted to be 136 as displayed in Fig. 6A. In compound Onalespib and Daunorubicin, the intramolecular hydrogen bonds were observed to be 139 and 140, respectively.
The number of intermolecular hydrogen bonds produced in the catalytic site of M pro enzyme notable to be 9-10 in ABBV-744 bound M pro complex. However, number of these bonds are more in Daunorubicin-M pro complex with 11-12 hydrogen bonds and less in Onalespib and αk-13b-inhibitor bound M pro with 7-8 hydrogen bonds as presented in Fig. 6B.  It is evident in Fig. 7A, the Onalespib, Daunorubicin and αk-13b-inhibitor with the trace covariance matric of 37.45 Å 2 , 37.67 Å 2 and 37.69 Å 2 , imposed highly fluctuated anti-correlated effects as the negative values of 2D-scatter points into the protein, Fig. 7B. Interestingly, in the case of ABBV-744 with the trace covariance matric of 37.64 Å 2 , the prominent correlated motions were observed with the least fluctuations of the system upon ligand binding, Fig. 7B. Thus, from the above observations Fig. 7B, it was concluded that ABBV-744 induced least fluctuations into the binding site upon binding than the variants complexes.

Mechanistic insights into binding affinity.
To understand the impact of inhibitors upon complexation in terms of their binding affinities, MM-GBSA binding free energy method were utilized to calculate the binding free energies and their energy components of the complexes, Table 3.
As it is evident in Table 3   Per-residue decomposition energy analysis. The binding-free energy decomposition offers an immense understanding in account of enzyme-ligand complexes produced from the trajectories by MD simulations. To achieve this, we fragmented the total binding energies of complexes into each-residual involvement by per amino acid residue existing in the catalytic site of M pro enzyme to provide comprehensive identification of key contributing residues upon ligand binding as depicted in Fig. 8. The interactions between catalytic site electro-negative and electro-positive residues develops ligand binding and its stabilization at the target enzyme. This creates an improved intermolecular binding that surges the binding affinity of the ligand in the active site. Active site residue MET165 in ABBV-744 contributed with the lowest ΔG bind with − 4.22 kcal mol −1 however this residue has contributed with notably less ΔG bind of − 2.50 kcal mol −1 and − 2.45 kcal mol −1 in Onalespib and Daunorubicin bound complexes. The ΔG bind of another participating residues GLN189 was also observed to be lowest in ABBV-744 bound M pro complex with − 3.94 kcal mol −1 however, it is slightly less in Onalespib-M pro complex with value of − 3.45 kcal mol −1 and − 2.48 kcal mol −1 in Daunorubicin-M pro complex. Gln192 significantly contributed in the binding of ABBV-744 compound with ΔG bind value of − 3.00 kcal mol −1 and observed a very minor difference with − 2.99 kcal mol −1 ΔG bind in Onalespib bound M pro but showed a lesser ΔG bind of − 1.50 kcal mol −1 in Daunorubicin compound.

Conclusions
The necessity to control alarming COVID-19 pandemic made us to rationalize potential lead compounds that could be considered in clinical trials. Despite major investigations in the design and development of specific drugs or vaccines, not much proven to be effective against COVID-19. This challenge motivated us to explore the drug designing approaches that could serve informative to combat this disease. In this report, we have performed 3D structure-based pharmacophore modeling followed by virtual screening-based 3D-pharmacophore hypotheses of 75 compounds as potential antiviral agents retrieved from PubChem. Molecular docking workflow using HTVS, SP and XP protocols were used to generate the best hits and their corresponding docked poses. The Six best compounds generated based on their lowest docking binding affinities using XP were considered for ADMET prediction-based physicochemical and pharmacokinetic descriptors and MD simulations analysis. MD simulations approach revealed the two highly selective compounds namely, ABBV-744 and Onalespib www.nature.com/scientificreports/ possessed significant binding affinity and presumably inhibition of SARS-CoV-2 M pro enzyme. Based on our overall observations, compounds ABBV-744 and Onalespib could be recommended as potential lead for the therapeutic of COVID-19 patients.

Data availability
The data used/generated to support the findings of this study are available from the corresponding author on reasonable request. www.nature.com/scientificreports/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.