Repurposing antiviral phytochemicals from the leaf extracts of Spondias mombin (Linn) towards the identification of potential SARS-CoV-2 inhibitors

Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the cause of a pneumonia-like disease with a pattern of acute respiratory symptoms, remains a significant public health concern causing tremendous human suffering. Although several approved vaccines exist, vaccine hesitancy, limited vaccine availability, a high rate of viral mutation, and the absence of approved drugs account for the persistence of SARS-CoV-2 infections. The possible repurposing of phytochemical compounds as therapeutic alternatives has gained momentum due to their reported affordability and minimal toxicity. This study investigated anti-viral phytochemical compounds from ethanolic leaf extracts of Spondias mombin L. as potential inhibitor candidates against SARS-CoV-2. We identified Geraniin and 2-O-Caffeoyl-(+)-allohydroxycitric acid as potential SARS-CoV-2 inhibitor candidates targeting the SARS-CoV-2 RNA-dependent polymerase, the receptor-binding domain (RBD) of the SARS-CoV-2 viral S-protein, and the 3C-like main protease (3CLpro). Geraniin exhibited binding free energies (ΔGbind) of −25.87 kcal/mol and −21.74 kcal/mol towards the SARS-CoV-2 RNA-dependent polymerase and the RBD of the viral S-protein, respectively, whereas 2-O-Caffeoyl-(+)-allohydroxycitric acid exhibited a ΔGbind of −32 kcal/mol towards 3CLpro. Molecular dynamics simulations indicated a possible interference with the functioning of the SARS-CoV-2 targets by the two identified inhibitors. However, further in vitro and in vivo evaluation of these potential SARS-CoV-2 therapeutic inhibitor candidates is needed.

This study employs computational techniques to investigate the potential of these anti-viral extracts of S. mombin as inhibitory agents against the SARS-CoV-2 RNA-dependent polymerase [97][98][99][100][101] , the 3C-like main protease (3CL pro ) 52,102-106 , and the receptor-binding domain of the SARS-CoV-2 viral S-protein 55 . We also employed in silico methods to assess the drug-likeness of the compounds and complemented our findings with molecular dynamics (MD) simulations to unravel conformational perturbations associated with the potential inhibitory activity of the identified bioactive compounds. Although in silico approaches such as those employed in this study are not conclusive on their own, they could accelerate the discovery of viable anti-SARS-CoV-2 therapeutics at a low cost.
Retrieval and preparation of investigated phytochemicals. The 2D structures of the anti-viral phytochemicals investigated in this study (Table 1) were drawn in MarvinSketch 112 . Subsequently, energy minimization and optimization of the phytochemical structures were performed in Avogadro 1.2.0 113 using the UFF force field with the steepest descent algorithm 109 . Afterward, the 3D structure of each compound was generated and saved as a mol2 file for further investigations.
Binding pocket identification and molecular docking of modeled structures into SARS-CoV-2 therapeutic targets. The bound co-crystallized inhibitors were used to map out the respective binding pockets of the investigated SARS-CoV-2 therapeutic targets. The binding pockets were mapped using the grid box function in AutoDock Vina 114 , which generated the coordinates that define the binding pocket of each therapeutic target. The grid box coordinates for the inhibitor binding site of SARS-CoV-2 RNA-dependent RNA polymerase were calculated as; X = 124. 119 These structural changes could inform the possible inhibitory mechanism of the identified compounds. The ANTECHAMBER module was used to parameterize the inhibitors, with AM1-BCC atomic partial charges added 123 . The FF14SB AMBER force field was also used to parameterize the retrieved structures 124 . Protonation of histidine residues was then performed using the pdb4amber script at a constant pH (cpH) to ensure compatibility of the prepared SARS-CoV-2 therapeutic target models with the LEAP module. Subsequently, the LEAP module was employed to solvate and neutralize each prepared system. Na+ or Cl− counter ions were used to neutralize the systems, whereas an orthorhombic box of TIP3P water molecules with a size of 12 Å was added to solvate each system 125 . Topology and coordinate files of the bioactive compounds, the SARS-CoV-2 therapeutic targets, and the resultant complexes were then generated and saved. The prepared bound complexes and the unbound therapeutic targets were then subjected to an initial 2000 minimization steps with a restraint potential of 500 kcal/mol; afterward, 1000 steps of steepest descent minimization with no restraint were performed on the entire system. Each system was gradually heated from 0 to 300 K over 50 ps. After heating, a 500 ps equilibration was performed at a constant pressure of 1 bar using the Berendsen barostat 126 .
The SHAKE algorithm was employed to constrain all bonds involving hydrogen atoms, after which a 200 ns MD simulation was performed using a 1 fs time step 127 . Coordinates of the generated MD trajectories were saved at 1 ps intervals. These trajectories were further analysed using the PTRAJ and CPPTRAJ modules of AMBER18 128 . Graphical plots for the analysis of the generated trajectories were created with the Microcal Origin analytical software 129 .
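The size of the production run follows directly from the quoted settings. The short sketch below only does that bookkeeping (integration steps and saved frames) from the 200 ns length, 1 fs time step, and 1 ps save interval given in the text; the function name `production_run_counts` is a hypothetical helper introduced here for illustration and is not part of AMBER.

```python
# Bookkeeping for the production MD run described in the text.
# Inputs are the settings quoted in the Methods; nothing here is simulation output.
# `production_run_counts` is a hypothetical helper, not an AMBER function.

FS_PER_PS = 1_000
PS_PER_NS = 1_000


def production_run_counts(length_ns, timestep_fs, save_interval_ps):
    """Return (integration steps, saved frames) for an MD production run."""
    total_ps = length_ns * PS_PER_NS
    total_fs = total_ps * FS_PER_PS
    steps = total_fs // timestep_fs
    frames = total_ps // save_interval_ps
    return steps, frames


steps, frames = production_run_counts(length_ns=200, timestep_fs=1, save_interval_ps=1)
print(f"integration steps: {steps:,}")
print(f"saved frames:      {frames:,}")
```

With these settings each target yields 200,000,000 integration steps and 200,000 saved frames per system, which is the pool from which later analyses draw.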
Binding free energy calculations. Binding free energies were calculated using the Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) technique implemented in AMBER18 130,131 , a technique used to assess structural stability and to predict binding affinities and hotspots. This technique has been widely applied to protein-ligand interactions. The binding free energy ($\Delta G_{bind}$) was determined by the equations:

$$\Delta G_{bind} = \Delta E_{MM} + \Delta G_{sol} - T\Delta S$$

$$\Delta E_{MM} = \Delta E_{int} + \Delta E_{vdw} + \Delta E_{elec}$$

$$\Delta G_{sol} = \Delta G_{PB} + \Delta G_{SA}$$

$$G_{SA} = \gamma \cdot SASA + \beta$$

where $\Delta E_{MM}$, $\Delta G_{sol}$, and $\Delta S$ are the changes in the gas phase molecular mechanics (MM) energy, solvation free energy, and conformational entropy upon ligand binding. $\Delta E_{int}$ refers to the energies of bond, angle, and torsion, whereas $\Delta E_{vdw}$ denotes van der Waals energies. The non-bonded electrostatic energy components are denoted by $\Delta E_{elec}$. The solvation free energy, $\Delta G_{sol}$, is the sum of the electrostatic solvation energy $\Delta G_{PB}$ (polar contribution) and the nonpolar contribution $\Delta G_{SA}$ between the solute and the continuum solvent. $G_{SA}$ is calculated from the solvent accessible surface area (SASA), obtained by means of a 1.4 Å water probe radius, whereas the polar contribution is calculated using Poisson-Boltzmann (PB). $\gamma$ and $\beta$ are empirical constants of 0.00542 kcal/(mol Å²) and 0.92 kcal/mol, respectively. The binding free energy calculations included only frames generated after the systems had stabilized.
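To make the decomposition concrete, the sketch below assembles a binding free energy from the components named in the text: the gas-phase terms, the polar (PB) and nonpolar (SA) solvation terms, and an optional entropy term. The γ and β constants are those quoted above. The function names and the numeric component values in the usage lines are hypothetical and chosen only for illustration; they are not results from this study, and this is not the AMBER implementation. Treating the entropy term as zero by default is also an assumption of the sketch.

```python
# Minimal MM/PBSA-style bookkeeping under stated assumptions.
# GAMMA and BETA are the empirical constants quoted in the text.
# All component values passed in below are hypothetical illustrations.

GAMMA = 0.00542  # kcal/(mol * Angstrom^2)
BETA = 0.92      # kcal/mol


def nonpolar_solvation(sasa):
    """Nonpolar solvation energy G_SA (kcal/mol) of one species from its SASA (A^2)."""
    return GAMMA * sasa + BETA


def binding_free_energy(de_int, de_vdw, de_elec, dg_pb, dg_sa, t_ds=0.0):
    """Combine the energy changes upon binding into a binding free energy.

    de_int, de_vdw, de_elec: gas-phase MM energy changes
    dg_pb, dg_sa: polar and nonpolar solvation free energy changes
    t_ds: T * (entropy change); left at 0.0 here as a simplifying assumption
    """
    de_mm = de_int + de_vdw + de_elec
    dg_sol = dg_pb + dg_sa
    return de_mm + dg_sol - t_ds


# Hypothetical inputs, for illustration only.
print(f"G_SA for a 1000 A^2 surface: {nonpolar_solvation(1000.0):.2f} kcal/mol")
print(f"dG_bind: {binding_free_energy(0.0, -40.0, -20.0, 35.0, -5.0):.2f} kcal/mol")
```

In this toy case the favourable van der Waals and electrostatic terms (−60 kcal/mol in total) are partly offset by a net solvation penalty of +30 kcal/mol, giving −30 kcal/mol.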

Results and discussion
Molecular docking of the anti-viral phytochemicals from ethanolic leaf extracts of S. mombin with SARS-CoV-2 RNA dependent RNA polymerase, SARS-CoV-2 3CL pro and RBD of viral S-protein. Molecular docking was used to explore the inhibitory potential of the phytochemical compounds from the ethanolic leaf extract of S. mombin against the SARS-CoV-2 therapeutic targets. The docking scores, which give insights into the possible binding affinity of the compounds for the studied targets, are presented in Table 2. Docking scores allow for the determination of the most favourable binding orientation of a compound within a given binding pocket. A favourable binding orientation of a ligand within a given pocket influences the nature of the binding interactions and hence the overall binding affinity 132 . Reports from other authors indicate that the lower (more negative) the docking score, the more favourable the corresponding binding orientation 132 . As shown in Table 2, docking of all the studied compounds at the inhibitor binding sites of the SARS-CoV-2 targets revealed that Geraniin exhibited the most favourable binding orientation at the inhibitor binding sites of both SARS-CoV-2 RdRp and the RBD of the viral S-protein, with the most favourable docking scores of −10.4 kcal/mol and −7.3 kcal/mol, respectively. Likewise, at the inhibitor binding site of 3CL pro , 2-O-Caffeoyl-(+)-allohydroxycitric acid exhibited the most favourable docking score of −5.6 kcal/mol.
Exploring the binding mechanisms of identified hit phytochemicals against SARS-CoV-2 therapeutic targets. The binding mechanisms of inhibitors to biological targets are usually characterized by the interactions between the inhibitor and the amino acids that constitute the binding site of the biological target. These interactions consequently influence the conformational dynamics of the biological target as well as the stability and binding affinity of the inhibitors. Therefore, inhibitor-residue interactions are crucial to the overall therapeutic potential of inhibitors. Using Discovery Studio 133 , we visualized and explored the residue interaction profiles of Geraniin and 2-O-Caffeoyl-(+)-allohydroxycitric acid upon binding to SARS-CoV-2 3CL pro , SARS-CoV-2 RdRp, and the RBD of the viral S-protein. Molecular insights from the binding interactions explored herein could shed more light on the potential binding mechanisms of the investigated phytochemicals.

SARS-CoV-2 3C-like main protease-2-O-Caffeoyl-(+)-allohydroxycitric acid binding mechanism.
Because 2-O-Caffeoyl-(+)-allohydroxycitric acid exhibited the most favourable docking score towards SARS-CoV-2 3CL pro amongst all the investigated compounds, as shown in Table 2, we analysed its possible binding mechanism and its interaction profile with binding site residues. As shown in Fig. 1, an analysis of the binding interactions of 2-O-Caffeoyl-(+)-allohydroxycitric acid with 3CL pro indicated the formation of strong intermolecular interactions with crucial binding site residues. Notably, strong conventional hydrogen bond interactions formed with Cys145, Asn142 and His163. Cys145 also engages in an additional pi-cation interaction with the bound inhibitor, emphasizing its importance to the binding of 2-O-Caffeoyl-(+)-allohydroxycitric acid. A study by Hall et al. (2020) 134 reported that His163 is essential to the inhibition of 3CL pro since the mutation of its homologous residue His162 in SARS-CoV-2 protease inactivates 3CL pro 134 . As such, the conventional hydrogen bond interaction between 2-O-Caffeoyl-(+)-allohydroxycitric acid and His163 further highlights the importance of this residue and also predicts a possible inhibitory potential of 2-O-Caffeoyl-(+)-allohydroxycitric acid against 3CL pro . 2-O-Caffeoyl-(+)-allohydroxycitric acid was also observed to engage in a conventional hydrogen bond interaction with Cys145, one residue of the catalytic dyad (Cys145 and His41) 135 of 3CL pro . Therapeutic modulation of the catalytic dyad has been reported to impact the catalytic activity and overall conformational fold of 3CL pro , owing to the role of the catalytic dyad in facilitating the cleavage of SARS-CoV-2 polyproteins 136 . Therefore, the observed high-affinity hydrogen bond interaction between 2-O-Caffeoyl-(+)-allohydroxycitric acid and Cys145 suggests a possible inhibitory modulation of the catalytic dyad, thereby warranting its further investigation as a potential inhibitor of 3CL pro .
SARS-CoV-2 RNA dependent RNA polymerase-Geraniin complex binding mechanism. As shown in Fig. 2, Geraniin, which exhibited the most favourable docking score amongst the studied phytochemicals against SARS-CoV-2 RdRp, forms a pi-alkyl bond with Arg550, a conventional hydrogen bond with both Arg555 and Ala553, and a pi-cation interaction with Arg836. Geraniin also forms a conventional hydrogen bond with Asn691.
SARS-CoV-2 receptor binding domain-Geraniin complex binding mechanism. As shown in Table 4, Geraniin also exhibited the most favourable docking score toward the RBD of the SARS-CoV-2 viral S-protein. By examining its residue interaction profile with the RBD, we explored its possible binding mechanism. A successful blockage of the RBD of the viral S-protein by Geraniin could impede the binding of the RBD to the host cell receptor. As shown in Fig. 3, Geraniin is engaged in a vast network of interactions. Notably, conventional hydrogen bond interactions were formed with Arg403, Tyr495, Tyr453, Ser494, Gln493, Gln498 and Tyr505, while a carbon-hydrogen interaction is observed with Gln498. These strong conventional hydrogen bond interactions could anchor Geraniin within the binding pocket, ensuring its stability for favourable binding and significant interruption of the activity of the RBD of the viral S-protein. The interacting residues were also consistent with dominant residues reported by several studies and with residues crucial to the inhibition of the RBD of the viral S-protein 137 . These structural insights, in addition to the previously reported anti-viral activity of Geraniin 90,92 , necessitate further investigation of Geraniin as a possible inhibitory candidate of the receptor-binding domain of the viral S-protein.

Identified hits exhibit favorable binding free energy towards SARS-CoV-2 3CL pro , RdRp and RBD of viral S protein.
Inhibitor stability within the binding pocket is crucial in determining biological processes with significant pharmaceutical implications. Therefore, to establish the stability of the identified hits within the respective SARS-CoV-2 targets, we assessed their binding free energies over the simulation period using the MM/PBSA approach, since binding affinities from molecular docking are inconclusive. The MM/PBSA calculations also allowed for a quantitative determination of absolute binding affinities of the identified hits 138 . The calculated binding free energies provide insight into the mechanism by which the respective SARS-CoV-2 targets recognize the identified hits 139 . As shown in Table 3, the estimated binding free energies of Geraniin towards SARS-CoV-2 RdRp and the RBD of the viral S protein were −25.87 kcal/mol and −21.74 kcal/mol, respectively, while the binding free energy of 2-O-Caffeoyl-(+)-allohydroxycitric acid against 3CL pro was −32.00 kcal/mol. Overall, both compounds exhibited strong binding affinity towards their respective targets, corroborating the strong interactions formed within the binding pockets as revealed by the interaction analyses. 2-O-Caffeoyl-(+)-allohydroxycitric acid exhibited a binding free energy similar to that of Ritonavir, a reported 3CL pro inhibitor 140 , which showed a total binding free energy of −32.34 kcal/mol. In contrast, a comparison with the known SARS-CoV-2 RdRp inhibitor Remdesivir, which demonstrated a binding free energy of −33.34 kcal/mol, showed that Geraniin exhibited a relatively less favourable (less negative) binding free energy. This generally favourable binding affinity of the studied phytochemicals, in addition to the structural insights provided, prompts further investigation of these compounds as potential inhibitors.
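The comparison with the reference inhibitors reduces to simple differences between the quoted binding free energies. The sketch below computes those gaps using only the values stated in the text; the helper name `gap_to_reference` is hypothetical, and the sign convention (a positive gap means the candidate binds less favourably than the reference) is a choice made for this illustration.

```python
# Differences between quoted binding free energies (kcal/mol).
# All numbers are the values reported in the text; the helper is hypothetical.

pairs = {
    "Geraniin vs Remdesivir (RdRp)": (-25.87, -33.34),
    "2-O-Caffeoyl-(+)-allohydroxycitric acid vs Ritonavir (3CLpro)": (-32.00, -32.34),
}


def gap_to_reference(candidate, reference):
    """Candidate minus reference; positive means the candidate is less favourable."""
    return round(candidate - reference, 2)


for label, (candidate, reference) in pairs.items():
    print(f"{label}: {gap_to_reference(candidate, reference):+.2f} kcal/mol")
```

On these numbers Geraniin sits 7.47 kcal/mol above Remdesivir, whereas 2-O-Caffeoyl-(+)-allohydroxycitric acid is within 0.34 kcal/mol of Ritonavir.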

Assessing the structural and conformational changes of SARS-CoV-2 therapeutic targets upon binding of Geraniin and 2-O-Caffeoyl-(+)-allohydroxycitric acid. As a reliable and widely employed computational technique, molecular dynamics simulations were used to conduct a time-dependent prediction of the structural and conformational motions that occur on the SARS-CoV-2 therapeutic targets upon the binding of the identified bioactive compounds [141][142][143] . Any observed structural changes on these SARS-CoV-2 targets could contribute to the potential inhibitory activity of the compounds. With an adequate 200 ns MD simulation period, we calculated the root mean square deviation (RMSD) 144 and root mean square fluctuation (RMSF) 143,145 to assess conformational stability and residue flexibility of each of the therapeutic targets as associated with the inhibitor binding of the phytochemicals.
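For readers unfamiliar with the two metrics, the sketch below shows how RMSD and RMSF are defined on a set of atomic coordinates. It is a pure-Python illustration on a toy, hypothetical three-frame trajectory of two atoms, and it assumes the frames are already superimposed on the reference; it is not the CPPTRAJ workflow used in the study, which also handles the structural fitting.

```python
import math

# Toy illustration of RMSD and RMSF.
# Assumption: frames are already superimposed on the reference structure.
# The coordinates below are hypothetical and not taken from the study.


def rmsd(frame, reference):
    """RMSD between two conformations given as lists of (x, y, z) tuples."""
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom fluctuation about each atom's mean position over all frames."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        mean_sq_dev = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(mean_sq_dev))
    return result


trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
]

rmsd_series = [rmsd(frame, trajectory[0]) for frame in trajectory]
rmsf_per_atom = rmsf(trajectory)
print("RMSD per frame:", [round(v, 3) for v in rmsd_series])
print("RMSF per atom: ", [round(v, 3) for v in rmsf_per_atom])
```

RMSD gives one value per frame (deviation of the whole structure from the reference), while RMSF gives one value per atom (how much that atom moves around its own average position). In the toy data the first atom never moves, so its RMSF is zero.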

2-O-Caffeoyl-(+)-allohydroxycitric acid-binding perturbs 3CL pro .
Several recent reports have investigated the conformational dynamics of unliganded SARS-CoV-2 3CL pro , including a recent molecular dynamics simulation study by Suarez and Diaz (2020) 146 , which revealed that domain III of 3CL pro is generally unstable, while the presence of a peptide substrate induces a stable interdomain arrangement in the monomeric conformation of the protease. These conformational changes are the hallmarks of the SARS-CoV-2 target dynamics and correlate with its overall functioning 146 . By calculating the RMSD of the C-α atoms of 3CL pro over the 200 ns simulation period, the impact of the binding of 2-O-Caffeoyl-(+)-allohydroxycitric acid on the stability of 3CL pro was assessed. The stability of the protein structure is crucial in the maintenance of its function 147 . As shown in Fig. 4A

Geraniin binding distorts conformational integrity of SARS-CoV-2 RdRp.
A recent comparative molecular dynamics simulation study by Koulgi et al. (2020) 148 of the unbound and Remdesivir-complexed structures of SARS-CoV-2 RdRp showed the blocking of the template entry site upon Remdesivir binding 148 . Their report further revealed that Remdesivir binding is characterised by structural instability and increased residue flexibility. To ascertain the inhibitory potential of Geraniin against RdRp, we also assessed the conformational dynamics of RdRp upon Geraniin binding. In a mechanism similar to that of Remdesivir, the binding of Geraniin also increased the deviation of the C-α atoms of RdRp, implying structural instability, as shown in Fig. 5A, whereby a relatively higher average RMSD of 3.08 Å was calculated for the Geraniin bound RdRp. The unbound RdRp, on the other hand, exhibited an average RMSD of 2.5 Å. Likewise, as shown in Fig. 5B, the binding of Geraniin also induced prominent residue fluctuations, as was reported for Remdesivir binding in the study by Koulgi et al. (2020) 148 . An average RMSF of 32.01 Å was estimated for the Geraniin bound RdRp, while an average RMSF of 21.70 Å was calculated for the unbound conformation. In summary, it could be inferred that the binding of Geraniin induced structural changes in SARS-CoV-2 RdRp in a mechanism similar to that of Remdesivir. As such, Geraniin could be further investigated as a potential inhibitor of SARS-CoV-2 RdRp.
Geraniin binding influences the receptor accessibility or inaccessibility of the spike protein. According to a recent report by Gur et al. (2019) 149 , the down and up positions of the SARS-CoV-2 RBD can interfere with the accessibility of the spike protein by controlling its open (receptor accessible) and closed (receptor inaccessible) positions. Therefore, any conformational changes of the RBD induced by a bound inhibitor could influence any intended therapeutic inhibition. A calculation of the RMSD of the simulated RBD models, as presented in Fig. 6A and B, revealed that the unbound conformation of the RBD showed an average RMSD of 7.20 Å, whereas the Geraniin bound RBD showed an average RMSD of 10.17 Å. The higher average RMSD of the bound conformation may suggest that the binding of Geraniin increased the deviation of the C-α atoms and hence decreased the conformational stability of the RBD. The flexibility of the individual amino acids of the RBD was also assessed. As shown in Fig. 6, average RMSF values of 12.96 Å and 13.08 Å were calculated for the unbound and inhibitor-bound conformations of the RBD, respectively. Although the difference in average residue fluctuations between the bound and unbound conformations was minimal, the relatively higher average RMSF of the Geraniin bound structure is consistent with increased residue flexibility, suggesting that the binding of Geraniin distorted the residue integrity of the RBD, which subsequently increased the residue motions as observed. This increased residue mobility of the RBD upon Geraniin binding could in turn favour a down and up motion of the RBD and hence possibly influence the receptor accessibility or inaccessibility of the spike protein, as postulated by Gur et al. (2019).

Assessing the pharmacokinetic properties of Geraniin and 2-O-Caffeoyl-(+)-allohydroxycitric acid.
The physicochemical and pharmacokinetic features of drugs are crucial to their overall therapeutic success. As such, we analysed the physicochemical and pharmacokinetic properties of Geraniin and 2-O-Caffeoyl-(+)-allohydroxycitric acid using the online platform SwissADME 120 . An in silico assessment of these properties, notably absorption, distribution, metabolism, and excretion, offers insights into the pharmacokinetics of a given small-molecule inhibitor in vivo while minimizing the risk of being disapproved during  Table 4. Prediction of the compounds' lipophilicity by assessing their LogP o/w showed that both compounds have poor lipophilicity. The lipophilicity of a compound significantly influences pharmacokinetic properties such as absorption, distribution, permeability, and routes of drug clearance. A favourable logP usually ranges between 3 and 5. A low logP often indicates lower membrane permeability and poor absorption [150][151][152] . The logP o/w values of −1.71 and −0.65 suggest that both compounds may permeate lipid membranes poorly and exhibit poor bioavailability. Nonetheless, given the large MW of Geraniin (952.65 g/mol), its synthetic fragmentation into smaller, simpler compounds could increase its bioactivity and decrease toxicity 153 . Our results indicate that Geraniin and 2-O-Caffeoyl-(+)-allohydroxycitric acid exhibited poor pharmacokinetic properties. However, further experimental explorations based on the favourable binding mechanisms could lead to the discovery of novel SARS-CoV-2 inhibitors.
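The lipophilicity argument above amounts to checking each predicted logP against the favourable window stated in the text. The sketch below applies only that stated window (3 to 5) to the two reported logP o/w values; the function name is hypothetical, and this simple range test is an illustration rather than the SwissADME procedure.

```python
# Range check of the reported logP o/w values against the window quoted in the text.
# `logp_in_favourable_range` is a hypothetical helper, not part of SwissADME.

FAVOURABLE_LOGP = (3.0, 5.0)  # favourable logP range as stated in the text


def logp_in_favourable_range(logp, bounds=FAVOURABLE_LOGP):
    """Return True when logP lies inside the favourable window (inclusive)."""
    low, high = bounds
    return low <= logp <= high


reported_logp = (-1.71, -0.65)  # the two values reported in the text
for value in reported_logp:
    verdict = "within" if logp_in_favourable_range(value) else "outside"
    print(f"logP o/w = {value:+.2f}: {verdict} the favourable range")
```

Both reported values fall below the lower bound of the window, which is the basis for the poor-lipophilicity conclusion.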

Conclusion
In conclusion, this in silico study identified two phytochemical compounds,

Study limitation
The authors acknowledge that computational molecular docking analysis and MD simulations have their limitations, and that further laboratory and clinical studies are needed to validate the inhibitory effects of these candidates against SARS-CoV-2 as potential drugs for COVID-19.

Future perspective/implications of results
To the best of our knowledge, this is the first in silico study of the phytochemical compounds Geraniin and 2-O-Caffeoyl-(+)-allohydroxycitric acid, isolated from the ethanolic leaf extract of S. mombin, against SARS-CoV-2 RNA-dependent polymerase, 3CL pro , and the receptor binding domain of the viral S-protein.
It is therefore envisaged that interest will be generated for in vitro study of the inhibitory potency of the crude ethanolic extract of S. mombin and/or pure compounds of Geraniin and 2-O-Caffeoyl-(+)-allohydroxycitric acid towards the discovery of novel SARS-CoV-2 therapeutics. www.nature.com/scientificreports/