In silico mutational analysis of ACE2 to check the susceptibility of lung cancer patients towards COVID-19

Being the second major cause of death worldwide, lung cancer poses a significant threat to the health of patients. This worsened during the era of pandemic since lung cancer is found to be more prone to SARS-CoV-2 infection. Many recent studies imply a high frequency of COVID-19 infection associated severe outcome. However, molecular studies are still lacking in this respect. Hence the current study is designed to investigate the binding affinities of ACE2 lung cancer mutants with the viral spike protein to find the susceptibility of respective mutants carrying patients in catching the virus. Quite interestingly, our study found lesser binding affinities of all the selected mutants thus implying that these cancer patients might be less affected by the virus than others. These results are opposed to the recent studies’ propositions and open new avenues for more in-depth studies.

www.nature.com/scientificreports/ cancer, and metastatic cancer 11 . A high fatality rate was observed in lung cancer patients as compared to other cancer patients having COVID-19 [16][17][18][19] . Although recent studies have stated that COVID-19 patients who have cancer have a high possibility of severe outcomes than patients without cancer, but there is no study pointing towards the possible molecular mechanism behind it. However, Yu et al. reported that lung cancer patients have more severity of the viral disease than other patients as the major entry route for the virus is through the lungs, and that too with reasons unknown [16][17][18][19] . Therefore, this work has been designed to investigate the binding affinities of the viral spike protein toward mutated ACE2 proteins to decipher the susceptibility and probable molecular mechanism of the COVID-19 infection concerning ACE2.

Materials and methods
ACE2 mutants and SARS-CoV2 spike protein. The 3D structures of ACE2 and SARS-CoV2 spike protein were retrieved from RCSB-PDB. The RCSB-PDB code for the structures of SARS-CoV2 spike protein and ACE2 is 6M17 20 , Missense mutations and frameshift mutations in the ACE2 gene in lung tissues were curated from the COSMIC cancer database. All the mutations in the study were used in accordance with the database regulations and guidelines (https:// cancer. sanger. ac. uk/ cosmic). These mutations were induced in the 3D structure of ACE2 through mutagenesis in PyMOL 21 . Structures were visualized via UCSF Chimera Version 1.14 22 .
Protein-protein docking. Protein-protein docking was performed on the HADDOCK server (https:// wenmr. scien ce. uu. nl/ haddo ck2.4/), a web server for the docking of biomolecular structures, and the obtained binding affinities of the docked poses of the spike glycoprotein of SARS-CoV-2 with all the cancer mutants of ACE2 were analyzed 23 . Binding residues of ACE2 (24,27,28,30,31,34,35,37,38,41,42, Analysis of docking. Based on HADDOCK scores, the top 5 docked complexes of ACE2 cancer mutants and spike protein with higher docking scores were assessed for stability and binding affinity (kcal mol −1 ). This analysis was performed on Protein Binding Energy Prediction (PRODIGY) server 25,26 . PRODIGY predicts stability and binding affinity of docking complex based on structural properties of both interacting molecules in complex. Binding affinity was demonstrated by ΔG (kcal mol −1 ) and stability by the dissociation constant Kd (M). Analysis was done at different temperature ranges. Docked complexes with higher binding affinity were subjected to molecular dynamics simulation.
Molecular dynamic simulation and MM-GBSA. The molecular dynamic simulations were carried out using the Desmond simulation package of Schrödinger LLC 27 . Molecular dynamic simulations and MM-GBSA calculations were described in detail in the supplementary information, and it was conducted according to previous work [28][29][30]  Protein to protein docking analysis. Due to simple scoring methods, protein docking is not reliable enough for the prediction of binding affinity between two interacting proteins in complexes 41 . The binding affinity of two proteins in a complex relies on dissociation constant (Kd), temperature and pH also 42 . These factors are not included in docking scoring methods of docking servers. Therefore, the binding affinity of frontier complexes (top 5) of interacting partners with the lowest docking scores was checked through the PRODIGY server (Tables 3, 4, 5, 6), binding affinity of all 5 complexes at core body temperature (37 ℃) is represented in Table 3. High stability and strong binding affinity between two interacting proteins in a complex are indicated by a smaller Kd value 43 . Spike protein-ACE mutants showed high stability and binding affinity as the temperature rises, and at the optimum temperature, stability and binding affinity became constant. Whereas the lowest protein stability and binding affinity were seen at 25℃, and highest on the optimum temperature ( www.nature.com/scientificreports/ . Other type of interactions which may affect the binding affinity (Kd), and stability (ΔG) of docked complexes are also represented in Table 4. A change in the dissociation constant of different mutants was noted. While a decrease of 1 log fold was observed for V491L, an increase of 2 log folds was present in L320F and Q60E. However, the Kd for mutants P84A and D693N were similar to those of controls (Table 3). Moreover, a slight increase in the Kd was also observed with increasing temperatures for mutants D693N, V491L, L320F, and Q60E (Tables SI 3-8). The reason for this might be increased energy imparted with the increasing temperatures, which might be making the complexes less stable, hence an increase in Kd. Another important point to infer from these is the change in binding affinities with physiological temperatures of the human body under different states, such as normal hypothermic and hyperthermic conditions. The results also imply the less binding affinities of these ACE2 mutants under hyperthermic conditions, such as fever, and vice versa. A similar study has been performed by Basit and colleagues, where the binding affinity of ACE2 was shown to be constant at different temperatures; however, the dissociation constant was shown to be increased as the temperature increased 44 .     www.nature.com/scientificreports/ SARS-CoV-2 Spike, and the ACE2 were reported and analyzed. The total energy along with the potential energy of each system was monitored during the simulation and the average was reported in Table S1.

Proteins and complex Root Mean Square Deviation (RMSD) analysis.
To monitor the impact of simulation on the stability of the SARS-CoV-2 Spike-ACE2 complexes, the root mean square deviations values were reported as a function of time for all C α atoms of the proteins with respect to their initial positions. The RMSD results for SARS-CoV-2 Spike-ACE2 complexes were plotted as a function of simulation time and presented in Fig. 6, while the RMSD for the SARS-CoV-2 spike protein was presented in Fig. 7; finally, the RMSD of the ACE2 was plotted in Fig. 8. As it can be seen from Fig. 6, most complexes showed a stable most complexes showed stability with RMSD around 3-4 Å except for CNT-complex and L320F-complex. The CNT-complex, D693N-complex, and the L320F-complex fluctuated till around 100 ns, 180 ns, and 450 ns, respectively. Other Table 3. Binding affinity of docking complex and its dissociation constant (Kd) at 37 ℃. The binding affinity and stability of docked proteins are calculated in the form of ΔG (kcal mol −1 ) and Kd (M), respectively. Smaller Kd value is showing high stability and strong binding affinity between two proteins.        www.nature.com/scientificreports/ complexes showed stability at an early stage of the simulation and held that stability during the simulations; at around 470-490 ns. A jump in the RMSD for V491L was observed and due to the movement of the ACE domain. Further, the RMSD of each chain was studied individually with respect to its original position within the complex. Figure 7 shows the fluctuation of the spike domain of the protein-protein complex; as it can see most spike domains are at around 3 Å, which is acceptable fluctuations for proteins with such size. The only notable change occurs within the D693N Spike, and this fluctuation is due to the movement of the nonstructural part (loops) of the protein Figure SI1 ("Supplementary information"). Figure 8 presents the RMDS of the ACE2 protein moiety; again, most proteins reached a plateau at around 200 ns, with an RMSD of ~ 3.5 Å. ACE2-D693N showed a high fluctuation at about 20-180 ns due to the instability of the tail (N-terminal) of the ACE2 during simulation, Fig. SI2 ("Supplementary information"). Figure 4 shows the position of the SARS-CoV-2 Spike with regard to ACE 2 at the beginning of the simulation (0 ns) and the end of the simulations (500 ns). In general, all ACE 2 hold position to the SARS-CoV-2 Spike; the hydrogen bonding breaking, and formation was monitored during the simulation and will be discussed later. Figure 9 shows a surface presentation of the first and last frame of the trajectories for a better presentation of the movement of the two proteins with respect to simulation time, and the residuals interactions will be discussed later through the course of the manuscript.
Root mean square fluctuation (RMSF). The RMSF helps characterize local changes along the protein chain. The RMSF plot peaks indicate the part of the protein that fluctuates the most during the simulation. Typically, the N-and the C-terminal fluctuates the most. Also, secondary structures like α-helices and β-strand consider more rigid than the unstructured part of the protein and fluctuate less than loops regions. Figure 10 presents the RMSF of all complexes, residue index from 0 to 182 corresponding to the SARS-CoV-2 Spike, while the residue index from 183 to 890 corresponding to ACE 2. The fluctuations between 15 and 65 corresponding to residuals ARG357 to PHE400 of the SARS-CoV-2 Spike, and the peak at 182 presents the C-terminal of the SARS-CoV-2 Spike. The peak at 298 refers to the fluctuation of ASN134, PRO135, and ASP136 of the ACE 2 protein, and the peak at 790 is due to the fluctuations of the following residuals ALA627, LEU628, and GLY629. The RMSF of L320F is provided in the supplementary information for the residual numbering example. Protein secondary structure elements (SSE) like alpha-helices and beta-strands are monitored throughout the simulation for all complexes and reported in SI 9-20.

Hydrogen bond analysis. Hydrogen bonds consider the most important interactions in protein chemis-
try; the protein structure depends on this interaction, including 2D, 3D, and quaternary structures. The hydrogen bond between the SARS-CoV-2 Spike and the ACE 2 was monitored during the simulation time and plotted as a function of time (Fig. 8). As can be seen from Fig. 6, the H-bond between the two proteins fluctuated between 0 and 25 H-bonds, the average hydrogen bond interaction for the last 50 ns was calculated and reported in Table 5. L320F showed the most H-bond interactions with an average of 18.9 H-bonds, followed by Q60E and P84A with H-bonds interactions of 15.6 and 12.8, respectively. The number of H-bond interactions in the last frame was also reported in Table 5, and some of the residuals involved in these H-bonds were reported in Table SI 2. H-bond interactions between the SARS-CoV-2 Spike and ACE 2 during simulations, as seen in Fig. 11. Using Ligplot, all the complexes were examined for hydrogen bonding interactions between 0 and 500 ns. (Fig. 11 and SI 3-8).
MM-GBSA calculations. Schrodinger software has a python script called thermal_mmgbsa.py, which was used to calculate the MM-GBSA from the trajectories and extract the average binding energies, including the average MM-GBSA binding energy, average Coulomb energy, average Covalent binding energy, average Van der Waals energy, average Lipophilic energy, average Generalized Born electrostatic solvation energy, and average Hydrogen-bonding energy. All the obtained energies are shown in Table 6.
As it can be seen from Table 6, CNT and L320F showed the highest binding energy with − 87.62 and − 86.61 kcal mol −1 , respectively, while D693N showed the lowest binding energy − 69.30 kcal mol −1 . Coulomb energy is associated with electrostatic forces of the system and reflects the ionic interactions of the system. CNT, D693N, and Q60E showed good ionic interactions, with Q60E having the highest value, while V491L seem to lose ionic interactions. Most complexes showed good H-bond energies, as well as good Van der Waals energies.

Conclusion
In this research, the relationship between lung cancer and COVID-19 was addressed at a molecular level through computational study. The binding affinity of the viral spike protein of the SARS-CoV-2 glycoprotein towards a mutate ACE2 was investigated, using both docking and molecular dynamic simulation approaches. Among 25 selected mutations, it was found that five mutations have higher binding energies than others from a docking perspective. These five complexes' stability was studied and investigated further considering molecular dynamics simulations. Finally, the binding free energy calculations using the MM-GBSA approach were implemented and showed that these mutations have a binding energy of the following order CNT > L320F > Q60E > P84A > V491L > D693N. These findings suggest that some cancer patients will be less affected than others, even though most reported mutations were not within the active site of interactions; the binding energy was affected by these mutations. These results somehow oppose those clinical studies which state that lung cancer patients are more prone to COVID-19 infection based on the healthcare scenario they have observed. That might be due to the contact of healthcare system with COVID-19 to lung cancer patients during treatment. However, more in-depth studies are needed to be performed to reach at a valid endpoint.  www.nature.com/scientificreports/ Data availability COSMIC Cancer Database (https:// cancer. sanger. ac. uk/ cosmic) was used to curate ACE2 lung cancer mutations. These mutations were publicly available on the website of database to be used for research work. These mutations were further analyzed to check their susceptibility for COVID-19. Throughout the manuscript used mutations were cited properly (Table 1). According to the rules and regulations of the database, data can be used for research purposes and correct citation is required, and so it was the done the same.