Hot spot profiles of SARS-CoV-2 and human ACE2 receptor protein protein interaction obtained by density functional tight binding fragment molecular orbital method

The prevalence of a novel β-coronavirus (SARS-CoV-2) was declared as a public health emergency of international concern on 30 January 2020 and a global pandemic on 11 March 2020 by WHO. The spike glycoprotein of SARS-CoV-2 is regarded as a key target for the development of vaccines and therapeutic antibodies. In order to develop anti-viral therapeutics for SARS-CoV-2, it is crucial to find amino acid pairs that strongly attract each other at the interface of the spike glycoprotein and the human angiotensin-converting enzyme 2 (hACE2) complex. In order to find hot spot residues, the strongly attracting amino acid pairs at the protein–protein interaction (PPI) interface, we introduce a reliable inter-residue interaction energy calculation method, FMO-DFTB3/D/PCM/3D-SPIEs. In addition to the SARS-CoV-2 spike glycoprotein/hACE2 complex, the hot spot residues of SARS-CoV-1 spike glycoprotein/hACE2 complex, SARS-CoV-1 spike glycoprotein/antibody complex, and HCoV-NL63 spike glycoprotein/hACE2 complex were obtained using the same FMO method. Following this, a 3D-SPIEs-based interaction map was constructed with hot spot residues for the hACE2/SARS-CoV-1 spike glycoprotein, hACE2/HCoV-NL63 spike glycoprotein, and hACE2/SARS-CoV-2 spike glycoprotein complexes. Finally, the three 3D-SPIEs-based interaction maps were combined and analyzed to find the consensus hot spots among the three complexes. As a result of the analysis, two hot spots were identified between hACE2 and the three spike proteins. In particular, E37, K353, G354, and D355 of the hACE2 receptor strongly interact with the spike proteins of coronaviruses. The 3D-SPIEs-based map would provide valuable information to develop anti-viral therapeutics that inhibit PPIs between the spike protein of SARS-CoV-2 and hACE2.

domain (RBD) of S1 undergoes hinge-like conformational changes that transiently conceal or reveal the determinants of receptor binding 3 . SARS-CoV-2 could possibly use angiotensin-converting enzyme 2 (hACE2), the same receptor as SARS-CoV 4 and HCoV-NL63. Since the essential function of the S protein is to penetrate host cells, it is considered as the optimal target for the prevention of cell infection. For this reason, S protein-targeted antibody-mediated neutralization has been considered as a suitable treatment for SARS-CoV diseases. Therefore, the hot spot analysis on the interface between the RBD domain of the S1 subunit and the hACE2 receptor would provide crucial information for antibody engineering and for small-molecular drug development.
To investigate protein-protein interactions (PPIs) between hACE2 and the RBD domain of S1 subunit at the molecular level, an ab initio quantum mechanical (QM) method was introduced. This method was used to obtain the most accurate information on the PPIs through analysis of the wave function obtained from the QM calculation, especially of the fragment molecular orbital (FMO) approximation method. Even with the FMO method, the calculations in a biomolecular system need a huge amount of computer resources. In order to obtain results within a reasonable computation time while maintaining a certain degree of accuracy of ab initio MO, we introduced the density functional tight-binding (DFTB) method, which is an efficient parameterized QM method and is expected to exhibit reasonable accuracy at a remarkably reduced computational cost 7 . The FMO method is one of various linear-scaling methods to reduce the huge computational cost of QM calculations by the fragmentation of target molecules. The energies of fragment and their pairs are computed in the embedding electrostatic potential 8 . Recently, the FMO method has been combined with DFTB, and the polarizable continuum model (PCM) was introduced to consider the effect of a solvent on a model system 9 . Pair interaction energies (PIEs) among the fragments of the model system from the FMO-DFTB/PCM method correlate well with PIEs from ab initio DFT FMO/PCM and with an ignition Møller-Plesset perturbation theory (MP2) FMO/PCM 9 . In our earlier work, we investigated PPIs between programmed cell death 1 and its ligand PD-L1 using FMO-MP2/ PCM and the results efficiently explained the experimental site-directed mutagenesis data 10 .
In this work, to find common hot spot amino acids on the interfaces between the RBD domain and hACE2 of the three complexes, RBD-SARS-CoV-2/hACE2 (twelve experimental structural data), RBD-SARS-CoV-1/ hACE2 (four experimental structural data), and RBD-HCoV-NL63/hACE2 (one experimental structural data), we performed FMO-DFTB3/D/PCM calculations. To visualize the interaction energy and the distance of the interacting amino acid pairs, the FMO/3D-SPIEs analysis tool was introduced. To narrow down the hot spot region, we also performed the same calculation with RBD-SARS-CoV-1/antibody complexes (five experimental structural data). Based on the FMO/3D-SPIEs results, we constructed 3D-SPIEs-based interaction maps of the hACE2 and RBD domains from SARS-CoV-1, HCoV-NL63, and SARS-CoV-2.
In order to validate the PPI predictability of the FMO-DFTB3/D/PCM results, we compared them with the site-directed mutagenesis results. Consequently, we summarized the FMO-DFTB3/D/PCM/3D-SPIEs results as interaction maps and found the hot spot regions in RBD-SARS-CoV-2 and hACE2 at a QM level.

Computational methods
All experimental structures calculated in this work are summarized in Table 1. All missing side chains were filled using Prime implemented in Maestro program 11 . Hydrogen atoms were added to the crystal structures at pH 7.0 and their positions were optimized with the PROPKA function implemented in Maestro program 12 . Water molecules in the crystal structures were included in the FMO calculations to explore their roles in PPIs. In the FMO calculations, all N-acetylglucosamine (GlcNAc) residues in the RBD domains and hACE2 were included, only RBD domains of three coronaviruses were included, and only Fv domains of antibodies that bind to RBD domains were included.
All FMO calculations were performed with the version Feb 14, 2018 GAMESS 13 . The two-body FMO method was applied to all calculations in this work for the FMO2/DFTB3 method; this is a recent extension of the selfconsistent-charge density functional tight-binding method and derived via a third-order expansion of the DFT method 9 . DFTB3 calculations were performed using the 3OB parameter set 14,15 , and the UFF-type dispersion correction (DFTB3/D) 16,17 . Due to the exchange-repulsion term is not computed in DFTB, the EX terms are all zero (see Supplementary Table S2-S25). The polarizable continuum model (PCM), an implicit solvation model, was employed with the explicitly expressed water molecules present in the X-ray crystal and cryo-EM structures. All input files were prepared in compliance with the hybrid orbital projection (HOP) scheme fragmentation 18 . Each residue and water molecule was defined as one fragment. Two cysteine residues forming the disulfide bridge were defined as one fragment, and GlcNAc, with which the asparagine residue formed covalent bonds, was defined as one fragment. All 3D-SPIEs results were generated with the reported protocol 10 . In the protocol, we selected only PIEs within a specific distance (5.0 Å) between two fragments, which reflected the distance used for the approximate of electrostatic potential in FMO method 19 . We considered the interaction with an PIE more stable than − 3.0 kcal/mol to be significant on the basis of previous reports 10,20,21 .

Results and discussion
To investigate PPIs between hACE2 and RBD domains of the three coronaviruses, we collected 23 experimental structures summarized in Table 1 and performed FMO-DFTB3/D/PCM calculations for all the experimental structures. Due to structural arrangements from mutations summarized in Table S1, we collected all available structures to consider them together. Subsequently, we performed hot spot analysis using the FMO/3D-SPIEs tool.
Hot spot region between hACE2 and RBD-SARS-CoV-1. In order to investigate the hot spot region in the RBD of the SAR-CoV-1 and hACE2 receptor complex, we performed FMO calculations on 12 RBD-SARS-CoV-1/hACE2 complexes (Supplementary Table S2-S13). We summarized the FMO results in Fig 23 . When comparing the 69 amino acid pairs of this study with the mutagenesis experimental results from two papers, it was confirmed that 32 of the 69 amino acid pairs correlated with the experimental results.
The changes in the binding affinity between the proteins that form a complex by mutation can be explained by comparing the structural changes (i.e. changes in the amino acid pairs that contribute to the increase or decrease of the binding affinity) of the mutated proteins with those of the wild-type proteins. Qu et al. reported that the N479K/T487S mutation on RBD-SARS-CoV-1 lowers the binding affinity 24 . One complex (PDB ID: 3D0H) has the T487S mutation in RBD-SARS-CoV-1. T487 in WT RBD-SARS-CoV-1 attractively interacts with 6 amino acids, Y41, G326, N330, G354, F356, and R357, whereas S487 in the mutated complex attractively interacts with only 3 amino acids, N330, G354, and R357. Wu et al. reported that the K31T mutation on hACE2 increases the binding affinity 25 , because the K31 in WT hACE2 (PDB ID: 2AJF) interacts only with Y442 of RBD-SARS-CoV-1, whereas T31 in the mutated hACE2 (PDB ID: 3D0G) interacts with two amino acids, Y442 and Y475.
The common hot spot region in RBD-SARS-CoV-1 against hACE2 and SARS-CoV-1 antibodies. In order to narrow down the hot spot regions between hACE2 and RBD-SARS-CoV-1, we performed FMO calculations on four RBD-SARS-CoV-1/antibody complexes (Supplementary Table S14-S19). We summarized the FMO results in Fig. 1.  26 When comparing the 30 amino acid pairs of this study with the previously reported results, it was confirmed that 17 of the 30 amino acid pairs are correlated: R426/Y53, T433/W226, N437/R162, Y440/D182, P470/D202, N479/D182, D480/R162, D480/S163, D480/N164, D480/R223, Y481/R223, Y484/Y102, T486/Y53, T487/Y53, T487/D99, G488/A33, and Y491/D99. In RBD-SARS-CoV-1/m395 (PDB ID: 2DD8) complex, the FMO results detected 18 amino acid pairs, which are summarized in Supplementary Table S15. The amino acid pairs that contributed to the stability of the complexes are well correlated with the published sitedirected mutagenesis study, in which the T487 mutation does not significantly affect the neutralizing activity of the antibody 27 . The FMO results supported that T487S mutation would change only minor van der Waals interactions between T487 and HC Y32. In the RBD-SARS-CoV-1/S230 (PDB ID: 6NB6, 6NB7) complex, the FMO results detected 25 amino acid pairs, which are summarized in Supplementary Table S16-S18. The S230 binds to RBD-SARS-CoV-1 in different two states. The FMO results of state 1 are detailed in Supplementary Table S16, and those of the state 2 are mentioned in Supplementary Table S17-S18. In the RBD-SARS-CoV-1/ The interactions between four antibodies (80R, m395, S320, and F26G13) and RBD domain from SAR-CoV-1 are shown in the right-hand with color bars. the main hot spot region is colored in light red, and the secondary hot spot region in hACE2 is colored in light blue, and All interactions shown in this map have attractive PIE value more stable than − 3.0 kcal/mol, whose magnitudes are ignored. In order to find common hot spot amino acids in RBD-SARS-CoV-1 against hACE2 and SARS-CoV-1 antibodies, we illustrated the FMO results with a 3D-SPIEs-based map. (see Fig. 1). All four antibodies (80R, m395, S230, and F26G19) and hACE2 have two common amino acids, R426 and T487, in RBD-SARS-CoV-1. Three of the four antibodies and hACE2 have four common amino acids, T486, G488, I489, and Y491, in RBD-SARS-CoV-1. Two of the four antibodies and hACE2 have two common amino acids, F483 and Q492, in RBD-SARS-CoV-1. Only S230 and hACE2 share four common amino acids, D463, N473, Y475, and Y442, in RBD-SARS-CoV-1. Only 80R and hACE2 share two common amino acids, Q479 and Y484, in RBD-SARS-CoV-1. Other interactions between antibodies and RBD-SARS-CoV-1 do not share interactions between hACE2 and RBD-SARS-CoV-1. Considering the possibility of mutation prediction in viruses by the FMO methods 28,29 , the evolutionary process of SARS-CoV-1 can be performed to elude neutralization of antibody by switching the unshared interactions between the antibody and hACE2 receptor.
According to the map, there are two hot spot regions between hACE2 and RBD-SARS-CoV-1 (See Fig. 1). The main hot spot region on hACE2 consists of D38, Y41, K353, D355, and several residues. The counter part of that on RBD-SARS-CoV-1 comprises R426, T486, T487, I489, Y491, and so on. The secondary hot spot region on hACE2 receptor consists of D30, K31, and several residues. The counter part of that on RBD-SARS-CoV-1 comprises Y442, D463, N473, N479, and so on. We found that SARS-CoV-1 antibodies focus on the main hot spot to block the formation of amino acid pairs between hACE2 and RBD-SARS-CoV-1.

Hot spot region in hACE2/RBD-SARS-CoV-2 integrated with 3D-SPIEs-based interaction map.
We created a 3D-SPIEs based interaction map to find the hot spot regions from the PPI information between hACE2 and RBD-SARS-CoV-2 (see Figs. 1 and 2). When comparing the interacting residues between hACE2 and RBD of the three viruses, there are two hot spot regions consisting of shallow grooves on the hACE2 receptor. The main hot spot is formed by E37, K353, G354 and D355. The secondary hot spot consists of D30 and K31. According to the map, the main hot spot is expected to be the most important hot spot between hACE2 and RBD-SARS-CoV-2. We observed that the secondary hot spot on hACE2 has interactions with K417, L455, E484, P491, and Q493 in RBD-SARS-CoV-2, whereas the main hot spot has interactions with R403, F497, Q498, T500, N501, G502, Y505, and Q506 in RBD-SARS-CoV-2. The results from the common hot spot region in SARS-CoV-1 antibodies supported the results that the main hot spot region was important for the PPI between RBD-SARS-CoV-1 and its antibodies. In the results of SARS-CoV-2 and its antibody (B38) summarized in Supplementary Table S25, the antibody had interactions with R403, Q498, N501, G502, and Y505 of RBD-SARS-CoV-2, which are the counterpart of the main hot spot. It can be used to develop antibodies and antiviral agents by using the information of the hot spot regions suggested in this work.

Conclusions
Even though the FMO method was successfully applied to evaluate PPIs, analysis of biomolecular systems still requires huge computational costs. Here, we combined parameterized quantum chemical approaches (FMO-DFTB3/D/PCM) and the 3D-scattered pair interaction energies (3D-SPIEs) protocol to analyze PPIs between SARS-CoV-2 and hACE2 complex. The FMO-DFTB3/D/PCM/3D-SPIEs results also showed a qualitative Scientific RepoRtS | (2020) 10:16862 | https://doi.org/10.1038/s41598-020-73820-8 www.nature.com/scientificreports/ www.nature.com/scientificreports/ correlation with site-directed mutagenesis results, such as the FMO-MP2/PCM/3D-SPIEs results in our earlier work 10 . The reliable inter-residue interaction energy calculation method, FMO-DFTB3/D/PCM/3D-SPIEs, would be a powerful tool for drug discovery and protein engineering in the future. Furthermore, the quantum-mechanical-level hot spot analysis results will provide new directions for antibody engineering and small-molecule development. The 3D-SPIEs-based map would provide valuable information for the discovery of anti-viral therapeutics that inhibit PPIs between the spike protein of SARS-CoV-2 and hACE2.