The dimer-monomer equilibrium of SARS-CoV-2 main protease is affected by small molecule inhibitors

The maturation of coronavirus SARS-CoV-2, which is the etiological agent at the origin of the COVID-19 pandemic, requires a main protease Mpro to cleave the virus-encoded polyproteins. Despite a wealth of experimental information already available, there is wide disagreement about the Mpro monomer-dimer equilibrium dissociation constant. Since the functional unit of Mpro is a homodimer, the detailed knowledge of the thermodynamics of this equilibrium is a key piece of information for possible therapeutic intervention, with small molecules interfering with dimerization being potential broad-spectrum antiviral drug leads. In the present study, we exploit Small Angle X-ray Scattering (SAXS) to investigate the structural features of SARS-CoV-2 Mpro in solution as a function of protein concentration and temperature. A detailed thermodynamic picture of the monomer-dimer equilibrium is derived, together with the temperature-dependent value of the dissociation constant. SAXS is also used to study how the Mpro dissociation process is affected by small inhibitors selected by virtual screening. We find that these inhibitors affect dimerization and enzymatic activity to a different extent and sometimes in an opposite way, likely due to the different molecular mechanisms underlying the two processes. The Mpro residues that emerge as key to optimize both dissociation and enzymatic activity inhibition are discussed.

the Mpro dimer-monomer equilibrium in solution by SAXS and CD spectroscopy techniques in the absence of inhibitors. Considering the results obtained by Graziano et al. 20 for the SARS-CoV Mpro, we have chosen to investigate a range of protein concentrations from 3 to 30 µM, the molarity being expressed in terms of Mpro monomers. With these protein concentrations, one can discriminate between values of the dissociation constant that fall in the quite wide range of ≈ 0.2−200 µM (see Eq. 2). Subsequently, we have studied by SAXS experiments the Mpro dimer-monomer equilibrium in the presence of a series of potential inhibitors, selected from an in-house database containing commercial and synthetic compounds. Just one protein concentration, 30 µM, and two concentrations of inhibitors, 30 and 60 µM, corresponding to inhibitor-to-Mpro-monomer molar ratios of 1 and 2, have been investigated. Activity assays were also performed and the results are correlated with the inhibition of Mpro dimerization.

Mpro dimerization and thermal stability. The dimer-monomer equilibrium of SARS-CoV-2 Mpro has been investigated at different protein concentrations by performing in-solution SAXS experiments in the temperature range between 15 and 45 °C and far-UV CD measurements at room temperature. Far-UV CD spectroscopy was also used to study the Mpro thermal stability, monitoring the unfolding transition between 10 and 80 °C.
SAXS. SAXS data of SARS-CoV-2 Mpro recorded at the B21 beamline of the Diamond Synchrotron (Didcot, UK) at different protein concentrations and temperatures are shown as log-log plots in Fig. 2, top panels.
We have assumed that SAXS curves arise from both Mpro monomer and dimer species, according to the thermodynamic equilibrium dissociation process given by the relationship:

$$\mathrm{M}_2 \rightleftharpoons 2\,\mathrm{M}_1 \tag{1}$$

The corresponding equilibrium dissociation constant is

$$K_D = \frac{2 C x_1^2}{1 - x_1} = e^{-\Delta G_D/(RT)} \tag{2}$$

where $C$ is the total molar concentration of monomers, $x_1$ is the molar fraction of proteins that remain in the monomeric state, $\Delta G_D$ is the dissociation Gibbs free energy change, $R$ is the universal gas constant and $T$ the absolute temperature. To note, Eq. (2) can be solved in terms of $x_1$,

$$x_1 = \frac{-K_D + \sqrt{K_D^2 + 8 C K_D}}{4C} \tag{3}$$

According to classical thermodynamics, the temperature dependence of $\Delta G_D$ is

$$\Delta G_D = \Delta G_D^\circ - \Delta S_D^\circ\,(T - T^\circ) + \Delta C_{pD}\left[T - T^\circ - T \ln\frac{T}{T^\circ}\right] \tag{4}$$

where $\Delta G_D^\circ = -RT^\circ \ln K_D^\circ$ is the dissociation Gibbs free energy at the reference temperature $T^\circ = 298.15$ K ($K_D^\circ$ being the associated equilibrium constant), $\Delta C_{pD}$ is the change of the constant pressure heat capacity upon dissociation (here supposed independent of $T$) and $\Delta S_D^\circ$ is the dissociation entropy at $T^\circ$. The macroscopic differential scattering cross section, which is the experimental information provided by a SAXS curve, for a system of interacting monomers and dimers can be written as

$$\frac{d\Sigma}{d\Omega}(q) = \kappa\, C_N N_A\, P(q)\, S_M(q) + B \tag{5}$$

$N_A$ being Avogadro's number, $\kappa$ an unknown fraction of the nominal protein molar concentration $C_N$ ($C = \kappa C_N$), and $B$ an arbitrary flat background that takes into account possible uncertainties in the determination of the transmissions of protein and buffer samples. $P(q)$ represents the average form factor of the system

$$P(q) = x_1 P_1(q) + \frac{1 - x_1}{2} P_2(q) \tag{6}$$

where $P_j(q)$ stands for the form factor (which is the orientational average of the excess squared X-ray scattering amplitude) of the Mpro monomer ($j = 1$) or dimer ($j = 2$). We have calculated $P_j(q)$ from the recently determined crystal structure of the SARS-CoV-2 Mpro dimer 7 (PDB code 6y2e), considering one chain ($j = 1$) or both chains ($j = 2$), by means of the SASMOL method 23.
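The relation between the dissociation constant and the monomer fraction can be sketched numerically. The snippet below is a minimal illustration, assuming the mass-action form $K_D = 2Cx_1^2/(1-x_1)$ with $C$ expressed in monomers; the function names are hypothetical and the concentrations are only examples taken from the range discussed in the text.

```python
import math


def monomer_fraction(k_d: float, c_total: float) -> float:
    """Fraction x1 of protein chains in the monomeric state.

    Assumes K_D = 2*C*x1**2 / (1 - x1), where C is the total concentration
    expressed in monomers; k_d and c_total must share the same unit.
    """
    if k_d <= 0 or c_total <= 0:
        raise ValueError("k_d and c_total must be positive")
    # Positive root of the quadratic 2*C*x1**2 + K_D*x1 - K_D = 0
    return (-k_d + math.sqrt(k_d**2 + 8.0 * c_total * k_d)) / (4.0 * c_total)


def dissociation_constant(x1: float, c_total: float) -> float:
    """Inverse relation: K_D from a known monomer fraction x1 (0 <= x1 < 1)."""
    if not 0.0 <= x1 < 1.0:
        raise ValueError("x1 must lie in [0, 1)")
    return 2.0 * c_total * x1**2 / (1.0 - x1)


if __name__ == "__main__":
    k_d = 7.0  # µM, the value reported in the text at the reference temperature
    for c in (3.0, 10.0, 30.0):  # µM, illustrative monomer concentrations
        print(f"C = {c:5.1f} µM -> x1 = {monomer_fraction(k_d, c):.3f}")
```

Because $x_1$ varies appreciably with $C$ only when $K_D$ is comparable to the concentrations explored, a sketch of this kind also shows why the chosen concentration window limits the range of dissociation constants that can be discriminated.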
This method takes into account the contribution to the scattering due to the hydration water molecules around the protein, whose positions are found by embedding the atomic structure in a tetrahedral close-packed lattice. For the SARS-CoV-2 Mpro monomer and dimer, 726 and 1243 hydration water molecules have been calculated, respectively, suggesting that upon dimer formation about 200 water molecules are removed from the hydration shells of the two monomers. Hence, the water molecules attributed to each monomer decrease from 726 to 621 upon Mpro dimerization. This indicates that the dimerization process is accompanied by slight structural changes that reduce the average solvent-accessible area. The $S_M(q)$ term in Eq. (5) is the so-called "measured" structure factor, which describes the long range intermolecular interactions among all the particles in solution. For the sake of simplicity, here we consider a common effective structure factor that takes into account monomer-monomer, monomer-dimer and dimer-dimer interactions. Considering that at low $q$ all the experimental scattering curves (Fig. 2, top panels) show a positive deviation from a Guinier trend, indicative of the prevalence of protein-protein attraction over repulsion, we have approximated the structure factor by that of a fractal distribution of inhomogeneities developed by Teixeira 24, whose main parameters are $D$, the fractal dimension of the aggregates, $r_0$, the effective radius of the aggregating protein molecule, and $\xi$, the correlation length, which can be interpreted as the average size of the aggregates (see Eqs. 10, 11 and 12). The above-described model, which combines SARS-CoV-2 Mpro thermodynamic and structural features, has been adopted to simultaneously analyze the whole set of SAXS curves, recorded at different temperatures and concentrations, shown in Fig. 2, top panels.
Fitting parameters shared by all the curves are $K_D^\circ$, the dissociation equilibrium constant at $T^\circ$, $\Delta C_{pD}$, the change of the constant pressure heat capacity upon dissociation, and $\Delta S_D^\circ$, the dissociation entropy at $T^\circ$. Another parameter common to all the curves is the relative mass density of the hydration water (in general higher than 1), $d_h$, which is taken into account in the SASMOL method 23. The shared fitted parameters are shown in Table 1, while all the others are reported in Supplementary Table S1.
The most important parameter obtained by the simultaneous fit of SAXS data is the dissociation constant $K_D^\circ$, which turns out to be 7 ± 1 µM, in good agreement with the value obtained by Graziano et al. 20 on the very similar main protease from SARS-CoV. The corresponding dissociation Gibbs free energy (calculated with Eq. 2) is $\Delta G_D^\circ \simeq 30$ kJ mol⁻¹, a value quite similar to the one observed for the β-lactoglobulin dimer dissociation at neutral pH 25. Regarding the dissociation entropy, we have obtained a positive value, 50 ± 20 J mol⁻¹ K⁻¹, markedly smaller than the one derived for the above-mentioned β-lactoglobulin case 25. It should be noted that in a dissociation process many factors besides translational and rotational motions contribute to a positive dissociation entropy, and it is difficult to separate them. One such factor is certainly the removal of about 200 hydration water molecules from the monomer-monomer interface when the dimer is formed. The change of the heat capacity at constant pressure upon dissociation turned out to be positive and large. This parameter indirectly describes the monomer-monomer interface, as it can be attributed to the hydration and correlates with the interface size 26. The set of parameters reported in Table 1 allows the calculation of the Mpro equilibrium dissociation constant, together with its standard deviation, at any temperature. Results are shown in Supplementary Fig. S1, top left panel. We notice a slight increase of $K_D$ with $T$, an effect that is mainly due to the large increase of the constant pressure heat capacity upon dissociation. However, a deeper investigation of the monomer-monomer interface area and its relationship with the dissociation heat capacity 27 requires additional calorimetric experiments in order to obtain lower estimation errors.
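The conversion between $K_D^\circ$ and $\Delta G_D^\circ$, and the extrapolation of $K_D$ to other temperatures, can be sketched as follows. This is only an illustration, assuming a temperature-independent heat capacity change and the standard integrated form of the Gibbs free energy; the function names are hypothetical, and the heat capacity value used in the example is a placeholder, since the fitted value is reported in Table 1 and not in the text.

```python
import math

R = 8.314462618   # J mol^-1 K^-1, universal gas constant
T_REF = 298.15    # K, reference temperature


def delta_g_standard(k_d_molar: float, t: float = T_REF) -> float:
    """Dissociation Gibbs free energy (J/mol) from K_D expressed in mol/L."""
    return -R * t * math.log(k_d_molar)


def delta_g_at(t: float, dg0: float, ds0: float, dcp: float) -> float:
    """Delta G_D(T) for a constant heat capacity change dcp (J mol^-1 K^-1)."""
    return dg0 - ds0 * (t - T_REF) + dcp * (t - T_REF - t * math.log(t / T_REF))


def k_d_at(t: float, dg0: float, ds0: float, dcp: float) -> float:
    """Dissociation constant (mol/L) at absolute temperature t."""
    return math.exp(-delta_g_at(t, dg0, ds0, dcp) / (R * t))


if __name__ == "__main__":
    dg0 = delta_g_standard(7e-6)   # K_D = 7 µM from the text
    ds0 = 50.0                     # J mol^-1 K^-1, value quoted in the text
    dcp = 5000.0                   # J mol^-1 K^-1, PLACEHOLDER (not the fitted value)
    print(f"Delta G at 298.15 K: {dg0 / 1000:.1f} kJ/mol")
    for t_c in (15, 25, 35, 45):
        k = k_d_at(t_c + 273.15, dg0, ds0, dcp)
        print(f"{t_c} °C -> K_D = {k * 1e6:.1f} µM")
```

With $K_D^\circ$ = 7 µM the first function gives a free energy slightly below 30 kJ/mol, in line with the value quoted above; the temperature trend printed by the loop depends on the placeholder heat capacity and should not be read as a result.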
Finally, the relative density of the hydration shell is slightly larger than one, in agreement with previous literature results on globular proteins [28][29][30]. The determination of the thermodynamic features of the dimer-monomer equilibrium of Mpro, in conditions quite similar to those found in vivo, is a fundamental step to investigate the effects of drugs aimed at inhibiting dimerization, and underlines the importance of further investigating the Mpro monomer-monomer interface by in-solution techniques.
Far-UV CD. To provide further insights on the dimer-monomer equilibrium, we have measured the far-UV CD spectra of Mpro at three different concentrations, as shown in Fig. 3, left panel.
At the highest concentration of 16 µM, the ellipticity shows a minimum at a wavelength $\lambda_{min}$ of about 221 nm and a shoulder centered at about 208 nm, which are typical of proteins with α-helical and β-sheet content 31,32, fully consistent with the structural features of the SARS-CoV-2 Mpro 7, and in agreement with CD measurements of the same enzyme 33 and of the very similar SARS-CoV Mpro 34. As the concentration decreases, $\lambda_{min}$ shifts towards lower values, thus reporting an increase of the β-sheet component at the expense of the α-helical content 31. Quite interestingly, if we take the dissociation constant $K_D$ = 7 ± 1 µM, in the concentration range from 2.5 to 16 µM a decrease of about a factor 2 in the fraction of monomers is expected (from around 0.67 to 0.36, see Eq. 3). This large shift in the dimer-monomer populations can reasonably account for the changes in the Mpro secondary structure revealed by CD spectroscopy. Within this working hypothesis, we can describe the $\lambda_{min}$ trend in terms of the dimer-monomer equilibrium through the following expression:

$$\lambda_{min} = x_1\,\lambda_{min}^{mon} + (1 - x_1)\,\lambda_{min}^{dim} \tag{7}$$

where $x_1$ is given by Eq. (3) with a fixed value of $K_D$ = 7 µM as estimated by SAXS, while $\lambda_{min}^{mon}$ and $\lambda_{min}^{dim}$ are the minimum wavelength parameters corresponding to the monomer and the dimer spectra, respectively. As shown in the inset of Fig. 3 (left panel), the trend of the $\lambda_{min}$ values is very well fitted by Eq. (7).
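A population-weighted description of the CD minimum can be sketched as below. This is a hedged illustration, assuming that the observed minimum wavelength is a linear average of a monomer and a dimer value weighted by the monomer fraction; the function names and the two limiting wavelengths are hypothetical placeholders, not fitted values from the study.

```python
import math


def monomer_fraction(k_d: float, c_total: float) -> float:
    """Monomer fraction from K_D = 2*C*x1**2 / (1 - x1) (same units for both)."""
    return (-k_d + math.sqrt(k_d**2 + 8.0 * c_total * k_d)) / (4.0 * c_total)


def lambda_min(c_total: float, k_d: float, lam_mon: float, lam_dim: float) -> float:
    """Population-weighted minimum wavelength (nm); linear mixing is an assumption."""
    x1 = monomer_fraction(k_d, c_total)
    return x1 * lam_mon + (1.0 - x1) * lam_dim


if __name__ == "__main__":
    K_D = 7.0        # µM, fixed to the SAXS estimate quoted in the text
    LAM_MON = 217.0  # nm, PLACEHOLDER monomer limit
    LAM_DIM = 223.0  # nm, PLACEHOLDER dimer limit
    for c in (2.5, 8.0, 16.0):  # µM, illustrative concentrations
        print(f"C = {c:4.1f} µM -> lambda_min = {lambda_min(c, K_D, LAM_MON, LAM_DIM):.2f} nm")
```

With the monomer limit below the dimer limit, the model reproduces the qualitative trend described in the text, namely a shift of the minimum to shorter wavelengths on dilution.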
The thermal stability of Mpro has been characterized by monitoring the signal at 221 nm of the Mpro sample at 16 µM concentration, within the simplified hypothesis that the melting curve arises mainly from dimers. The rather sharp transition we have obtained is shown in Fig. 3 (right panel) and clearly suggests a two-state model, where the dimer unfolds and yields two random-coil monomeric chains:

$$\mathrm{N}_2 \rightleftharpoons 2\,\mathrm{U} \tag{8}$$

Within scheme (8), we obtain an apparent melting temperature $T_m$ of 50 °C, with a melting van't Hoff enthalpy of $\Delta H_v$ = 810 ± 60 kJ/mol. This value is in good agreement with the van't Hoff enthalpy $\Delta H_v \sim 880$ kJ/mol estimated through the equation $\Delta H_v = 4RT_m^2 C_{p,max}/\Delta H_{cal}$ from DSC measurements 33. It is also worth noting that, by taking $\Delta H_{cal}$ = 443 kJ/mol 33, the ratio turns out to be $\Delta H_v/\Delta H_{cal} \sim 1.8$: such a value larger than 1 is fully consistent with the unfolding transition being coupled to the dimer dissociation. Quite interestingly, the thermal stability as revealed by CD measurements supports a view where about 90% of the protein is in the folded state in the temperature range investigated by SAXS experiments, i.e. up to 45 °C, thus validating the model we used to interpret the corresponding scattering curves.
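As a quick arithmetic check of the enthalpy ratio, taking the values quoted in the text at face value:

$$\frac{\Delta H_v}{\Delta H_{cal}} = \frac{810\ \mathrm{kJ/mol}}{443\ \mathrm{kJ/mol}} \approx 1.8, \qquad \frac{880\ \mathrm{kJ/mol}}{443\ \mathrm{kJ/mol}} \approx 2.0$$

The first ratio uses the CD-derived van't Hoff enthalpy and the second the DSC-derived estimate. For a transition whose cooperative unit is a single chain this ratio is generally expected to be close to 1, so values approaching 2 are what one would anticipate if the cooperative unit is the dimer, assuming the calorimetric enthalpy is expressed per mole of monomer.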

Mpro dimer-monomer equilibrium in presence of inhibitors. In-silico inhibitor selection. To identify new inhibitors of the SARS-CoV-2 main protease from a large in-house database, we applied the in silico protocol recently proposed by some of us 35. The flowchart of the adopted protocol is depicted in Supplementary Fig. S2. As a first step, we performed molecular docking studies on the compounds present in the database to analyze their binding capability in the catalytic active site of the SARS-CoV-2 Mpro (PDB code 6y2f) 7, as detailed in the Materials and Methods section. Supplementary Fig. S3 shows the 3D binding active site of SARS-CoV-2 Mpro co-crystallized with the native inhibitor 13b 7 covalently bonded to Cys145. The ligand binds to the enzymatic catalytic cleft of the protease located between domains I and II. The 3D binding site representation (Supplementary Fig. S3) highlights the interactions with the amino acid residues involved in the inhibition mechanism, such as Met49, Met165, Glu166, His164, Phe140, Gly143 and the catalytic Cys145. Noteworthy is the presence of hydrogen bonds between the pyridone moiety of the ligand and Glu166, which governs the catalytic activity, driving the SARS-CoV-2 Mpro to adopt an inactive conformation. The resulting best docked molecules have been selected based on a docking score cut-off of −6.5 kcal/mol and submitted to ligand-based approaches, by taking advantage of the web-service DRUDIT (DRUgs Discovery Tools), a recently developed open access virtual screening platform 36, which represents the evolution of previous well-established protocols based on molecular descriptors 37,38.
DRUDIT implements the ligand-based template of SARS-CoV-2 Mpro, available in the Biotarget Finder tool, which has been recently proposed as a useful means for the identification of new SARS-CoV-2 Mpro modulators. Subsequently, the ligands selected by molecular docking were submitted to DRUDIT, as reported elsewhere 35, allowing the evaluation of their affinity to SARS-CoV-2 Mpro by the values of the Drudit Affinity Score (DAS). The features of the ligand-based approaches based on molecular descriptors enabled us to assess topological, thermodynamic and charge-related characteristics of the ligands. Thus, two complementary standpoints in the evaluation of the binding capability (ligand- and structure-based) covered all the interaction aspects in the ligand-target complex. The top scored molecules (selected based on a DAS cut-off of 0.65) were processed by Induced Fit Docking (IFD) calculations to further screen the hits to be submitted to wet-lab tests. In Supplementary Fig. S4 and in Table 2 the seven best scored structures are reported. The analysis of the results in Table 2 shows that the selected compounds present similar overall scores (IFD_score). This confirms the robustness of the ligand-based approach exploited by DRUDIT, which is able to give an account of the receptor-ligand binding although it is based on molecular descriptors that, as is known, do not take into consideration the 3D shape of the binding site. Figure 4 reports the two best scored molecules, 3 and 7 (according to the IFD_score parameter), in the binding site (left panel) and their related amino-acid maps (right panel). The two molecules are deeply buried in the cleft of the substrate-binding pocket, but unlike the co-crystallized ligand 13b, they interact with a somewhat different pattern of amino acids. This evidence suggests that these compounds are not covalently bound to the SARS-CoV-2 Mpro catalytic site.
Mpro activity assays. The selected inhibitors have been tested for their efficacy in reducing the Mpro activity. As reported in Fig. 5, the time dependence of substrate fluorescence after hydrolysis indicates that the catalytic activity of Mpro changes in the presence of the selected compounds. In particular, compounds 2, 4, 5 and 7 induced an irreversible inactivation of the enzyme, while compounds 1 and 6 proved rather inactive. For two of the most effective compounds (2 and 7) inhibition tests have been carried out as a function of the concentration. Unfortunately, we have not been able to perform this test for compound 4, which shows the best inhibition efficacy, as it produces a fluorescence signal that partially obscures that of the substrate. Results are shown in Fig. 6 (left panel). Percent inhibition data have been fitted with the Hill equation to get the half maximal effective concentration, IC50, and the Hill slope $n$. We obtained IC50 = 10.3 ± 0.2 µM for 2 and 15 ± 2 µM for 7, with $n$ = 5 ± 1 and 3 ± 1, respectively. These values of $n$ larger than one indicate that the binding is positively cooperative, in agreement with other recent experimental results 40.

Unfortunately, some of the foreseen conditions are missing (e.g. SAXS curves of inhibitor 5 at 30 µM), due to an experimental problem with the sample injection in the beamline capillary. SAXS data have been analysed with the same approach adopted for data without inhibitors, with the further assumption that, for each compound, the thermodynamic parameters are linear functions of its concentration $C_I$:

$$\Delta G_D^\circ = \Delta G_{D,0}^\circ + \alpha_G C_I, \qquad \Delta C_{pD} = \Delta C_{pD,0} + \alpha_{C_p} C_I, \qquad \Delta S_D^\circ = \Delta S_{D,0}^\circ + \alpha_S C_I$$

where the subscript 0 denotes the values in the absence of inhibitors (Table 1), and the three corresponding constant rates $\alpha_G$, $\alpha_{C_p}$ and $\alpha_S$ are fitting parameters common to all the SAXS curves corresponding to the same inhibitor. The high quality of the fitting procedure can be appreciated in Fig. 2 (bottom panels), where the calculated SAXS curves are superposed on the experimental ones; the resulting thermodynamic common fitting parameters are shown in Table 3, first panel.
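The dose-response analysis of the activity assays can be illustrated with a short sketch. It assumes the commonly used two-parameter Hill form for percent inhibition, $p(C_I) = p_{max}/[1 + (\mathrm{IC}_{50}/C_I)^n]$ with $p_{max}$ = 100; the exact expression adopted in the study is not recoverable from the text, and the function name is hypothetical.

```python
def hill_percent_inhibition(c_i: float, ic50: float, n: float, p_max: float = 100.0) -> float:
    """Percent inhibition at inhibitor concentration c_i (same unit as ic50).

    Standard Hill form, assumed here for illustration only.
    """
    if c_i < 0 or ic50 <= 0:
        raise ValueError("concentrations must be non-negative and ic50 positive")
    if c_i == 0:
        return 0.0
    return p_max / (1.0 + (ic50 / c_i) ** n)


if __name__ == "__main__":
    # IC50 and Hill slopes quoted in the text for compounds 2 and 7
    for label, ic50, n in (("2", 10.3, 5.0), ("7", 15.0, 3.0)):
        for c in (5.0, 10.0, 20.0, 40.0):  # µM, illustrative concentrations
            p = hill_percent_inhibition(c, ic50, n)
            print(f"compound {label}: C_I = {c:4.1f} µM -> {p:5.1f} % inhibition")
```

A Hill slope well above one makes the curve steep around IC50, which is the behaviour the text associates with positively cooperative binding.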

SAXS. SAXS curves of SARS-CoV
The inhibitors with the most negative values of $\alpha_G$ (Table 3, first panel) are those that most favour dimer dissociation. Results reported in Table 3 suggest that compounds 1, 6, and 7 are, within the experimental error, the most effective at increasing the dissociation equilibrium constant, which at $C_I$ = 30 µM becomes as large as ≈ 15 µM and at $C_I$ = 60 µM almost doubles its value, reaching ≈ 30 µM. We indeed recall that, in the absence of inhibitors, the value of $K_{D,0}^\circ$ is 7 ± 1 µM (Table 1). Inhibitor 5 is slightly less active: at $C_I$ = 60 µM we found a dissociation equilibrium constant of ≈ 20 µM. The other three compounds, namely 2, 3 and 4, show a value of $\alpha_G$ close to 0, indicating that they do not significantly affect the dimer-monomer equilibrium of Mpro. Despite the high uncertainties on $\alpha_{C_p}$ and $\alpha_S$, their negative values suggest that upon dissociation there are changes of heat capacity and of entropy smaller than those observed without inhibitors, indicating that inhibitors increase the monomer order. The temperature dependence of the equilibrium dissociation constant $K_D$ is reported, for each inhibitor, in Supplementary Fig. S1. The large uncertainties on the fitting parameters determine the presence of wide bands of uncertainty on the $K_D$ trends. This is particularly evident for inhibitor 5, since SAXS data have been recorded only for one inhibitor concentration. Hence, a word of caution is necessary regarding the temperature dependence of $K_D$ in the presence of the seven inhibitors obtained by the SAXS analysis. From a close inspection of the single curve parameters, reported in Supplementary Tables S1 and S2, we observe that the value of the correlation length $\xi$ is in the range 3000-4500 Å and rather independent of the temperature and the presence of the inhibitors. The fractal dimension is ≈ 2, suggesting a two-dimensional fractal growth of protein clusters in the presence of inhibitors.
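The effect of a linear dependence of the dissociation free energy on inhibitor concentration can be sketched as follows. The snippet assumes $\Delta G_D^\circ = \Delta G_{D,0}^\circ + \alpha_G C_I$, so that $K_D^\circ$ changes exponentially with $C_I$; the function names are hypothetical, and the rate used in the example is back-calculated from the approximate $K_D$ values quoted in the text, not taken from Table 3.

```python
import math

R = 8.314462618  # J mol^-1 K^-1
T_REF = 298.15   # K


def k_d_with_inhibitor(k_d0: float, alpha_g: float, c_i: float, t: float = T_REF) -> float:
    """K_D at inhibitor concentration c_i, for Delta G = Delta G_0 + alpha_g * c_i.

    alpha_g is in J mol^-1 per unit of c_i; the result has the unit of k_d0.
    """
    return k_d0 * math.exp(-alpha_g * c_i / (R * t))


def alpha_g_from_shift(k_d0: float, k_d: float, c_i: float, t: float = T_REF) -> float:
    """Rate alpha_g implied by a shift of K_D from k_d0 to k_d at concentration c_i."""
    return -R * t * math.log(k_d / k_d0) / c_i


if __name__ == "__main__":
    # Illustrative numbers from the text: K_D rises from 7 to about 15 µM at 30 µM inhibitor
    alpha = alpha_g_from_shift(7.0, 15.0, 30.0)
    print(f"implied alpha_G = {alpha:.1f} J/mol per µM")
    print(f"predicted K_D at 60 µM = {k_d_with_inhibitor(7.0, alpha, 60.0):.1f} µM")
```

Under this assumption, doubling the inhibitor concentration squares the ratio $K_D/K_{D,0}$, which is compatible with the text's observation that $K_D$ roughly doubles between 30 and 60 µM for the most active compounds.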

Discussion
The active site of the Mpro monomer, which is highly conserved in different coronaviruses, is typically composed of four subsites, referred to as S1′, S1, S2, and S4 [41][42][43]. They accommodate the corresponding domains P1′, P1, P2, and P4 of the substrate or those of the inhibitor compound mimicking the substrate 44. The S1′ subsite is constituted by the two residues Thr24 and Thr25. The S1 subsite (also referred to as the S1 pocket 43) is formed by the side chains of residues Phe140, Asn142, Glu166, His163 and His172 and by the main chains of Phe140 and Leu141 44. As discussed by Sacco et al. 43, S1 is considered a promising target for an inhibiting compound, as it can interact with both hydrophobic and hydrophilic groups. On the other hand, S2 is a hydrophobic subsite formed by the side chains of His41, Met49, Tyr54, Met165 and Asp187, while S4 is a small hydrophobic pocket that involves the side chains of Met165, Leu167, Phe185, Gln192 and Gln189 44. An unusual catalytic dyad, His41-Cys145, acts in the active site, where His41 is a proton acceptor whereas Cys145 performs the nucleophilic attack on the carbonyl carbon of the substrate. Hence, a signature of the inhibiting power of a compound is its capability to form a covalent bond with Cys145 41, as very recently confirmed by Dai et al. 42, who have found two promising inhibitors, 11a and 11b. The importance of the protonation state of Cys145 as well as of the network of hydrogen bonds between the catalytic site of Mpro and inhibiting compounds has also been recently discussed by Kneller et al. 45 by combining X-ray and neutron scattering data. On these grounds, the experimental results obtained in the present study, together with the structure of the seven inhibitors within the Mpro active site determined by the refined molecular docking, can be discussed. The interaction map of inhibitor 1 is shown in Supplementary Fig. S5.
There are a total of 11 contacts with amino acids of the Mpro monomer (Ser1, Thr25, Thr26, Ser46, Asn119, Leu141, Asn142, Cys145, Pro168, Arg188, Gln192), 2 of which (Ser46, Asn119) are hydrogen bonds. The residues of the catalytic dyad and the four subsites in contact with 1 are: Cys145 (dyad, 1 of 2 (50%)), Thr25 (S1′, 1 of 2 (50%)), Leu141 and Asn142 (S1, 2 of 6 (33%)) and Gln192 (S4, 1 of 5 (20%)). To note, these contacts involve only one of the residues of the catalytic dyad, Cys145, without a hydrogen bond, whereas for inhibitor 13b there is a hydrogen bond with Cys145 (Supplementary Fig. S3). It is also worth noticing that Ser1 is among the residues in contact with inhibitor 1: since the mutual interaction of Glu166 of one monomer and the N-finger residues of the other monomer, like Ser1, has been proven to shape the catalytic cleft 7, we argue that this compound could destabilize the dimer, consistently with the high value of $K_D^\circ$ = 26 ± 4 µM at $C_I$ = 60 µM. However, its enzymatic inhibition is very poor, as shown by the high similarity of the RFU slope with the one in the absence of inhibitors (Fig. 5). A possible explanation of this result could be the absence of any hydrogen bond with Cys145 as well as the absence of any contact with the residues of subsite S2.
Regarding inhibitor 2, the map of contacts shown in Supplementary Fig. S6 reveals a total of 10 interactions with the monomer chain (Met49, Asn142, Gly143, Cys145, Asp187, His164, Met165, Glu166, Arg188, Gln189); 4 of them are hydrogen bonds (Asn142, His164, Glu166, Gln189) that do not involve the catalytic site. In more detail, the residues of the catalytic dyad and the four subsites in contact with 2 are: Cys145 (dyad, 1 of 2 (50%)), Asn142 and Glu166 (S1, 2 of 6 (33%)), Met49, Met165 and Asp187 (S2, 3 of 5 (60%)) and Met165 and Gln189 (S4, 2 of 5 (40%)). We also note that 2 residues of S1, Asn142 and Glu166, interact with this inhibitor via a hydrogen bond. This evidence, together with the high number of contacts with S2 and S4, could explain the experimentally observed inhibition effect ($m$ = 5 ± 2 RFU/min, Fig. 5). To note, this compound does not modify the dimer-monomer equilibrium, the fitting parameter $\alpha_G$ being almost 0 (Table 3) within the experimental error.
Turning now to inhibitor 3, we have found that it does not alter the native dissociation equilibrium of Mpro ($\alpha_G \approx 0$, see Table 3) and also its inhibition effect, observed by fluorescence analysis, is weak ($m$ = 23 ± 3 RFU/min, slightly lower than the value in the absence of inhibitors). The contact map of compound 3 (Fig. 4) shows a total of 12 interactions with the monomer chain (Thr24, Thr25, Thr26, His41, Ser46, Met49, Phe140, Leu141, Gly143, Met165, Leu167, Gln192), 2 of them being hydrogen bonds (Thr24, Ser46) and one (His41) a π−π stacking. The residues of the catalytic dyad and the four subsites in contact with 3 are: His41 (dyad, 1 of 2 (50%)), Thr24 and Thr25 (S1′, 2 of 2 (100%)), Phe140 and Leu141 (S1, 2 of 6 (33%)), His41, Met49 and Met165 (S2, 3 of 5 (60%)) and Met165, Leu167 and Gln192 (S4, 3 of 5 (60%)). We notice that, with respect to compound 2, there are no hydrogen bonds with the five residues that stabilize the S1 pocket. This difference might be the reason for the weak inhibition effect.
Compound 4 is the most effective among the seven inhibitors ($m \approx 0$, Fig. 5) even if it shows at $C_I$ = 30 µM a dissociation equilibrium constant $K_D^\circ$ = 6 ± 2 µM slightly lower than the one without inhibitors (see Table 1). Indeed the parameter $\alpha_G$ is positive but very close to 0 within the experimental error (Table 3), suggesting that compound 4 has a very weak stabilizing effect on the dimer. The contact map (Supplementary Fig. S7) shows 11 interactions with the monomer chain (Thr25, Thr26, Leu27, His41, Ser46, Met165, Glu166, Pro168, Gln189, Thr190, Gln192), including one hydrogen bond (Gln189). The residues of the catalytic dyad and the four subsites in contact with 4 are: His41 (dyad, 1 of 2 (50%)), Thr25 (S1′, 1 of 2 (50%)), Glu166 (S1, 1 of 6 (17%)), His41 and Met165 (S2, 2 of 5 (40%)) and Met165, Gln189 and Gln192 (S4, 3 of 5 (60%)). Only one of the five residues that stabilize the S1 pocket is among the ones in contact with this inhibitor, Glu166, which does not form a hydrogen bond. On this ground, the high inhibition effect of compound 4 could only be justified by the contact with His41, one of the two residues of the catalytic dyad.

Results are different for compound 5: at $C_I$ = 60 µM (unfortunately no data are available at 30 µM) it provokes a rather important increase of $K_D^\circ$ (Table 3) and shows a moderate inhibition effect ($m$ = 8 ± 3 RFU/min, see Fig. 5). Looking at the interaction map (Supplementary Fig. S8), we notice 11 interactions with the Mpro monomer (Thr25, Thr26, Leu27, His41, Asn142, Gly143, Met165, Glu166, Pro168, Arg188, Gln192); one of them concerns Glu166, involved in two hydrogen bonds, and another His41, involved in two π−π stacking interactions. The residues of the catalytic dyad and the four subsites in contact with 5 are: His41 (dyad, 1 of 2 (50%)), Thr25 (S1′, 1 of 2 (50%)), Asn142 and Glu166 (S1, 2 of 6 (33%)), His41 and Met165 (S2, 2 of 5 (40%)) and Met165 and Gln192 (S4, 2 of 5 (40%)). There are no hydrogen bonds involving the five residues that stabilize the S1 site. One could speculate that this inhibitor, probably due to its steric hindrance, provokes a modification of S1 that could interfere with the enzymatic activity of Mpro.

Compound 6 establishes 11 contacts with the amino acids of the monomer (His41, Leu141, Ser144, Cys145, Met165, Glu166, Leu167, Arg188, Gln189, Ala191, Gln192, Supplementary Fig. S9), including two hydrogen bonds (His41, Gln192). In particular, the residues of the catalytic dyad and the four subsites in contact with 6 are: His41 and Cys145 (dyad, 2 of 2 (100%)), Leu141 and Glu166 (S1, 2 of 6 (33%)), His41 and Met165 (S2, 2 of 5 (40%)) and Met165, Leu167, Gln189 and Gln192 (S4, 4 of 5 (80%)). An almost absent inhibition effect is seen by fluorescence, the slope of RFU (Fig. 5) being very similar to the one determined in the absence of inhibitors. On the other hand, compound 6 is able to modify the dimer-monomer dissociation, with one of the highest values of $K_D^\circ$ = 30 ± 10 µM at $C_I$ = 60 µM (Table 3). To note, only one of the 6 amino acids that stabilize the S1 site is included in the list of residues interacting with compound 6. Hence, the absence of its inhibition activity could be explained by the small size of its molecular structure, which might not be able to provoke important modifications of the S1 pocket and hence to modify the catalytic features of Mpro.
We finally turn to compound 7. It behaves differently from compound 6: it is capable of changing the dimer-monomer equilibrium to the same extent ($K_D^\circ$ = 30 ± 10 µM at $C_I$ = 60 µM, Table 3) but also displays a promising inhibition effect, with $m \approx 1$. For this compound, the map of contacts shows 12 interactions (Thr24, Thr25, Thr26, His41, Thr45, Met49, Leu141, His164, Met165, Glu166, Asp187, Arg188, Fig. 4) with a large number of hydrogen bonds (Thr24, Thr26, His41, His164, Glu166, Arg188). The residues of the catalytic dyad and the four subsites in contact with 7 are: His41 (dyad, 1 of 2 (50%)), Thr24 and Thr25 (S1′, 2 of 2 (100%)), Leu141 and Glu166 (S1, 2 of 6 (33%)), His41, Met49, Met165 and Asp187 (S2, 4 of 5 (80%)) and Met165 (S4, 1 of 5 (20%)). Only one of the interacting residues (Glu166) is involved in the stabilization of the S1 pocket. We can infer that the high inhibition effect could be due to the high number of contacts with S2 and to the presence of 6 hydrogen bonds. Another hypothesis, which needs further insights to be confirmed, is that the fluorinated groups, which are present in a high number in compound 7, may originate a new reactive warhead able to form a covalent bond with Cys145. We may also consider that one of the hydrogen bonds involves a residue of the catalytic dyad, His41, suggesting a possible important modification of the enzymatic activity.
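The per-subsite bookkeeping used throughout this discussion can be reproduced with a few set operations. The sketch below is an illustration only: the subsite membership follows the residue lists given at the beginning of the Discussion, with S1 counted as six residues (the five side-chain residues plus Leu141, contacted through its main chain), which is an assumption about how the "of 6" totals were obtained; the names `SUBSITES` and `subsite_coverage` are hypothetical.

```python
# Subsite definitions as listed in the text (S1 taken as six residues: assumption)
SUBSITES = {
    "dyad": {"His41", "Cys145"},
    "S1'": {"Thr24", "Thr25"},
    "S1": {"Phe140", "Leu141", "Asn142", "His163", "Glu166", "His172"},
    "S2": {"His41", "Met49", "Tyr54", "Met165", "Asp187"},
    "S4": {"Met165", "Leu167", "Phe185", "Gln189", "Gln192"},
}

# Contact list reported in the text for inhibitor 1
INHIBITOR_1_CONTACTS = {
    "Ser1", "Thr25", "Thr26", "Ser46", "Asn119", "Leu141",
    "Asn142", "Cys145", "Pro168", "Arg188", "Gln192",
}


def subsite_coverage(contacts):
    """Map each subsite to (residues of that subsite in contact, subsite size)."""
    return {
        name: (members & set(contacts), len(members))
        for name, members in SUBSITES.items()
    }


if __name__ == "__main__":
    for name, (hits, size) in subsite_coverage(INHIBITOR_1_CONTACTS).items():
        pct = 100.0 * len(hits) / size
        listed = ", ".join(sorted(hits)) or "none"
        print(f"{name:5s}: {len(hits)} of {size} ({pct:.0f}%) -> {listed}")
```

Applying the same function to the contact lists of the other compounds gives a uniform way to compare how each inhibitor covers the dyad and the four subsites.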

Conclusions
Considering the results that we have discussed above, a picture emerges where the selected compounds designed to bind the catalytic site of SARS-CoV-2 Mpro may affect dimerization and enzymatic activity to different extents. Since the functional form of Mpro is a dimer, compounds that disrupt dimerization should in principle also be effective at diminishing its catalytic activity. However, the compound-induced shift in the dimer and monomer thermal equilibrium populations may not directly translate into a loss of enzymatic activity. Indeed, the latter also strongly depends on the local interactions occurring at the catalytic site, which in turn govern the competition between inhibitor and substrate. To better visualize the scenario presented by our findings, we report in Fig. 6 (right panel) the slope $m$ of the fluorescence inhibition curve as a function of the dimer-monomer equilibrium constant $K_D$ calculated from the set of fitting parameters derived by the global fit of SAXS data of the seven compounds, by fixing the inhibitor concentration $C_I$ at 60 µM and the temperature at 30 °C, the same value employed in the enzymatic activity assays. The points in this map can be organized in two groups, represented in blue and in red. The latter group refers to compounds 3, 5 and 7, which show the expected behaviour: the stronger their capability to induce the dissociation of the Mpro dimer, the more important their inhibiting effect. For these compounds the molecular mechanisms underlying inhibition at the active site are likely linked with their ability to provoke dimer dissociation. On the other hand, for compounds 1, 2, 4, and 6, displayed in blue, we find the opposite behaviour: the increase of dimer dissociation does not determine an increase of inhibition, thus suggesting that the molecular mechanisms of inhibition at the active site play a major role, with a marginal involvement of the monomer-monomer interface.
This apparently contradictory result can be in part explained by considering that, in all cases, the dissociation equilibrium is weak so that, in the presence of a compound that alters the dimer-monomer equilibrium but that does not hamper the interaction with the substrate, there are always dimeric M pro molecules that can exert their enzymatic activity when a substrate is available.
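A worked example makes this point concrete. Assuming the mass-action relation $K_D = 2Cx_1^2/(1-x_1)$ between the dissociation constant, the total monomer concentration $C$ and the monomer fraction $x_1$, and taking for illustration $C$ = 30 µM (the protein concentration of the SAXS experiments with inhibitors) together with one of the largest dissociation constants discussed in the text, $K_D$ = 30 µM:

$$2x_1^2 + x_1 - 1 = 0 \;\Rightarrow\; x_1 = \frac{-1 + \sqrt{1 + 8}}{4} = 0.5$$

so about half of the chains are still assembled in dimers. For comparison, with $K_D$ = 7 µM at the same concentration, $x_1 = (-7 + \sqrt{49 + 1680})/120 \approx 0.29$. Even the strongest shift in $K_D$ considered here therefore leaves a substantial dimer population at this concentration; the protein concentration of the activity assays may differ, so these numbers are only indicative.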
A further, complementary explanation could be the existence of ligand binding sites alternative to the orthosteric active site and located at the interface of dimeric Mpro. Indeed, the presence of two such binding sites, not directly involved in enzymatic inhibition but probably interfering with dimerization, has been very recently revealed by a molecular dynamics simulation study of the SARS-CoV-2 Mpro 46 .
We can provide a qualitative interpretation of the behavior exhibited by the two groups of compounds by looking at the contact maps of the residues grouped in the different sub-sites of the active site. We note that both compounds 6 and 7, which show the highest values of $K_D$, are in contact with two residues of the S1 sub-site, Leu141 and Glu166. In addition, compound 1, which has a slightly lower $K_D$, interacts with S1 through contacts with Leu141 and Asn142. This suggests that binding to at least two of the residues Glu166, Leu141 and Asn142 is crucial to modify the dimer-monomer equilibrium. On the other hand, the interaction of compounds 7, 2 and 5 with Glu166 via a hydrogen bond is likely linked to their high inhibiting action. By contrast, there is no hydrogen bond between compound 6 and Glu166. Hence, on the basis of the results obtained for the seven selected compounds, the interplay between SAXS results, enzymatic activity assays and contact map analysis suggests a relevant clue: in order to promote both Mpro dimer dissociation and the inhibition of its catalytic activity, a small molecule should interact with at least two residues of the S1 sub-site and most likely form a hydrogen bond with Glu166. The key role in inhibition of the Glu166 residue, which is conserved among all human coronaviruses, has also been pointed out very recently 47 .
Of note, according to Goyal and Goyal 6 , Glu166 is among the residues that should be targeted to inhibit the dimerization of SARS-CoV Mpro. However, for a more detailed investigation of the role of dimerization in stabilizing the catalytic activity of Mpro, it is also important to take into account the overall contribution of protein flexibility, as recently shown by Suárez et al. 48 through a 2 µs molecular dynamics simulation of Mpro with and without a model peptide mimicking the enzyme substrate.
In summary, the experimental work presented here provides basic information for deciphering the complex interplay between enzymatic activity inhibition and dimer dissociation. To the best of our knowledge, we have shown for the first time how structural information about the SARS-CoV-2 Mpro in solution, in the absence and in the presence of potential inhibitors and as a function of temperature, can be obtained from an advanced analysis of SAXS data within an overall thermodynamic picture, complemented by more conventional approaches. Our results suggest that more experimental evidence about the impairment of monomeric and dimeric Mpro in the presence of inhibitors, corroborated by computational information, will be necessary for a deeper understanding of the Mpro allosteric mechanism.

Materials and methods
Mpro expression and western blot analysis. The pGEX-6P-1 vector harboring the full-length cDNA sequence encoding the SARS-CoV-2 main protease (Mpro, NC_045512) was purchased from GenScript (clone ID_M16788F). The expression vector was transformed into BL21(DE3)pLysS Escherichia coli cells, and the obtained clones were assayed at both small scale (5 mL) and medium scale (500 mL and 1 L) for the production of SARS-CoV-2 Mpro. Transformants were grown in LB medium containing 100 µg/mL ampicillin and 34 µg/mL chloramphenicol as selective antibiotics. Cultures were grown to an OD600 of 0.6-0.8 at 37 °C and 200 rpm, and Mpro expression was then induced by addition of 0.5 mM isopropyl-1-thio-β-D-galactopyranoside (IPTG). Growth under induction was carried out both for 3 h at 37 °C and for 10 h at 16 °C in order to identify the best expression condition. Cells were harvested by centrifugation at 6000 g. Cell pellets were resuspended in lysis buffer (20 mM Tris-HCl pH 8.0, 300 mM NaCl, 2 mM β-mercaptoethanol), and the cells were disrupted by sonication (Sonics Vibra Cell sonicator) at 4 °C. Cell debris was separated from the total protein extract by centrifugation at 6500 g for 1 h. Supernatant aliquots were resuspended in Laemmli sample buffer, run on 12% SDS-PAGE and transferred onto a polyvinylidene difluoride membrane for western blot analysis. Mpro was probed with a 6×-His tag monoclonal primary antibody (Invitrogen) and an anti-mouse secondary antibody, and detected by chemiluminescence (Clarity Western ECL Substrate, Bio-Rad; Supplementary Fig. S11, panels A and B).

acid 1 mM) through Amicon Ultra-4 centrifugal filters 30K (Merck Millipore). For Mpro C-terminal His-tag removal, the PreScission (1 U per 100 µg of protein) cleavage reaction was performed at 4 °C for 4 h, and the PreScission protease was then removed using a GSTrap FF column (GE Healthcare). The Mpro solution was further purified by FPLC size-exclusion chromatography on a Superdex 75 10/300 GL column (Supplementary Fig.
S10, panels A and B) 33 .

The SAXS data analysis approach has been described in the main text, with the exception of some minor points. Since in all conditions the nominal molar protein concentration is lower than 1 mM, its temperature variation can be considered to be determined only by the temperature dependence of the relative mass density of water, which, according to literature results 50 , is written as

$$d_w = \exp\left[-\alpha_w (T - T^\circ) - \frac{\beta_w}{2} (T - T^\circ)^2\right],$$

where, in our investigated range of 15-45 °C, the optimum value of the thermal expansivity at $T^\circ$ is $\alpha_w = 2.5 \cdot 10^{-4}$ K$^{-1}$ and that of its first derivative is $\beta_w = 9.8 \cdot 10^{-6}$ K$^{-2}$. Accordingly, $C_N = C^\circ d_w$, $C^\circ$ being the nominal protein concentration at $T^\circ$.
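The temperature correction of the concentration can be sketched as follows. The sketch assumes that the relative water density follows $d_w = \exp[-\alpha_w (T - T^\circ) - \beta_w (T - T^\circ)^2/2]$, which is what a thermal expansivity varying linearly with temperature implies, and it assumes a reference temperature of 25 °C, which is not stated in this section. The function names are illustrative.

```python
import math

ALPHA_W = 2.5e-4  # K^-1, thermal expansivity of water at T0 (value from the text)
BETA_W = 9.8e-6   # K^-2, first derivative of the expansivity (value from the text)
T0_C = 25.0       # deg C, ASSUMED reference temperature (not given in this section)


def relative_water_density(t_c: float, t0_c: float = T0_C) -> float:
    """Relative mass density of water d_w(T), equal to 1 at T0.

    Assumed form: d_w = exp(-alpha_w * dT - beta_w * dT**2 / 2), dT = T - T0.
    """
    dt = t_c - t0_c
    return math.exp(-ALPHA_W * dt - 0.5 * BETA_W * dt * dt)


def corrected_concentration(c_nominal: float, t_c: float, t0_c: float = T0_C) -> float:
    """Concentration at temperature T from the nominal one at T0: C_N = C0 * d_w."""
    return c_nominal * relative_water_density(t_c, t0_c)


# Correction for a nominal 30 uM sample over the investigated temperature range.
for t in (15.0, 25.0, 35.0, 45.0):
    print(f"T = {t:4.1f} C -> C_N = {corrected_concentration(30.0, t):.3f} uM")
```

With these parameters the correction stays below 1% over 15-45 °C, so it matters mainly for the consistency of a global fit across temperatures.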
The measured structure factor $S_M(q)$ has been obtained in relation to the protein-protein structure factor $S(q)$ by

$$S_M(q) = 1 + \beta(q)\,[S(q) - 1], \qquad \beta(q) = \frac{[P^{(1)}(q)]^2}{P(q)},$$

where $\beta(q)$ is the coupling function and $P^{(1)}(q)$ is the average of the protein excess scattering amplitude, a function provided, together with $P(q)$, by the SASMOL method. According to Ref. 24 , $S(q)$ has been written as

$$S(q) = 1 + \frac{D\,\Gamma(D-1)}{(q r_0)^D}\,\frac{\sin[(D-1)\arctan(q\xi)]}{[1 + (q\xi)^{-2}]^{(D-1)/2}},$$

where $\Gamma(x)$ is the gamma function, $D$ is the fractal dimension (between 1 and 3) of the aggregates, $r_0$ is the effective radius of the aggregating protein and $\xi$ is the correlation length.
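As a hedged illustration of how these quantities could be evaluated, the sketch below assumes the widely used fractal-aggregate structure factor $S(q) = 1 + D\,\Gamma(D-1)\,\sin[(D-1)\arctan(q\xi)] / \{(q r_0)^D [1 + (q\xi)^{-2}]^{(D-1)/2}\}$ and the standard decoupling relation $S_M(q) = 1 + \beta(q)[S(q) - 1]$ with $\beta(q) = [P^{(1)}(q)]^2 / P(q)$. Both expressions are assumptions chosen to match the symbols defined in the text; the parameter values are purely illustrative.

```python
import math


def fractal_structure_factor(q: float, D: float, r0: float, xi: float) -> float:
    """Fractal-aggregate structure factor S(q) (assumed form, see lead-in).

    q  : momentum transfer (1/Angstrom), must be > 0
    D  : fractal dimension, restricted here to 1 < D <= 3 (Gamma(D-1) diverges at D = 1)
    r0 : effective radius of the aggregating protein (Angstrom)
    xi : correlation length (Angstrom)
    """
    if q <= 0 or r0 <= 0 or xi <= 0:
        raise ValueError("q, r0 and xi must be positive")
    if not 1.0 < D <= 3.0:
        raise ValueError("D must satisfy 1 < D <= 3")
    q_xi = q * xi
    numerator = D * math.gamma(D - 1.0) * math.sin((D - 1.0) * math.atan(q_xi))
    denominator = (q * r0) ** D * (1.0 + 1.0 / q_xi**2) ** ((D - 1.0) / 2.0)
    return 1.0 + numerator / denominator


def measured_structure_factor(s_q: float, p1_q: float, p_q: float) -> float:
    """Measured structure factor S_M(q) = 1 + beta(q) * (S(q) - 1).

    beta(q) = P1(q)**2 / P(q), with P1 the averaged amplitude and P the form factor.
    """
    beta = p1_q**2 / p_q
    return 1.0 + beta * (s_q - 1.0)


# Illustrative parameters only (not fitted values from the study).
for q in (0.005, 0.02, 0.1, 0.3):
    s = fractal_structure_factor(q, D=2.0, r0=20.0, xi=100.0)
    print(f"q = {q:5.3f} 1/A -> S(q) = {s:.3f}")
```

In this form $S(q)$ tends to 1 at large $q$, so the aggregate contribution is confined to the low-$q$ region of the curve.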
In-silico design. Ligand preparation. The default settings of the LigPrep tool implemented in Schrödinger's software (version 2017-1) were used to prepare the ligands for docking 51 . All possible tautomers and combinations of stereoisomers were generated for pH 7.0 ± 0.4 using the Epik ionization method 52 . Energy minimization was subsequently performed using the integrated OPLS 2005 force field 53 .

Docking validation. Molecular docking was performed with the Glide program 37,56,57 . The receptor grid was prepared by assigning the original ligand (13b) as the centroid of the grid box. The generated 3D conformers were docked into the receptor model using the Extra Precision (XP) mode as the scoring function. A total of 5 poses per ligand conformer were included in the post-docking minimization step, and a maximum of 2 docking poses were generated for each ligand conformer. The proposed docking procedure was validated by re-docking the crystallized 13b within the receptor-binding pockets of 6y2f by Glide covalent docking. The results obtained were in good agreement with the experimental poses, showing an RMSD of 0.75 Å.
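The re-docking check rests on a root-mean-square deviation between matched atoms of the docked and crystallographic ligand poses. A minimal sketch of that calculation is given below; it assumes that both poses are expressed in the same receptor frame and list the same heavy atoms in the same order, so no superposition is performed. The coordinates are hypothetical and are not taken from 6y2f.

```python
import math


def pose_rmsd(reference, pose):
    """RMSD (same units as the input) between two poses with matched atom order.

    reference, pose: sequences of (x, y, z) tuples of equal, non-zero length.
    No alignment is applied: both poses are assumed to share the receptor frame.
    """
    if len(reference) != len(pose) or len(reference) == 0:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_ref, atom_pose in zip(reference, pose)
        for a, b in zip(atom_ref, atom_pose)
    )
    return math.sqrt(squared / len(reference))


# Hypothetical three-atom ligand; the "docked" pose is the crystal pose
# rigidly shifted by 0.75 Angstrom along x.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.4, 0.0)]
docked = [(x + 0.75, y, z) for x, y, z in crystal]
print(f"RMSD = {pose_rmsd(crystal, docked):.2f} A")
```

Values below about 2 Å are commonly taken to indicate that a docking protocol reproduces the experimental binding mode.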
Biotarget finder module (DRUDIT). The refined selection of suitable SARS-CoV-2 Mpro inhibitors was performed through the Biotarget Finder module available in the www.drudit.com web server 36 . The tool predicts the binding affinity of candidate molecules for the selected biological target. The template of the biological target was built as previously reported. The in-house database was then submitted to the Biological Predictor module, setting the DRUDIT parameters N, Z and G and using the crystallized structure of 13b, as previously reported 35 .
Induced fit docking. Induced fit docking simulations were performed using the IFD application 38,58 available in the Schrödinger software suite 39 , which has been demonstrated to be an accurate and robust method to account for both ligand and receptor flexibility 59 . The IFD protocol was performed as follows 60,61 : the ligands were docked into the rigid receptor models with scaled-down van der Waals (vdW) radii. The Glide Extra Precision (XP) mode was used for the docking, and 20 ligand poses were retained for protein structural refinement. The docking boxes were defined to include all amino acid residues within 25 Å × 25 Å × 25 Å from the centre of the original ligands. The induced-fit protein-ligand complexes were generated using the Prime software 39,62,63 . The 20 structures from the previous step were submitted to side-chain and backbone refinement. All residues with at least one atom located within 5.0 Å of each corresponding ligand pose were included in the refinement by Prime. All the poses generated were then hierarchically classified, refined and further minimized into the active site grid before being finally scored using the proprietary GlideScore function, defined as follows: XPG_score = 0.065 vdW + 0.130 Coul + Lipo + Hbond + Metal + BuryP + RotB + Site, where vdW is the van der Waals energy term, Coul is the Coulomb energy, Lipo is a lipophilic contact term that rewards favourable hydrophobic interactions, Hbond is an H-bonding term, Metal is a metal-binding term (where applicable), BuryP is a penalty term applied to buried polar groups, RotB is a penalty for freezing rotatable bonds and Site is a term used to describe favourable polar interactions in the active site. Finally, the IFD_score (IFD_score = XPG_score + 0.05 Prime_Energy), which accounts for both the protein-ligand interaction energy and the total energy of the system, was calculated and used to rank the IFD poses. More negative IFD_score values indicate more favourable binding.
Results are shown in Table 2.
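The two scoring expressions quoted in the induced fit docking protocol can be written out as a short sketch. The weights are those given in the text; the function names, the pose labels and all numerical term values are hypothetical and serve only to illustrate the ranking step. The sketch does not call Glide or Prime, which compute these terms internally.

```python
def xpg_score(vdw, coul, lipo, hbond, metal, bury_p, rot_b, site):
    """GlideScore as quoted in the text:
    0.065 vdW + 0.130 Coul + Lipo + Hbond + Metal + BuryP + RotB + Site."""
    return 0.065 * vdw + 0.130 * coul + lipo + hbond + metal + bury_p + rot_b + site


def ifd_score(xpg, prime_energy):
    """IFD_score = XPG_score + 0.05 * Prime_Energy."""
    return xpg + 0.05 * prime_energy


def rank_poses(poses):
    """Rank (label, XPG_score, Prime_Energy) tuples by IFD_score.

    The most negative score comes first, i.e. the most favourable pose.
    """
    scored = [(label, ifd_score(xpg, prime)) for label, xpg, prime in poses]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical energy terms for a single pose (illustrative numbers only).
example_xpg = xpg_score(vdw=-40.0, coul=-10.0, lipo=-2.0, hbond=-1.5,
                        metal=0.0, bury_p=0.5, rot_b=0.3, site=-0.2)

# Hypothetical poses: (label, XPG_score, Prime_Energy).
poses = [
    ("pose_a", example_xpg, -10000.0),
    ("pose_b", -7.0, -10020.0),
]
for label, score in rank_poses(poses):
    print(f"{label}: IFD_score = {score:.1f}")
```

Because the Prime energy enters with a weight of 0.05, differences in receptor strain between poses can change the ranking even when the docking scores are close.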
Chemical synthesis of inhibitors. Inhibitors 1 64 , 3 64 , 5 64 and 6 65 were prepared as previously reported. Inhibitor 2 is commercially available. Inhibitors 4 64 and 7 64 were synthesized as described in detail in the next paragraphs. All solvents and reagents were used as received, unless otherwise stated. Melting points were determined on a hot-stage apparatus. 1H-NMR and 13C-NMR spectra were recorded at the indicated frequencies, and the residual solvent peak was used as reference. Chromatography was performed using silica gel (0.040-0.063 mm) and mixtures of ethyl acetate and petroleum ether (fraction boiling in the range of 40-60 °C) in various ratios (v/v). Compounds 8 64 and 9 66 , used in the synthesis of inhibitors 4 and 7, were prepared as previously reported.
Synthesis of inhibitor 4. Inhibitor 4 was synthesized through a nucleophilic aromatic substitution (SNAr) of 5-pentafluorophenyl-1,2,4-oxadiazole 8 with 1-aza-18-crown-6 at the para position (Supplementary Fig. S12). Oxadiazole 8 (312 mg, 1 mmol) was dissolved in acetonitrile (5 mL). 1-Aza-18-crown-6 (289 mg, 1.1 mmol) and potassium carbonate (152 mg, 1.1 mmol) were added, and the suspension was stirred at room temperature for 24 h. The reaction was monitored by TLC. The reaction mixture was dried under vacuum and treated with H2O (50 mL) before extraction three times with EtOAc (50 mL each). The combined organic layers were dried with Na2SO4 and then concentrated in vacuo to give the crude product, which was recrystallized from EtOH.

Synthesis of inhibitor 7. Tripodal oxadiazolylamide (inhibitor 7) was easily obtained by means of nucleophilic displacement with ethylamine from tripodal ester 9, which was previously reported as heavy metal fluores-