Modeling the allosteric modulation on a G-Protein Coupled Receptor: the case of M2 muscarinic Acetylcholine Receptor in complex with LY211960

Allosteric modulation is involved in a plethora of diverse protein functions, which are fundamental for cells’ life. This phenomenon can be thought as communication between two topographically distinct site of a protein structure. How this communication occurs is still matter of debate. Many different descriptions have been presented so far. Here we consider a specific case where any significant conformational change is involved upon allosteric modulator binding and the phenomenon is depicted as a vibrational energy diffusion process between distant protein regions. We applied this model, by employing computational tools, to the human muscarinic receptor M2, a transmembrane protein G-protein coupled receptor known to undergo allosteric modulation whose recently X-ray structure has been recently resolved both with and without the presence of a particular allosteric modulator. Our calculations, performed on these two receptor structures, suggest that for this case the allosteric modulator modifies the energy current between functionally relevant regions of the protein; this allows to identify the main residues responsible for this modulation. These results contribute to shed light on the molecular basis of allosteric modulation and may help design new allosteric ligands.

www.nature.com/scientificreports www.nature.com/scientificreports/ mainly affected. Usually then, no appreciable conformational changes are observed 9 . In this work we will consider only this case. Finally, cases in between the first two types are also possible 9 . Allostery can be exploited for drug development. Indeed, allosteric ligands can cause an increase the orthosteric ligands' affinity for their protein target 10 . These ligands ("Positive Allosteric Modulators", PAMs) turn out to decrease the dissociation constant (K d ) of the orthosteric ligand upon target binding. In contrast, Negative Allosteric Modulators (NAMs) increases K d while Neutral Allosteric Ligand (NALs) do not affect K d 11 . Understanding the molecular basis of allosteric modulation and identifying the key elements which make this phenomenon possible is therefore of great importance for both basic science and drug discovery.
We use as a test case the M2 muscarinic acetylcholine receptor (M2), for which large structural information are available. This receptor belongs to the class A of the G-protein Coupled receptor (GPCRs) superfamily. GPCRs are transmembrane proteins which trigger a signal cascade inside the cell upon binding with the G-protein. The allosteric binding occurs prevalently when the receptor assumes a particular conformation called active state 12 . In most cases, the active state is promoted, by the binding of small molecules called agonist 13 . Here we study the binary complex between M2 with its agonist iperoxo as well as the ternary complex between the receptor, iperoxo and the PAM LY211960, whose X-ray structures have been determined (Fig. 1a,b).
GPCRs possess very complex energy landscape presenting several stable conformations 14 . Allosteric modulators can modify this landscape affecting the stability and/or the dynamical properties of each conformation 8 . In this respect, our study focuses on two stable states of a GPCR/ligand adduct. The two states are expected to be two free energy minima. It reasonable to assume that they are active states 15 . The two X-ray structures are not too dissimilar, indeed, their RMSDs differ by only about 1 Å and this difference remains small during a molecular dynamic simulation (see Fig. SI2). Therefore, the allosteric modulation, in this particular case, does not seem to involve a significant shift in the population of conformational states. Rather, it causes a local reshape of the energy landscape of the same conformational state 16 . Investigating cases in which the allosteric modulation leads to a population shift in the conformational states is beyond the scope of this paper.
Therefore, in this case, it is reasonable to think the allosteric modulation occurs by means of molecular vibrations of the protein involving the transfer of the associated energy, the vibrational energy (E vib ). In the following, the allosteric modulation will be indeed modeled as exchange of E vib among protein residue. This study will be focused only on this specific Type II PAM since sufficient experimental data and studies are available 15,17,18 .

Results and Discussion
As stated above, in this work we describe the allosteric modulation as communication which consist in the E vib exchange among protein residues. This exchange is modeled as a diffusion process to which a master equation can be associated with.
is a N-entries vector, where N is the number of residues. Each of its entries represents the percentage of total E vib relative to each residue at time t (e.g. is the vibrational energy of the i-th residue. Obviously, we do not the exact value of E t ( ) vib res i and E vib but just their ratio p i (t)). L is the transition frequency matrix, whose elements L ij are the frequencies associated with transition times τ ij , which are the characteristic time to exchange E vib between residue i and j, as calculated in reference 19 . Each element of L is defined as: The Master Eq. (1) defines a Transition Network (Markov State Model or simply Markov chain) in which nodes are residues connected via edges whose weights are the element (L ij ) of the matrix. L can be decomposed in its eigenvectors. Each of them is associated to an eigenvalue, i.e. a frequency. The eigenvalue zero represents the thermal equilibrium, in which all the residues have the same amount of E vib , and the corresponding eigenvectors (p ∞ ) is the so-called stationary distribution since dp ∞ /dt = p ∞ ·L = 0.
All the other eigenvectors represent metastable states, through which the system pass by to reach the thermal equilibrium and they "exist" for a finite period of time. In other words, they describe states in which some residues possess a larger E vib than others. This difference causes the energy exchange that we decided to employ for modeling the allosteric communication. Therefore, identifying group of residues which can retain more E vib during the process of reaching the equilibrium can be highly interesting because they might represent biologically relevant regions and studying E vib exchange among them will shed light on how allosteric modulation occurs in protein. Therefore, in the following, we will identify first this regions and study how the E vib exchange occurs among them.
Our analysis focuses here on all the residues of the M2 (Fig. 1), except those belonging to the third intracellular loop. Due to its high mobility, indeed, the approximations needed for calculating τ ij no longer holds 19 and all the possible values becomes unreliable. The loop is included in our MD simulations (see Methods). For the β2 adrenergic (another class A GPCR such as the M2 receptor), this loop is a distinct domain not affecting the seven helices bundle structural features 20 . One can reasonably expect the same for the M2 receptor, because of the structural similarity of the two receptors.
Here, we consider both the binary (receptor/iperoxo) complex with the agonist iperoxo and its the ternary complex, where also the allosteric ligand LY211960 is present.
We first used the PCCA+ 21 approach to identify clusters of the residues related to metastable states. This requires to set the number of clusters (n), using a 'goodness' criterion: A specific parameter, θ, which depends on the number of clusters n, should be as close to zero (optimal value) as possible. It turns out that setting n = 2 leads to θ = 0, while is much smaller 0 for n > 2 (See methods). Therefore, setting n = 2, protein residues are distributed among two clusters and each of them can be assigned to one cluster calculating the "grades of membership" for each residue. This ranges from 0, when the residue does not belong to the specific cluster, to 1, when the residue belongs only to the cluster (Fig. 2). Here, we establish a cutoff beyond which the residue is assigned to a specific cluster. If the cutoff is set to 0.7, the two resulting clusters turn out to include residues crucial for the binding of the cognate G-proteins 12 as well as the allosteric and orthosteric ligands 15,22 (see Fig. 3). These clusters have been identified both in the binary and the tertiary complex. Choosing a higher cutoff such as 0.8, 0.9 turns out not to significantly affect our results (data not shown).
Those residues playing a role in allosteric or orthosteric binding might be instrumental for drug development, therefore, their relevance to E vib exchange is here investigated. According to PCCA+, the two recognized clusters represents regions in which the E vib exchange processes among residues within the cluster is faster than the E vib exchange between the clusters themselves. This finding agrees with previously reported computational results showing that residues belonging to the regions identified by the clusters feature shorter characteristic time for E vib exchange 19 .
The energy current F between the two clusters, calculated using the so-called Transition Path Theory (TPT) (see Methods), increases on passing from the binary (F ag hereafter) to the ternary complex (F all ). The relative percentage change (ΔF) between the two complexes, which reads: www.nature.com/scientificreports www.nature.com/scientificreports/ for the case considered, although It is plausible (albeit not proved) that other GPCRs/allosteric ligands sharing a high degree of structural similarity with the one considered here might exhibit similar behavior binding an allosteric modulator belonging to the same class as LY211960. Conversely, what makes this study specific for the M2 receptor is the analysis of the residues involved in allosteric and orthosteric binding. This is carried out by calculating ΔF X for a specific residue X, as opposed to the overall ΔF value in Eq. (3). The question we will address is: Does a large value of ΔF X indicate an important role of the residue for ligands binding? Fig. 4 shows that the largest ΔF X values (>70%) are those for X = Y80, W422, Y177 and Y403. Interestingly, all of these residues are important for binding (see Fig. 5) (i) W422 and Y177 play a crucial role for the binding itself 22 . They form two pi-stacking bonds on two opposite sides of the ligand aromatic rings. Binding does not occur if they  www.nature.com/scientificreports www.nature.com/scientificreports/  www.nature.com/scientificreports www.nature.com/scientificreports/ are mutated with a non-aromatic residue 22 . (ii) Y403, which interacts with W422 through a T-shaped stacking, is part of tyrosine lid that has been shown to be essential for the orthosteric binding 23,24 . (iii)Y80's hydroxyl group forms a stable hydrogen bond with the allosteric ligand 15 , pointing to his role for binding. The latter is further confirmed by the change in magnitude of positive cooperativity (Δα) upon mutation of the residue to alanine in the structurally similar M4 receptor 22 . a α is the ratio between the dissociation constant of the orthosteric ligand with and without the allosteric modulator: it can be considered as the relative difference in the E vib current and a measure of the relative loss/gain of stability upon binding of the allosteric modulator. Δα value is the largest so far among all those measured (Fig. SI1). The relevance these residues have, both for the E vib exchange and for the allosteric/orthosteric binding, makes them useful for being exploited for drug development. Noteworthy the interactions they make with the allosteric ligands modifies their dynamics and, thus, their property of exchanging E vib with other residues. Even a slightly change in those interactions might lead, consequently, to either a decrease or a negligible value of ΔF producing a different allosteric behavior.

conclusion
We have presented a molecular simulation study on the M2 receptor in complex with its agonist and the PAM LY211960. Our calculations show that the latter affects the dynamics and consequently the E vib exchange in the M2 receptor structure. Upon LY211960 binding, the energy between distant protein regions is speeded up. Thus, any kind of perturbation affecting the structure could be accommodated easier than in the case the ligand is absent. Residues involved in either allosteric or orthosteric binding, showed to affect the energy current change on passing from the binary to the ternary complex. The last finding may provide valuable information for further allosteric drug development. Indeed, these residues can be exploited to model new ligands targeting specifically them and effectively modifying the allosteric communication. These results might be generalized to other proteins accommodating PAMs.

Methods
Systems and calculations. The calculations are based on the mAchR iperoxo binary (PDBID 4MQS) and mAchR iperoxo LY211960 (PDBID: 4MQT) ternary complex X-ray structures. A long intracellular loop between residues 212 to 383, not present in the crystal structures, was added using the Robetta 25 and MODELLER 26 codes.
The proteins were inserted in a pre-equilibrated membrane, built using CHARMM-GUI membrane builder web server 27 . 80.000 water molecules were added. Na + and Cl − ions were also added so as to make the system neutral and to reach a physiological concentration (about 0.15 M). The overall systems consisted of about 120.000 atoms.
The AMBER99SB 28 force field and TIP3P model 29 , Slipids 30 were used for the protein, Na + and Cl − ions, water and lipids. The General Amber force field (GAFF) parameters 31 were used for the ligands in both complexes, along with the RESP atomic charges 32 . These were obtained by fitting with the electrostatic potential (ESP) from Gaussian 09 33 calculation with the HF-6-31G* basis set. The ligands topologies were converted to the GROMACS format using the ACPYPE tool 34 .
One simulation for each system was performed with the GROMACS program suite 35 . 500 ns were performed in the NPT ensemble using 1 fs time-step, sampling each 100 ps, after a 50 ns of thermalization run, in a 11 × 11 × 20 A simulation box. Nose-Hoover thermostat 36 was used with τ = 0.4 ps along with Parrinello-Rahman barostat 37 with τ = 1.0 ps. Electrostatic interactions were treated using PME method 38 . A cut-off of 1.2 nm was used for all the long-range interactions.
Robust perron cluster ananlysis. PCCA+ 21 aims to identify metastable states associated clusters to find a diagonal-block structure in the N by N transition frequency matrix L. To do so, an arbitrary number q of excepted clusters is defined. The "goodness" of this choice can be tested a posteriori using an estimator, θ.
C k ⊂{1, … N} is the set of indices (of the matrix L) of the k-th cluster, k = 1, … q. Furthermore, the set of representative indices π(k) of the i-th cluster is defined so that a row t π(k) of L is orthogonal to the characteristic vector of the cluster l (χ l ): where δ kl is the Kronecker delta. χ C l is zero if the index of the entries does not belong to C l . For a generic row t m the Eq. (4) becomes: where p m,C i is the transition probability from the state m to the cluster C i . Hence, combining Eqs. (4) and (5) we get: This means the error one commits writing t m as a linear combination t π(s) is orthogonal to every χ C k . In principle we don't know χ C k but we know that, if there is a hidden block-diagonal structure, the L's eigenvector e (k) are almost constant on C k . This means that we can approximate the Eq. (6) as: a M4 belongs to the same subclass of GPCRs as M2, it binds the LY211960 allosteric ligand with the same affinity as M2 15 and it shares 75% sequence identity with M2, as calculated with BLAST web server 24  , which are defined as the grade of membership of the state m to the cluster s and they should range from 0 to 1. w m,s is the main outcome of PCCA+. In order to calculate them we can recast Eq. (7) as: . e (k) are obtained by diagonalizing L and π(s) are chosen selecting those which maximize the distance ||e r −e p || where e r(p) are vectors in  q so that e r(p) = ( … e e , , 21 . Finally, we have to discuss the chosen number of clusters. Ad stated above w p There is no guarantee the latter is fulfilled, therefore, the positiveness of w m s ( ) could be used to evaluate if the number of chosen clusters q is suitable for the system under consideration or not. Thus, we define θ as: Trajectories going from A to B are called reactive trajectories. The probability to find reactive trajectories, given i ∈ (A∪B) c is: where α ∞ p is the stationary probability distribution. Finally, we can define the probability current of reactive trajectories "flowing" from i to j, with i, j ∈ (A∪B) c and i ≠ j as: In conclusion, the probability current exiting from a cluster A and entering in B is : This current is directly connected to the E vib current by a multiplicative constant. However, in this work we are interested in relative changes in the flux and not in their absolute values, thus, our results hold independently this constant.