The Principles of Ligand Specificity on beta-2-adrenergic receptor

G protein-coupled receptors are recognized as one of the largest families of membrane proteins. Despite sharing a characteristic seven-transmembrane topology, G protein-coupled receptors regulate a wide range of cellular signaling pathways in response to various physical and chemical stimuli, and prevail as an important target for drug discovery. Notably, the recent progress in crystallographic methods led to a breakthrough in elucidating the structures of membrane proteins. The structures of β2-adrenergic receptor bound with a variety of ligands provide atomic details of the binding modes of agonists, antagonists and inverse agonists. In this study, we selected four representative molecules from each functional class of ligands and investigated their impacts on β2-adrenergic receptor through a total of 12 × 100 ns molecular dynamics simulations. From the obtained trajectories, we generated molecular fingerprints exemplifying propensities of protein-ligand interactions. For each functional class of compounds, we characterized and compared the fluctuation of the protein backbone, the volumes in the intracellular pockets, the water densities in the receptors, the domain interaction networks as well as the movements of transmembrane helices. We discovered that each class of ligands exhibits a distinct mode of interactions with mainly TM5 and TM6, altering the shape and eventually the state of the receptor. Our findings provide insightful prospective into GPCR targeted structure-based drug discoveries.

G protein-coupled receptors (GPCRs) are involved in a wide spectrum of physiological functions and they are the most attractive target for modern drug discovery. With the advances in both static crystal structures and molecular dynamics (MD) simulations, some common features in the activation mechanism of GPCRs have been identified, including the side-chain switches, the movement of transmembrane (TM) helices, and the formation of internal water channel [1][2][3][4] . However, an important question why certain molecules act as agonists whereas others, even with nearly identical structure, act as antagonists or invert agonists, is not well understood.
To answer this question and facilitate GPCR targeted drug discovery, we focus on the molecular fingerprint of the β 2 -adrenergic receptor (β 2 AR). β -adrenoceptors belong to rhodopsin-like GPCRs and are subdivided into three distinct groups: β 1 , β 2 , and β 3 which are identified in cardiac, airway smooth muscle, and adipose tissue, respectively 5 . β 2 AR plays an important role in many physiological processes including inhibiting labor, delaying need of micturition, facilitating respiration and providing glucose fuel 6,7 . Similar to other receptors, there are three different ligand types of β 2 AR: agonists, antagonists and inverse agonists. The agonists activate the signaling pathways and increase the receptor's basal activities. The antagonists block the signaling transductions without affecting the receptor's basal activities. However, the inverse agonists block the receptor's pocket and reduce the receptor's basal activities 4,8,9 . All these three types of compounds have been extensively characterized even before their structures were determined [10][11][12] .
With the available crystal structures 9,13-15 , we are now able to envisage both active and inactive states of the receptor. Here, we further investigated the detailed binding modes of twelve different molecules (four agonists, four antagonists, and four inverse agonists) in the corresponding states of β 2 AR through 12 × 100 ns molecular dynamics (MD) simulations (a total of 1200 ns trajectories). The propensities of residue-ligand interactions were presented as interaction fingerprints. Moreover, the fluctuation of the protein backbone, the volumes in the intracellular pockets, the water densities in the receptors, the domain interaction networks as well as the movements of transmembrane helices are characterized for each class of compounds.
Protein ligand interaction fingerprint. It is of great interest to investigate how the differences in the ligand scaffold and the receptor state affect the protein-ligand interactions. To illustrate how each ligand interacts with β 2 AR, we generated the interaction fingerprints based on the final 100 frames of our MD simulations. The root-mean-square deviations (RMSDs) of the receptor helices to the starting frame confirm the convergence of the simulations ( Figure Suppl. 2 (Fig. 2). Noticeably, an ionic interaction is always observed between the ethanolamine of each ligand and D113 3.32 . The Hbonds between ethanolamine and N312 7.39 are also frequently observed (frequency > 90%) in all ligands, except for Agon-4 ( Fig. 3, left panel). These findings suggest that D113 3.32 and N312 7.39 are the key residues for the binding of all ligands. Moreover, agonists frequently form Hbonds with multiple residues in TM5, particularly with S203 5.42 and S207 5.46 (Fig. 3a). Interestingly, the catechol ring I of Agon-2 interacts less frequently with Y199 5.38 and S203 5.42 (frequency < 30%). However, its N-substituted p-isobutylphenol in ring II forms a prominent Hbond with the amide backbone of F193 ECL2 (Fig. 3a, left panel). In contrast, antagonists and inverse agonists tend to have less polar interactions with TM5. Only Anta-3 can form rare Hbonds (frequency < 20%) with both S203 5.42 and S204 5.43 (Fig. 3b, left panel). For the inverse agonists, only iAgo-1 and iAgo-4 form isolated Hbonds with S203 5.42 and S204 5.43 respectively (Fig. 3c, left panel).
Apart from polar interactions, hydrophobic interactions are essential for stabilizing or activating the receptor. Rings I of all ligands are enveloped between the hydrophobic side chains of V117 3.36 , F193 ECL2 and F289 6.51 , except that Agon-4 interacts with F193 ECL2 and N293 6.55 instead (Fig. 3, right panel). Moreover, we noticed that the antagonists and the inverse agonists frequently form hydrophobic interactions with Y199 5.38 , S203 5.42 , S207 5.46 , W286 6.48 and F290 6.52 , whereas the agonists rarely form such interactions with these residues (Fig. 3b,c, right panel). The analysis of interaction fingerprint has so far indicated that polar interactions between the ligand and the residues on TM5 are necessary for maintaining the active state of the receptor, whereas a ligand can stabilize the receptor by exerting hydrophobic interactions on this helix and also TM6. Hence, the distinct interaction patterns with different regions of the receptor can help predicting the nature of such a ligand. Historically, in contrast to isoprenaline (Fig. 1d) which is a potent agonist on β -adrenergic receptor, dichloroisoprenaline (DCI) (Fig. 1e unexpectedly demonstrated antagonistic effects on the receptor 23,24 . Medicinal chemists were then intrigued by the possibility of developing an antagonist via appropriate chemical modifications of these molecules. Eventually, pronethalol (Fig. 1f), the naphthyl analog of isoprenaline, was the first beta blocker as a clinical candidate. The subsequent introduction of an oxymethylene bridge between the naphthalene and ethanolamine led to the discovery of propranolol (Anta-4) (Fig. 1b), the first clinically useful beta blocker 23,24 . Our results are consistent with the discovery process of beta blockers that the replacement of catechol ring with nathphalene reduces the Hbond interactions on TM5 and promotes the hydrophobic interactions. Meanwhile, the additional oxymethylene facilitates the hydrophobic interactions between the N-substituted group of the ligand and the residues on TM6. Therefore, one would expect the fingerprint of isoprenaline demonstrates frequent hydrogen bonds with TM5 and rare hydrophobic interactions with TM6. On the other hand, the fingerprints of DCI and pronathalol should reveal dominant hydrophobic interactions with both TM5 and TM6. We notice that the potent agonist Agon-4 does not share the common polar (N312 7.39 ) and hydrophobic interactions (V117 3.36 and F289 6.51 ) as observed in other systems (Fig. 3a, right panel). Moreover, our MD trajectory shows that Agon-4 adopts a fairly  . Unlike other ligands, the ring I of Agon-4 does not penetrate deeply into the pocket and therefore no hydrophobic interactions with V117 3.36 and F289 6.51 are formed. Instead, the 2-methylhydroxyl group (ring I) forms a Hbond with S203 5.42 and consequently ring I is firmly stabilized next to the extracellular loop 2 (ECL2), in the vicinity of F193 ECL2 and N293 6.51 . Interestingly, the phenol (ring I) frequently forms an intramolecular Hbond with the 2-methylhydroxyl group (ring I). On the other hand, the N-substituted isobutyl group of Agon-4 discourages N312 7.39 from approaching the ethanolamine backbone and thus disrupts the ligand-N312 7.39 Hbond.
Flexibility of each TM region. After identifying the 16 relevant residues for ligand binding, the effect of a ligand can be reflected from the overall root-mean-square fluctuations (RMSFs) of the backbone atoms.
In the presence of agonists, the average RMSF (0.69 Å) is larger than that of antagonist-bound or inverse agonist-bound systems (both 0.56 Å) and such differences are found to be statistically significant (p < 0.01) ( . The RMSF differences of all corresponding helices except for TM1 are also statistically significant (p < 0.01) (Fig. 4). It appears that agonist molecules not only increase the flexibility near the binding site, but also the major part of the receptor.
The intracellular pockets and water influx. Another important feature of receptor activation is the expansion of the intracellular pocket for the G protein coupling process. In this study, the last 100 frames in each trajectory were selected for calculating the average size of the intracellular pocket. For agonist-bound systems, the receptors are in their activated states and the average pocket volumes vary between ~450 and ~950 Å 3 . In contrast, the receptor remains in an inactive state in the presence of an antagonist or an inverse agonist and therefore the average pocket volumes only fluctuate between ~100 and ~200 Å 3 (Fig. 5). Moreover, the rotamer of the highly conserved Y326 7.53 correlates with various states of the receptor and regulates the formation of internal water channel 1 . Hence, we calculated the number of water molecules within 4 Å around Y326 7.53 alongside the pocket volume. In Agon-1 bound β 2 AR, water influx from the cytoplasmic region into the receptor is observed and the average number of water molecules around Y326 7.53 is 23 (Fig. 6a). In Anta-1 bound β 2 AR and iAgo-1 bound β 2 AR, there are isolated water patches that are disconnected with the bulk water in the cytoplasm. The average number of water molecules for both cases is around 14, less than two-thirds of that in the activated state (Fig. 6b,c). The details of other systems are provided in the Supplementary Information (Figure Suppl. 4).

Domain interaction network. The activation of receptors can lead to a rearrangement of local interactions
in the intracellular region 25 . In order to unravel couplings among residues within the receptor, we constructed the domain interaction networks to facilitate the analysis of state-specific couplings. Each node presents a cluster of residues in close interaction, while the thickness of the line connecting the nodes is weighted by the correlation values between the two clusters. We noticed that agonist-bound systems form less domains than the antagonistor inverse agonist-bound systems. For example, Agon-1 has only 8 nodes whereas Anta-1 and iAgo-1 have 11 and 9 nodes respectively (Fig. 7, bottom panel). This finding suggests that there are more discrete local interactions in the inactive state of the receptor. Noticeably the position of TM6 in the intracellular region varies with the state of the receptor, leading to a different interaction network. Among all agonist-bound systems, TM6 of the activated receptor moves outwards and eventually shares the same interaction domain with TM5 (Fig. 7a, top panel). In the antagonist-and inverse agonist-bound systems, TM6 of the inactive receptor closes up the intracellular pocket and therefore the residues in TM5 cluster with those in TM3 instead (Fig. 7b,c, top panel). These findings are in agreement with our recent work on P2Y1 receptor 26 .

Principal component analysis on the vibrational modes. The helix movements of agonist-,
antagonist-and inverse agonist-bound receptors are very different in the intracellular region, but do not demonstrate consistent patterns in the extracellular region. Figure 8 shows the first and the lowest frequency mode of the alpha carbons. The proportion of variance of atom fluctuations versus eigenvalue rank is given in Figure Suppl. 5. In the extracellular region of Agon-1 bound β 2 AR, the extracellular loop 2 moves towards TM2, whereas TM5, TM6, together with extracellular loop 3 move away from TM2 and open up the ligand binding pocket (Fig. 8a). However, in Agon-2 bound β 2 AR, TM5, TM6 and the extracellular loops 2 and 3 close up the binding pocket (Fig. 8b). Similarly, in Anta-1 bound β 2 AR, TM6, TM7 and the extracellular loop 2 move towards TM2 (Fig. 8c), whereas, in Anta-2 bound β 2 AR, these groups move away from TM2 (Fig. 8d). For the iAgo-1 bound β 2 AR, the extracellular loop 2 move aside and the binding site is closed up by TM6, TM7 and the extracellular loop 3  7.55 . An asterisk indicates a significant difference between the two values at p < 0.01, using a two-tailed t-test with equal variances.

Figure 4. The average root-mean-square-fluctuations (RMSFs) of the protein backbone in each helix, in the presence of agonists (Agon), antagonists (Anta) or inverse agonists (iAgo
( Fig. 8e). In contrast, the binding pocket of iAgo-2 bound β 2 AR is closed by the extracellular loop 2 (Fig. 8f). The helix movements in the intracellular region are more consistent. In agonist bound systems, TM6 and TM7 move away from each other and thus a large void space is created (Fig. 8a,b). In antagonist and inverse agonist bound systems, the movement of the alpha carbons in TM6 are more diverse and consequently the helix keeps the intracellular pocket closed (Fig. 8c-f).

Conclusion
The crystal structures of β 2 AR provide new information of this classical GPCR drug target. Protein-ligand interaction fingerprints generated from MD trajectories help identify the important residues and the type of interactions required for designing ligands with desired property. In this study, using interaction fingerprints, we analyzed dynamic behavior of 16 important residues in the binding pockets, among which D113 3.32 and N312 7.39 are essential for ligand binding. The polar interactions with residues in TM5, particularly S203 5.42 and S207 5.46 , are related to the agonistic properties, whereas hydrophobic interactions with residues in TM5 and TM6 help stabilize the receptor. We demonstrate that the molecular fingerprints can be a powerful tool in capturing the specific profile of protein-ligand interactions, and can be employed together with MD simulations in predicting the nature of a ligand. Agonists predominantly form Hbonds with TM5, disrupt the interactions between helices in the extracellular region and then in the rest of TM region leading to increase the flexibility of the protein. As a result, TM5 as well as TM6 form frequent non-polar interactions in the intracellular region and move away from TM7, causing the expansion of intracellular pocket and a water influx (Fig. 9a). This also explains why the residues of TM5 and TM6 in this region share the same interaction domain. In contrast, antagonists form prominently non-polar interactions with both TM5 and TM6 (Fig. 9b), whereas inverse agonists mainly form non-polar interactions with TM6 only (Fig. 9c). These modes of interactions stabilize the extracellular region of the receptor, resulting in a smaller intracellular pocket and a lower amount of water. Residues of TM5 and TM6 in the intracellular region form discrete domains and their vibrational modes become more diverse. All these findings explain the fundamental mechanism of receptor activation and use as good guidance for GPCR targeted drug discoveries.

Methods
Loop filling and refinements. All fusion proteins, lipids, counterions and stabilizing agents were removed from the experimental structures. The missing intracellular loop ICL2 in each receptor was rebuilt and refined    Molecular dynamics simulations. The membrane system was built in Desmond 35 , with the receptor crystal structure pre-aligned in the OPM (Orientations of Proteins in Membranes) database 36,37 . Pre-equilibrated 76-92 POPC lipids coupled with 6900-9200 TIP3P water molecules in a box ~ 65 Å x 63 Å x 93 Å were used for building the protein/membrane system. We modeled the protein, lipids, water and ions by using CHARMM36 force field 38 . The ligand was assigned with CHARMM CGenFF force field 39 . Next, an additional 20 ns constrained equilibration was performed at constant pressure and temperature (NPT ensemble; 310 K, 1 bar), and the force constant was trapped off gradually from 10 kcal/mol to 0 kcal/mol. All bond lengths to hydrogen atoms were constrained with M-SHAKE. Van der Waals and short-range electrostatic interactions were cut off at 10 Å. Long-range electrostatic interactions were computed by the Particle Mesh Ewald (PME) summation scheme. All MD simulations were done in Desmond 35  Average water density calculation. Water density was calculated in Volmap plugin in VMD 41 . Volmap creates a map of the weighted atomic density at each grid point. This is done by replacing each atom in the selection with a normalized gaussian distribution of width (standard deviation) equal to its atomic radius. The gaussian distribution for each atom is then weighted using an optional weight read from one of the atoms' numerical properties, and defaults to a weight of one. The various gaussians are then additively distributed on a grid. The meaning of final map will depend of the weights of mass. The average water density was calculated based on the final 100 frames of each long time scale MD simulations. Final output results were visualized in VMD.
Interaction fingerprint calculations. The interaction fingerprint between protein and ligand was done with IChem 42 . We first extracted the final 100 snapshots of the MD simulation, and shortlisted the residues with atoms less than 5 Å away from any ligand atoms. We then used IChem to convert protein− ligand coordinates into a bit-string fingerprint (TIFP) registering the corresponding molecular interaction pattern. TIFP fingerprints have been calculated for ca. 1000 protein− ligand complexes, enabling a broad comparison of relationships between interaction pattern similarities and ligand or binding site pairwise similarities 42 . In this work we kept the default parameters of IChem and focused on two types of interactions: polar interaction and hydrophobic contact. The former comprises ionic bond and Hbond, whilst the latter incorporates hydrophobic contacts, the face-to-face and edge-to-face between aromatic rings. 16 residues formed interactions with one of the ligands for at least 30 times.
Residue communication network calculation. Correlated atomic fluctuations of a particular receptor state were characterized as reported elsewhere [43][44][45] using Bio3D. The network nodes represent residues, which are connected through edges weighted by their constituent atomic correlation values. Community analysis and node centrality with Bio3D and suboptimal path calculation with the WISP software 46 were performed on each network to characterize network properties and to identify residues involved in the dynamic coupling of distal sites. The parameters for the suboptimal path analysis included input source and sink nodes, as well as the total number of paths to be calculated. The latter parameter was set to 500 paths, which was found to yield converged results in all cases 45 .