Dual binding mode of “bitter sugars” to their human bitter taste receptor target

The 25 human bitter taste receptors (hTAS2Rs) are responsible for detecting bitter molecules present in food, and they also play several physiological and pathological roles in extraoral compartments. Therefore, understanding their ligand specificity is important both for food research and for pharmacological applications. Here we provide a molecular insight into the exquisite molecular recognition of bitter β-glycopyranosides by one of the members of this receptor subclass, hTAS2R16. Most of its agonists have in common the presence of a β-glycopyranose unit along with an extremely structurally diverse aglycon moiety. This poses the question of how hTAS2R16 can recognize such a large number of “bitter sugars”. By means of hybrid molecular mechanics/coarse grained molecular dynamics simulations, here we show that the three hTAS2R16 agonists salicin, arbutin and phenyl-β-D-glucopyranoside interact with the receptor through a previously unrecognized dual binding mode. Such mechanism may offer a seamless way to fit different aglycons inside the binding cavity, while maintaining the sugar bound, similar to the strategy used by several carbohydrate-binding lectins. Our prediction is validated a posteriori by comparison with mutagenesis data and also rationalizes a wealth of structure-activity relationship data. Therefore, our findings not only provide a deeper molecular characterization of the binding determinants for the three ligands studied here, but also give insights applicable to other hTAS2R16 agonists. Together with our results for other hTAS2Rs, this study paves the way to improve our overall understanding of the structural determinants of ligand specificity in bitter taste receptors.

. List of hTAS2Rs proteins and additional name (from reference 1 ). Receptor name  Aliases  hTAS2R1  TRB7  hTAS2R40  hTAS2R58, GPR60  hTAS2R3  /  hTAS2R41  hTAS2R59  hTAS2R4  /  hTAS2R42  hTAS2R55  hTAS2R5  /  hTAS2R43  hTAS2R52  hTAS2R7  TRB4  hTAS2R44  hTAS2R31, hTAS2R53  hTAS2R8  TRB5  hTAS2R45  GPR59  hTAS2R9  TRB6  hTAS2R46  hTAS2R54  hTAS2R10  TRB2  hTAS2R47  hTAS2R30  hTAS2R13  TRB3  hTAS2R48  hTAS2R19, hTAS2R23  hTAS2R14  TRB1  hTAS2R49  hTAS2R20, hTAS2R56  hTAS2R16  /  hTAS2R50  hTAS2R51  hTAS2R38  hTAS2R61  hTAS2R60  hTAS2R56  hTAS2R39 hTAS2R57 The nomenclature of the 25 human bitter taste receptors is far from being trivial, as discussed in references 2,3 . hTAS2Rs are not numbered consecutively because of the presence of pseudogenes and because of the confusion generated by the use of different names assigned to the same receptor when different research groups published their data almost at the same time 2,3 . For the hTAS2Rs discussed in our manuscript, we refer to the nomenclature in reference 4 . Nonetheless, hTAS2R16 is one of the few examples for which there are no aliases and the same name has been consistently used in the literature.

Receptor name Aliases
Previous studies 7--10 have suggested different models for hTAS2R16 bound to its agonist salicin (see Fig. 1 for the chemical structure of the ligand). Sakurai and coworkers 7--9 proposed two alternative binding poses (referred to as models A and B), in which the glucose unit is oriented towards the extracellular or the intracellular side, respectively. In comparison, our binding pose (see the main text and Fig.  5 E--F) is similar to model A (i.e. they share the same orientation of the ligand inside the binding pocket) and will be hereafter referred to as "glucose-out".
Sakurai and coworkers designed mutagenesis experiments of the residues identified as putatively involved in binding in their computational models (i.e. E86, N89, F93, W94, H181, F240 and I243) 7--9 . Based on the obtained mutagenesis data, they concluded that model B (in which the glucose unit is flipped 180 degrees compared to our "glucose--out" binding pose) was more likely than model A. Since the models in reference 7 are unfortunately not available and the corresponding computational protocol does not contain enough details to reproduce it (e.g. the sequence alignment employed is missing), we resorted to visual inspection of the images and to a careful analysis of the description of the models in the text of reference 7 . Within the limitations of this qualitative picture, we noticed that the mutagenesis data in reference 7 are not enough to distinguish univocally between models A and B. In particular: (i) Mutations of Q177 5.39 (to N, E or A) did not affect significantly the EC 50 , probably indicating that this residue does not interact directly with the ligand. Hence, Sakurai and coworkers discarded model A, where Q177 5.39 forms a H--bond with the "outwardly--oriented" glucose moiety. However, we found that these mutagenesis data are still compatible with our glucose--out binding pose, in which Q177 5.39 is most likely involved in shaping the binding cavity.
(ii) Mutations of E86 3.33 (to D and Q) showed decreased EC 50 values for all the ligands tested. Sakurai and coworkers used this data to support model B, in which the hydroxymethyl group of the "inwardly--oriented" phenyl moiety of salicin is forming a H--bond with E86 3.33 . Nonetheless, in our opinion the model B does not explain why the E86 3.33 mutations still affect the EC 50 for arbutin and phenyl--β--D--glucopyranoside, even though both ligands lack the phenyl substituent putatively interacting with E86 3.33 . Instead, this may be explained, at least in part, by the H--bonds between E86 3.33 and the glucose unit observed for all three ligands in our simulations of the glucose--out pose.
(iii) Other residues tested experimentally by Sakurai and coworkers (N89 3.36 and H181 5.43 ) do not clearly discriminate between the two ligand orientations, in our opinion, since they interact with the ligand in both models. The loss of receptor response upon N89 3.36 mutation is compatible with both model A (showing a H-bond with the hydroxymethyl substituent of the phenyl aglycon) and model B (displaying a H--bond with the hydroxymethyl group of the glucose unit). In our glucose--out simulations, N89 3.36 is interacting with the glucose (either its hydroxyl groups or the glycosidic oxygen), thus providing evidence in favor of model A. Similarly, the decrease in EC 50 upon H181 5.43 mutation is consistent with the formation of H--bonds with the glucose unit in both models, though involving different glucose hydroxyl groups. (iv) The final set of mutations included a group of hydrophobic residues located at the bottom of the binding cavity (F93 3.40 , F240 6.52 and I243 6.55 ) and all showed a decrease in EC 50 . According to Sakurai and coworkers, these experimental data support model B, in which these residues are forming hydrophobic interactions with the glucose moiety. However, in our opinion they are also consistent with model A, where these residues can form hydrophobic interactions with the phenyl aglycon. Therefore, a critical reevaluation of the comparison between the computational models and the experimental data of Sakurai and coworkers 7--9 seems to indicate that the orientation of salicin in the binding site is far from clear. Moreover, additional experiments 6 reported after the publication of the model in reference 7 offer further elements to discriminate between the two possible ligand orientations. Indeed, the new mutagenesis data suggest new receptor--ligand interactions that are absent in the model B proposed by Sakurai and coworkers 7,9 . Below we comment on two representative data sets: mutations of E262 7.39 in the extracellular part of the binding cavity and mutations of L59 2.53 and V265 7.42 in the intracellular part. (i) The E262D mutation decreased only slightly the maximum receptor activity with salicin, while E262A completely abolishes receptor activation 6 ; this indicates that E262 7.39 is probably interacting with salicin. However, model B would place E262 7.39 approximately in front of the outwardly--oriented aglycon, which does not have any available H--bond donor to interact with E262 7.39 , as the hydroxymethyl substituent is already engaged in a H--bond with E86 3.33 . Therefore, no interaction between E262 7.39 and salicin can be formed in model B, at odds with the new experimental data 6 .
(ii) Mutations of L59 2.53 and V265 7.42 decreased or abolished the receptor response, respectively, indicating that these two residues are likely to interact with salicin 6 . Since both residues are located in the bottom part of the binding cavity, in model B they would be placed near the "inwardly--oriented" glucose unit. Given the hydrophobic nature of these two residues, it is difficult to envision any interaction with glucose, in contrast with the experimental data. Altogether, the model B proposed by Sakurai and coworkers does not seem to be fully consistent with the new experimental data in reference 6 . A possible reason that might explain, at least in part, this discrepancy is the use of a structural model of rhodopsin as template to generate the homology model of hTAS2R16 7 . In order to further validate this hypothesis, we built a binding pose similar to model B. Here, the ligand is almost parallel to the receptor axis, but with different orientation compared to the "glucose--out" binding pose described in our main text. The glucose unit is buried inside the receptor while the aglycon is pointing toward the extracellular side (see the Fig below, in which panel A shows the "glucose--in" binding pose and panel B one of the two "glucose--out" poses described in the main text). During the MM/CG simulation of this alternative "glucose--in" binding pose, we observed that the ligand already escaped from the binding site early in the simulation (see Figure S7). The instability of this pose is consistent with the absence of polar residues in the intracellular part of the binding cavity able to establish hydrogen bonds with the glucose unit and the lack of hydrophobic residues in the extracellular part that can interact with the aglycon moiety. Therefore, our simulations suggest that the "glucose--in" binding pose should be discarded. More recently, Chen and coworkers performed molecular dynamics simulations of hTAS2R16 in complex with salicin (agonist) and probenecid (antagonist) to investigate the receptor activation mechanism 10 . However, the aim of their study was not focused on understanding the binding determinants of salicin and thus their comparison with the available experimental data was superficial. Upon visual inspection of the image in reference 10 , corresponding to the initial docking pose, we noticed that the model of Chen and coworkers does not seem to be fully consistent with all the available experimental data 6,7 . Although some of the binding residues proposed by Chen and coworkers, such as E86 3.33 and N89 3.36 , are compatible with the experimental data in reference 7 , salicin appears to be too far away from F93 3.40 , F240 6.52 and I243 6.55 to form any significant interaction with these residues. This is at odds with the mutagenesis data showing that mutations of these three residues decrease EC 50 7 . In addition, Chen and coworkers did not consider the experimental data in reference 6 when validating their proposed salicin binding pose. These new mutagenesis data suggest additional receptor-ligand interactions that are absent in the model proposed by Chen and coworkers 10 . Although they identified correctly E262 7.39 as one of the putative interacting residues, in their model salicin is not located deep enough inside the binding cavity to interact with L59 2.53 , F236 6.48 and V265 7.42 , at odds with the observation that mutations of these three residues decrease or abolish receptor response 6 . Therefore, the model of Chen and coworkers does not seem to be fully consistent with all the available experimental data 6,7 . Altogether, neither of the previously proposed models for the hTAS2R16/salicin seems to be fully compatible with all the available experimental data 6,7 . In an effort at clarifying the binding poses of the hTAS2R16 agonists complexes, in this work we have built structural models of hTAS2R16 in complex with three of its agonists (arbutin, salicin and phenyl--β--D--glucopyranoside) and submitted them to extensive molecular dynamics simulations. The obtained computational models have been validated by comparison with the experimental data available as of December 2018 6,7 (see Table S1). The three ligands (LIG) are arbutin (ARB), salicin, (SAL) and phenyl--β--D--glucopyranoside (PGP). The first (Res) and second (BW) columns indicate the experimentally characterized hTAS2R16 residue 7 and its Ballesteros--Weinstein numbering 11 , respectively. The ratio between mutant EC 50 and wild type EC 50 is reported in column three (EC 50 mut/EC 50 wt), followed by the fourth column (LIG EC 50 ) containing the interpretation of the EC 50 values for the corresponding residue using the following nomenclature: c = change in EC 50 ; nc = no significant change in EC 50 . "PGP initial" refers to the initial poses of phenyl--β--D--glucopyranoside, with the glucose unit either pointing inwards (glucose--in) or outwards (glucose--out). LIG TM3 and LIG TM7 correspond to the two binding modes (TM3--facing and TM7--facing, respectively) investigated in this work for the glucose--out orientation. For each binding mode, the "Dist" column indicates whether the corresponding residue is below or above (< or >, respectively) the distance threshold (5.5 Å) used to define the binding cavity, whereas the "Pred" column reports the test outcome for that residue (TP=true positive, TN=true negative, FP=false positive, FN=false negative), depending on the presence or absence of an actual chemical interaction. These predictions were used to calculate the statistical parameters precision (PREC) and recall (REC) (see main text and Fig.  S3). The representative simulation snapshots used for this calculation were obtained following the procedure explained in the main text (see Methods section), except for "PGP initial". An additional column is included for salicin (% of activity of wt), summarizing the experimental data from 6 . The glucose--out orientation refers to the binding pose described in the main text, while the glucose--in orientation (see above) corresponds to the ligand flipped vertically by 180 degrees with respect to the glucose-out one. In this binding pose, the glucose unit is pointing toward the intracellular side of the receptor and the aglycon sits in the extracellular part of the binding pocket. For the glucose--out orientation, two binding modes were explored (TM3--and TM7--facing, see the main text), which differ by a 180 degrees horizontal flip.  Supplementary  Table  S2, we first analyzed the experimental mutagenesis data. Experimental binding residues (EB) are those whose mutation causes an increase in EC 50 higher than 5--fold compared to the wild--type (EC 50 mut / EC 50 wt > 5), while residues whose mutation does not change significantly the EC 50 value ("nc" in the PGP EC 50 column of Supplementary  Table  S2) are considered as experimental non binding residues (ENB). The results of this experimentally--based classification are shown below: Then, we analyzed the data obtained from the MM/CG molecular dynamics simulation of the same complex (PGP TM3 column in Supplementary  Table  S2). A computational binding residue (CB) was defined according to three criteria: 1) distance within 5.5 Å from the closest ligand atom ("PGP TM3 Dist" column in Supplementary Table S2); 2) rationality of the chemical interaction; 3) persistency of the interaction. For criterium (2), our hydrogen bond definition takes into account implicitly that the residue/ligand atoms involved in the interaction are within a distance of 3.5 Å (together with a 30° deviation for the angle made by donor, hydrogen and acceptor atoms). Instead, for residues forming hydrophobic interactions we applied a 5.5 Å residue/ligand distance cutoff. For criterium (3), we used two different persistency cutoff values. We considered a residue as CB when its hydrogen bond persistency with at least one hydroxyl group of the ligand is higher than 10% (Supplementary  Table  S4) or when its hydrophobic interaction persistency is at least 80% (Supplementary Table S6). The results of this computationally--based classification are shown below: Next, a comparison between the experimental and computational binding and non binding residues was performed to classify the residues into 4 different outcomes. A residue that belongs to both CB and EB groups was considered as a true positive (TP), to CNB and ENB as a true negative (TN), to CB and ENB as a false positive (FP), and to CNB and EB as a false negative (FN) (see figure below and paragraph "Interaction analysis" in the main text). Finally, the total number of TP (five), FP (one) and FN (one) was counted to calculate precision and recall according to the equations presented in the main text. In this example, precision and recall are both 0.83 (Supplementary Table S2), indicating a good agreement of the computational model with the experimental data. For the analysis of the static binding poses (PGP initial glucose--out/--in in Supplementary Table S2), we used a similar protocol to calculate precision and recall, except for the definition of computational binding and non-binding residues. "Computational binding" and "computational non--binding" residues were defined based solely on the presence and absence, respectively, of protein/ligand interactions (criteria 1 and 2 in the text above) and did not include the persistency (criterium 3). This is because the interactions in the static binding poses are not time dependent, since no MD simulation had been run at this point.  The atoms involved in H--bonds are specified in columns 3 (residue atom) and 4 (ligand atom). The H--bond persistency (expressed as a percentage of the total simulation) is shown in columns 5, 6 and 7 (direct H--bonds) and columns 6, 8 and 10 (water--mediated H--bonds). Only H--bonds with persistency larger than 10% are reported (colored in green). The darkness of the green font is related with the persistency of the H--bond, i.e. the darker the green, the greater the persistency. The atoms involved in H--bonds are specified in columns 3 (residue atom) and 4 (ligand atom). The H--bond persistency (expressed as a percentage of the total simulation) is shown in columns 5, 6 and 7 (direct H--bonds) and columns 6, 8 and 10 (water--mediated H--bonds). Only H--bonds with persistency larger than 10% are reported (colored in green). The darkness of the green font is related with the persistency of the H--bond, i.e. the darker the green, the greater the persistency. A contact is defined when this distance is below 5.5 Å. Contacts with percentage 90% or larger are highlighted in red, and those with percentage between 80% and 90% in orange.  The hTAS2R16 residue number is specified in the first column, the corresponding position in the hTAS2R family is in the second column (using the Ballesteros--Weinstein numbering 11 ) and the conservation percentage for each of the residues found in that position is indicated in the last column. Figure S7. Interaction of E86 3.33 and E262 7.39 with the ligand and between H181 5.43 with E86 3.33 . E86 3.33 and E262 7.39 form hydrogen bonds with the hydroxyl groups of the glucose unit, while H181 5.43 forms a stable salt bridge with E86 3.33 and thus has a fundamental role in keeping E86 in the right position to interact with the agonists. Figure  S8. The phenylalanine cluster. It is formed by F93 3.40 , F240 6.52 , F236 6.48 and F268 7.45 . F93 3.40 , F236 6.48 and F268 7.45 are involved in hydrophobic interactions with the ligand, as well as forming π--π stacking interactions among them. Instead, F240 6.52 forms a T--stacking interaction with F236 6.48 , and thus may help to stabilize the other phenylalanines in the cluster. The helices involved in the phenylalanine cluster are colored in green (TM3), orange (TM6) and blue (TM7).

Text S4. Speculation about possible alternative binding modes.
The simulations presented in the main text, in combination with the available experimental data, support the existence of two different binding modes in hTAS2R16 (TM3--and TM7--facing), in which the agonist is rotated by 180 degrees. Considering the bundle shape of the binding cavity, one could wonder whether a rotation by 90 degrees could enable other alternative binding modes (i.e. multiple binding mode mechanism). If the ligands were rotated by 90 degrees with respect to the binding poses identified in the manuscript (TM3-and TM7--facing modes), the hydroxyl groups of the glucose ring would point towards TM2 on one side and TM5 and/or TM6 on the other side (hereafter "TM2--facing" and "TM5/6--facing" mode). This is at odds with the extensive mutagenesis and functional data in references 6,7 . Indeed, the authors of reference (Thomas et al. 2017) already proposed that the hTAS2R16 residues key for receptor specificity and activation are located on TM3 and TM7. Since the aglycon is surrounded by hydrophobic residues (see Fig. 4 in the main text) and hydrophobic interactions are non--directional, we expect that the most significantly affected interactions upon a 90 degrees rotation are those with the glucose ring. As shown in the figure below, the polar residues in TM2, TM5 and TM6 that could potentially interact with glucose would be S63 and N66 on TM2, H181 on TM5 and Y239 on TM6 (note that we do not consider Q177 on TM5 because site--directed mutagenesis data 7 indicates that this residue is not involved in ligand binding). Compared to the glucose binding residues in the TM3-- and TM7--facing modes (E86 and N89 in TM3 and E262 and Y266 in TM7, see Fig.  3 main text), there are two equivalent (N and Y) and two different residues (S and H, instead of two E). S63 and H181 can form only one H--bond each, whereas each of the E86 and E262 residues can establish two H--bonds. Therefore, it is reasonable to think that the H--bond networks in the TM3 and TM7--facing modes are stronger than in the TM2 and TM5/6 modes. In addition, the TM3 and TM7 helices are much closer than TM2 to TM5/6. The distance between the Cα atoms of the mirroring residues E86 on TM3 and E262 on TM7 is ~12.7 Å, whereas that between the counterparts N66 on TM2 and H181 on TM5 is ~20.0 Å. As a consequence, we expect that the ligand would be more tightly bound in the TM3 and TM7--facing modes.
Altogether, the glucose/protein stabilizing interactions in the TM2 and TM5/6--facing modes are expected to be weaker than in the TM3 and TM7--facing modes. In addition, the TM2-- and TM5/6--facing modes appear not to offer a molecular explanation for (i) the complete loss of receptor activity upon E262A mutation and (ii) the significant EC50 changes observed for isosteric (E86Q) or conservative (E86D) mutations. Therefore, the TM2-and TM5/6--facing modes are unlikely to contribute to ligand binding and we expect that the corresponding MD simulations would show that these two binding poses are significantly less stable.