Abstract
Discovering biased agonists requires a method that can reliably distinguish the bias in signalling due to unbalanced activation of diverse transduction proteins from that due to differential amplification inherent to the system being studied, which invariably results from the nonlinear nature of biological signalling networks and their measurement. We have systematically compared the performance of seven methods of bias diagnostics, all of which are based on the analysis of concentration-response curves of ligands according to classical receptor theory. We computed bias factors for a number of β-adrenergic agonists by comparing BRET assays of receptor-transducer interactions with Gs, Gi and arrestin. Using the same ligands, we also compared responses at signalling steps originating from the same receptor-transducer interaction, among which no biased efficacy is theoretically possible. In either case, we found a high level of false positive results and a general lack of correlation among methods. Altogether, this analysis shows that all tested methods, including some of the most widely used in the literature, fail to distinguish true ligand bias from “system bias” with confidence. We also propose two novel semi-quantitative methods of bias diagnostics that appear to be more robust and reliable than currently available strategies.
Introduction
The term biased agonism indicates the situation in which an agonist can have large differences of efficacy in promoting receptor interactions with different transduction proteins. It primarily concerns the family of G protein-coupled receptors (GPCRs), as these ligand-activated proteins must bind additional transduction proteins to trigger diverse and sometimes conflicting signalling pathways^{1}. Consequently, an agonist with efficacy biased towards a particular transducer interaction may cause a disproportionate stimulation of the corresponding signalling pathway, thus effectively “biasing” the pattern of receptor responsiveness towards a more restricted and specific biological function. The functional selectivity (or agonist-directed signal trafficking)^{2,3,4} that results from the activation of a receptor with a biased agonist may have important therapeutic implications, such as leading to the discovery of new drugs with reduced side effects and improved risk-benefit index^{5,6,7,8,9,10}. This perspective explains the surge of interest in biased signalling across the current biochemical and pharmacological literature.
However, measuring ligand bias is not trivial. In fact, disproportionate responses to agonists also occur when there is no “true” efficacy bias. For instance, there can be differences in signal when comparing upstream with downstream responses of a pathway that starts from the same receptor-transducer interaction. The reason is that the strength of ligand responses observed in most signalling pathways does not vary in direct proportion to the extent of receptor/transducer activation triggered by agonist binding. Moreover, the analytical methods used to determine biological signalling usually add further nonlinearity to the input-output relationship between ligand-induced activation and signal. Collectively, the term system bias is used to represent such additional mechanisms that generate apparent bias in signalling. System bias is either an artefact of analytical methods or the result of amplification in signalling network chains, which varies from one cell type to another. Thus, system bias makes it difficult to predict the actual signalling selectivity that may occur in vivo, or to optimize bias by modifying a ligand’s structure. In contrast, ligand bias depends on the different efficacies of an agonist for the formation of distinct receptor-transducer complexes. It is thus encoded in the chemical identity of the ligand.
Many methods have been proposed for discriminating ligand bias from system bias and quantifying the extent of biased efficacy in agonists^{11,12,13,14}. The majority of such methods rely on the definition of ligand efficacy given in classical receptor theory^{15,16}. Despite the inherent oversimplification of this theoretical framework, those strategies appear to work, at least when tested over an extensive set of computer-simulated data^{17}. However, exhaustive experimental verification of their validity is still lacking. Two previous studies comparing the relative abilities of several computing methods in detecting biased efficacy^{14,18} revealed significant divergences in the ability to identify biased ligands. Such results cast legitimate doubt on whether the abundant phenomenology of biased signalling described in the literature really represents cases of agonists that have biased efficacy.
A major problem concerning the accuracy of ligand bias computations is “circular proving”. Most currently known biased agonists were identified using the presently available methods; thus, there is no panel of independently proven biased ligands that can be used for checking the accuracy of current methods. In this study we reversed the question by checking the ability of each method to correctly identify balanced, i.e. unbiased, agonism. We expect that an accurate method of bias diagnostics should find no bias when there is none. To assess this, we compared different ligand-induced responses that stem from the same receptor-transducer interaction, in which no biased efficacy is possible. We analysed the most commonly used strategies of bias analysis, including unpublished variants inspired by the same theory (7 methods altogether). As shown here, these methods produce a high rate of statistically significant false positive results, and thus lack the experimental rigour and robustness that would be necessary for a compound screening program, even at the earliest stages. We also present two new semi-quantitative strategies of bias diagnostics that are both more reliable and more robust than previous methods.
Results
Gs, Gi or arrestin bias in β_{2}AR
We studied 11 β_{2}AR agonists with varying efficacy for coupling the receptor to Gs, Gi or β-arrestin. Receptor-transducer coupling was measured by BRET (Bioluminescence Resonance Energy Transfer) assays (see Methods). We used membranes from Gs-KO cells to assess β_{2}AR-Gi interactions. In this preparation, the BRET signal originating from the rLuc-β_{2}AR donor and Gβ-rGFP acceptor is abolished by pertussis toxin (PTX) treatment, indicating mediation by Gα_{i/o} subunits (Supplementary Figure S1). Thus, by comparing ligand-induced BRET responses in membranes from Gs-KO cells and from Gs-reconstituted cells treated with PTX, we could evaluate the differential ability of agonists to promote receptor coupling to Gi and to Gs, respectively.
From concentration-response (CR) curves of the ligands for Gs, β-arrestin and Gi interactions, bias factors were calculated with 7 different methods using epinephrine as the reference agonist (see Methods and Figs 1 and 2 for the description of the bias assessment strategies) and are summarized in Fig. 3. A common 95% confidence band of ±0.5 about zero bias, as assessed by a Monte Carlo strategy (see details in Supplementary Methods), indicates the boundary of statistical indeterminacy.
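As an illustration of how a reference-normalized bias factor of this kind can be formed, the sketch below assumes the commonly used ΔΔlog(τ/K_A) form of operational-model transduction ratios. It is not a reproduction of any of the 7 methods, which are defined in Methods and Fig. 1. The table `LOG_TR`, the ligand names `LIG1` and `LIG2`, all numerical values, and the sign convention (first pathway minus second) are hypothetical; only the ±0.5 band and the choice of epinephrine as reference come from the text.

```python
# Hypothetical log10(tau/KA) transduction coefficients (invented values).
LOG_TR = {
    "EPI":  {"Gs": 7.0, "arrestin": 6.0},  # reference agonist
    "LIG1": {"Gs": 6.2, "arrestin": 5.3},
    "LIG2": {"Gs": 5.5, "arrestin": 5.6},
}

# Half-width of the 95% confidence band about zero bias quoted in the text.
BAND = 0.5


def delta_delta_log(ligand, path_a, path_b, reference="EPI", table=LOG_TR):
    """Return the bias factor of `ligand` for path_a versus path_b.

    Step 1 normalizes each pathway to the reference agonist (delta log);
    step 2 subtracts the two normalized values (delta-delta log).
    """
    d_a = table[ligand][path_a] - table[reference][path_a]
    d_b = table[ligand][path_b] - table[reference][path_b]
    return d_a - d_b


def is_significant(bias, band=BAND):
    """A bias factor counts as a hit only if it falls outside the band."""
    return abs(bias) > band


if __name__ == "__main__":
    for name in ("LIG1", "LIG2"):
        b = delta_delta_log(name, "arrestin", "Gs")
        print(name, round(b, 2), is_significant(b))
```

By construction the reference agonist has a bias factor of exactly zero in every comparison, which is why the band is centred on zero.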
The bias factors quantifying efficacy differences for the transducer pairs [β-arrestin vs. Gs] and [Gi vs. Gs] are shown in the first two panels of Fig. 3. Some ligands show a significant bias towards transducers other than Gs (e.g. CLEN and CIM, apparently biased towards arrestin and Gi). For most of the agonists, however, it is evident that the different methods generate uncorrelated bias assessments. One exception is formoterol (Fig. 3, middle panel), which shows significant Gs vs. Gi bias according to most of the methods used (discussed below). To resolve this apparent lack of consistency in the analysis, we performed a confirmatory test. We measured the CR curves of the same set of agonists in intracellular cAMP accumulation, which may be considered an alternative indicator of ligand-induced receptor-Gs interaction, and calculated the bias factors between β-arrestin and cAMP responses (Fig. 3, bottom). Theoretically, these additional bias factors should agree with those obtained for β-arrestin vs. Gs (Fig. 3, top), if the methods can detect the same quantity (i.e. true biased efficacy between arrestin and Gs). Thus, for the given ligands, a strong positive correlation among the bias factors in the top and bottom panels should be expected. On the contrary, the calculated correlation coefficients (with method number in parentheses) were remarkably weak, or even negative in one case: 0.06(1), −0.50(2), 0.33(3), 0.23(4), 0.24(5), 0.04(6), 0.00(7). These results indicate that the bias factors shown in Fig. 3, even when statistically significant, cannot be attributed to the presence of true ligand bias with any confidence.
Apparent biased agonism in negative control experiments
To further analyse this inconsistency, we examined a number of negative controls. A good negative control experiment consists of evaluating pairs of responses that unquestionably result from the same receptor-transducer interaction. Any bias detected in such cases represents “non-statistical noise” (i.e. a statistically significant false positive result), because the methods are applied to a trial that is “bias-impossible” by definition. Thus, the imbalance between responses is the result of system bias, which all the computing methods examined in this study should be able to eliminate.
Negative controls consisted of 3 equivalent measures of receptor-Gs interactions. Ligand-induced β_{2}AR-Gs coupling was assessed by: (1) BRET, (2) enhancement of GTPγS binding in a β_{2}AR-Gs fusion protein, and (3) increase of intracellular cAMP accumulation. The first two assays detect the interaction at the most proximal level of receptor stimulation, and both depend on the same molecular event, namely GDP release from Gαs, determined in isolated membranes. The third assay depends on Gs-mediated activation of adenylyl cyclase in intact cells. The results of these experiments are shown in Fig. 4. The general variation pattern of bias factors determined in these negative controls is comparable to that observed previously (Fig. 3). There is an overall similarity between the “bias-impossible” test bench of Fig. 4 and the biologically relevant comparisons of Fig. 3 with regard to the propensity of the 7 methods to generate inconsistent and uncorrelated ligand bias factors. This suggests that none of the results obtained with these procedures can be confidently attributed to the existence of true biased efficacy in a ligand.
In principle, when comparing a cell-dependent response with one obtained in membranes, a false bias factor may result if some ligands have a strong bias towards arrestin: the enhanced interaction with arrestin and consequent internalization could diminish the expected cAMP response compared to the Gs interaction obtained in membranes, where no internalization occurs. However, if such a perturbation existed, it should be consistent with the β-arrestin bias of the ligand that was already determined (Fig. 3). Yet, the significant bias factors (for some ligands with some methods) that were detected in cAMP response versus membrane-dependent BRET/GTPγS responses did not appear to be related to β-arrestin interference. For example, NE, ORCI and CIM, which showed a significant β-arrestin bias with at least one method in the previous analysis (Fig. 3), might be expected to exhibit a bias towards the membrane assay response, due to the quenching effect of β-arrestin-mediated internalization on cAMP. In contrast, these ligands showed either a bias in the opposite direction (NE) or gave inconsistent results (Fig. 4).
To examine convergence, we first evaluated the within-method correlations of the bias factors in Fig. 4. Theoretically, the bias factors calculated from cAMP vs. Gs (BRET) comparisons (Fig. 4, upper panel) should be highly correlated with those calculated from GTPγS binding vs. Gs (BRET) comparisons (Fig. 4, middle panel), because the information delivered is reporting on the same biochemical event. However, the correlation coefficients calculated for all ligands were remarkably low: 0.01(1), −0.24(2), 0.18(3), 0.70(4), 0.72(5), 0.17(6), 0.57(7).
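A within-method correlation of this kind can be sketched as follows. The text does not state which correlation coefficient was used, so a Pearson coefficient is assumed here; the function name `pearson_r` and the example vectors are hypothetical and do not reproduce the data of Fig. 4.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences.

    Here x and y would hold the bias factors of the same ligands, in the
    same order, computed by one method for two different assay comparisons.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)


if __name__ == "__main__":
    # Invented bias factors for five ligands in two comparisons.
    camp_vs_bret = [0.2, -0.4, 0.9, 0.1, -0.6]
    gtp_vs_bret = [0.1, 0.3, -0.2, 0.5, -0.1]
    print(round(pearson_r(camp_vs_bret, gtp_vs_bret), 2))
```

If a method measured the same underlying quantity in both comparisons, the coefficient would be expected to approach 1 as experimental error decreases.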
We then evaluated inter-method correlations by pooling bias factors from Figs 3 and 4 (correlation matrix in Table 1). Apart from relatively high correlations among methods 4–7, which are minor variations of the same procedure (Fig. 1), the overall correlations among the methods were weak. Such poor correlations do not seem to be related to inaccurate determination of specific model parameters. For example, the results from the subset of methods that do not rely on independently measured K_{d} values (i.e. methods 2, 3, 7) were not intrinsically more correlated than the others. In summary, the seven methods documented here, although equivalent in theoretical background, give uncorrelated results when challenged on experimental data. This means that they cannot consistently measure moderate levels of biased efficacy that may exist in the studied ligands.
As a final challenge for consistency, we examined an additional set of negative controls. For this assay, the same response (cAMP accumulation) from the same test panel of agonists was examined in cells with differing levels of β_{2}AR or Gs expression, using two different analytical means of cAMP detection (GloSensor vs. radioimmunoassay [RIA]). This analysis can directly assess the ability of the methods to remove different kinds of system bias. In fact, the differences in receptor or transducer expression can generate a “biological” type of system bias (i.e., differing sensitivity of the signalling pathway to the range of ligand efficacy). Conversely, different methods of cAMP detection can generate “analytical” system bias, because biosensor signals, unlike absolute determinations via RIA, tend to saturate rapidly at high cAMP concentration^{19}. Consistency in the results implies that all methods should report a lack of ligand bias, as there cannot be a difference of efficacy across assays that measure the same transduction pathway. However, we observed significant bias factors in this analysis (Supplementary Figure S8). In fact, most of the ligands exhibited at least one positive hit in the comparisons. Moreover, unlike in previous analyses, the bias factors assigned to a given ligand were often concordant across the 7 methods. Surprisingly, some of the bias factors, which should be zero for these negative controls, were the largest of the study, with log ratios occasionally approaching 2.
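The way a saturating readout can create “analytical” system bias may be illustrated with a toy calculation. The sketch below assumes a simple hyperbolic sensor response, which is an illustrative choice and not the measured behaviour of GloSensor; the function names and all numbers are hypothetical.

```python
def saturating_readout(signal, half_sat):
    """Hypothetical hyperbolic sensor: output flattens as signal grows."""
    return signal / (signal + half_sat)


def relative_max(max_partial, max_full, readout):
    """Apparent intrinsic activity of a partial agonist after a readout
    function has been applied to the underlying maximal signals."""
    return readout(max_partial) / readout(max_full)


if __name__ == "__main__":
    # Underlying maximal cAMP levels (arbitrary units, invented).
    partial, full = 0.5, 1.0

    linear = relative_max(partial, full, lambda s: s)
    saturated = relative_max(
        partial, full, lambda s: saturating_readout(s, half_sat=0.2)
    )
    # The same pair of ligands looks different through the two detectors.
    print(round(linear, 3), round(saturated, 3))
```

In this toy case the partial agonist appears considerably stronger through the saturating detector than through the linear one, although its underlying efficacy is unchanged. A bias method that works as intended should cancel this difference.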
Altogether, these analyses provide compelling evidence that there is a major discrepancy between experimental reality and the theoretical background on which ligand bias computing methods are based.
Bias diagnostics based on the analysis of relative intrinsic activities of ligands
The lack of reliability documented above might reflect excessive sensitivity of bias computing methods to minor deviations between agonist CR curves and model predictions. With this hypothesis in mind, we sought to develop a strategy of ligand bias diagnostics that breaks the model dependency of currently available methods. We reasoned that intrinsic activity (IA, i.e., the relative effects of ligands at receptor saturation) represents a model-free and robust indicator of the power of an agonist to activate a receptor system. In fact, the ranking of IA in agonists provides direct information on the relative efficacy of ligands (setting aside cases of extreme signal amplification, where virtually all agonists may display identical levels of maximal response). Accordingly, biased efficacy can be detected as a change in IA rank ordering across two different transduction systems. We developed two alternative methods to assess such a change. Both rely on the analysis of agonist IAs with respect to a reference trajectory computed from the difference in the CR curves of the full agonist obtained in two different assay responses (Methods; Fig. 1, panels 8 and 9; Fig. 2).
Maximal ligand-induced receptor coupling to Gs, Gi or β-arrestin 2 (assessed by BRET assays using a large set of test agonists) is compared in Fig. 5. The intrinsic activities of the majority of ligands lie within the joint dispersion of the expected rank order (upper panels) and do not significantly depart from the corresponding reference trajectories (lower panels). Only a few ligands displayed a statistically significant biased efficacy in at least one of these comparisons. However, these positive identifications were not consistent across the two variants of the method (compare method 8 with 9, on top and bottom panels of Fig. 5), which indicates that any biased efficacy possessed by such compounds must be of marginal significance.
The propensity of the new methods to generate false positive bias results was checked using negative controls, as described for the other methods. The intrinsic activities of ligands measured in three different assays of the receptor-Gs interaction (cAMP accumulation, GTPγS binding and BRET measurements) were compared (Fig. 6) in both method 8 (upper panel) and method 9 (lower panel). Both results were consistent with an overall lack of biased efficacy in the ligands. Only two agonists, isoetharine and ritodrine, yielded a potential bias in the cAMP-Gs comparison (left panels), but not consistently in both methods. In the GTPγS vs. BRET comparison (right panels, Fig. 6) none of the 31 test ligands displayed a significant bias. Thus, the overall rate of positive hits in negative control experiments obtained with the new methods is considerably lower than the rate observed with the other seven methods analysed before.
To address the question of whether this enhanced reliability is achieved at the cost of reduced detection power, we analysed a positive control consisting of angiotensin receptor 1 agonists previously shown to exhibit a strong level of biased efficacy for arrestin^{14,20}. As shown in Fig. 7 (upper panel), the peptides were first tested using the 7 model-based methods, all of which yielded a clear and statistically significant arrestin bias. Although the reliability of such strategies is significantly improved in the case of strongly biased ligands, there were large differences (up to one order of magnitude) in the extent of bias estimated for the same ligand by different methods. This indicates that the ability of such computations to converge on an exact quantification of the difference in efficacy, while true in principle, is only an approximation in practice. We next analysed the same data with our newly developed model-free approach. For the comparison of IAs in arrestin recruitment and Gq-mediated IP production, only method 9 can be used, since the majority of the ligands in the test panel are biased, making rank ordering (i.e. method 8) unfeasible. As shown in Fig. 7 (bottom panel), the 11 agonists exhibited highly significant deviations from the reference trajectory, consistent with strong efficacy bias towards arrestin and confirming that the method is fully competent in detecting the presence of biased efficacy in ligands.
Overall rating of bias diagnostic methods
To evaluate the relative performance of the bias computing methods examined in this study, we focused on two indicators. One is the residual mean squares (RMS) deviation from the no-bias base level. The other is the average number of ligands with a significant biased efficacy found by each method. Unlike the former, the latter can be applied to both model-dependent (1–7) and model-independent (8, 9) methods, thus allowing a global comparison. RMS deviations for methods 1–7 were estimated separately for the comparisons involving diverse transducers (where biased efficacy may be potentially present, i.e. the Gs-Arrestin-Gi data from Fig. 3) and the comparisons involving a single transducer (where no biased efficacy can exist, e.g. the single-transducer assays of Fig. 4). A global estimate pooling both cases was also calculated.
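The two indicators can be sketched as below. The exact estimator behind the RMS deviation is not spelled out in this section, so the code assumes a root-mean-square distance of the bias factors from zero; the hit count reuses the ±0.5 confidence band quoted earlier. Function names and example values are hypothetical.

```python
import math


def rms_from_zero(bias_factors):
    """Root-mean-square distance of bias factors from the no-bias level (0).

    Assumption: this is one plausible reading of the RMS indicator; the
    estimator actually used may differ.
    """
    if not bias_factors:
        raise ValueError("no bias factors supplied")
    return math.sqrt(sum(b * b for b in bias_factors) / len(bias_factors))


def count_hits(bias_factors, band=0.5):
    """Number of bias factors falling outside the confidence band."""
    return sum(1 for b in bias_factors if abs(b) > band)


if __name__ == "__main__":
    # Invented bias factors produced by one method in a negative control.
    negative_control = [0.1, -0.7, 0.5, 1.2, -0.2]
    print(round(rms_from_zero(negative_control), 3))
    print(count_hits(negative_control))
```

For a negative control, both numbers would ideally be close to zero. The hit count needs only a significant/not significant call per ligand, which is why it can also be applied to methods 8 and 9.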
As shown in Fig. 8A, the 7 model-dependent methods vary in their propensity to attribute false bias factors to ligands, which oddly seems to be even greater in “bias-impossible” than in “bias-possible” cases. Nonetheless, the general similarity of the trends observed in the two cases reinforces our conclusion that none of the bias factors detected by these methods corresponds to a true efficacy difference of the ligand for the transduction proteins under study.
The global method comparison based on the frequency of positive hits allows the generation of a common “reliability scale” (Fig. 8B). It is worth noting that two strategies based on the operational model (methods 1 and 2), and the one based on slope ratios of reciprocal equieffective concentrations (method 7) are most prone to generate false positive results. In contrast, method 4, i.e., ratios of the equieffective occupancy ratios at a single response level, lies at the opposite extreme. It is also clear from this comparison that the new model-free methods of bias diagnostics introduced in this study (methods 8 and 9) generate the lowest level of statistically significant false ligand bias.
Discussion
This study was originally designed to measure differences of efficacy among adrenergic agonists for inducing β_{2}AR interactions with their main transduction proteins, Gs, Gi and β-arrestin 2. Taking advantage of the ability of Mg^{2+} ions to enhance weak receptor-Gi/o protein interactions and using Gnas-KO cell lines^{21} reconstituted or not with the Gα_{s} gene, we established a simple but efficient BRET assay for comparing receptor-Gs and receptor-Gi coupling in membranes. However, as our computations of biased efficacy used a large number of methods, serious inconsistencies in data analysis became evident. We thus turned our attention to two different objectives: full-fledged reliability testing of the currently available methods and an attempt to develop more reliable strategies for diagnostics and quantification of biased agonism.
We have tested 7 different methods of ligand bias computations. All of them are subsumed in the main strategy of quantitative pharmacology for extracting information about ligand intrinsic efficacy from bioassays. The first group (i.e. methods 1−3 in Fig. 1, where method 2 is the most widely used in the literature^{13}) is based on transduction ratios estimated by the operational model^{22}. The second is a mechanism-free group of approaches (i.e., methods 4−7 in Fig. 1) based on the null method, which cancels system-dependent intercellular differences by taking the ratio of agonist concentrations that give equal response; here we explored all possible implementations of this second approach, some of which were not described in detail before, thus representing new alternatives to the more frequently used methods. Despite the differences, all 7 methods are rooted in the same traditional theory of receptor action. Specifically, they share the idea that properties encoded in the chemical structure of the agonist, such as receptor recognition and response-evoking power (i.e., affinity and efficacy), can be discriminated from system parameters, such as cellular concentrations of receptor or transducer and the sensitivity of the signalling network to the extent of transducer activation. Given this common framework, all methods should yield convergent and correlated results in the identification of biased agonism, except for divergences resulting from variations of sensitivity or the different ways in which numerical and experimental errors cross-interact in determining the final result. A categorically different bias-diagnostics method suggested by Barak and Peterson^{23} was excluded from the present analyses, since it is not designed to extract efficacy information from bioassay data and thus to distinguish system bias from ligand bias.
As shown here, these methods cannot identify with confidence whether ligands are weakly biased or unbiased. Two indicators that clearly demonstrate the unreliability of bias factors are the lack of correlation among the test results and the high tendency to generate false positive scores of biased agonism in bioassay comparisons in which only system bias was present (i.e., negative controls). It should be noted, however, that different reporters of the same signalling pathway (e.g., G protein recruitment, GTPγS binding and cAMP for G protein signalling) may be differentially activated by different ligands, depending on the mechanisms that underlie their activation^{24}.
The first indicator is documented by the weak or even negative correlations among the bias factors computed within and between methods. Part of this may be due to the fact that these bias factors were essentially fitting noise from unbiased signalling, thus resulting in a lack of correlation between those values. However, this outcome is in contrast with the expectation that the methods should measure, albeit with different precision or sensitivity, the same quantity. The second indicator, i.e., the high frequency of false positive bias assignments, shows that the methods often fail to accurately discriminate system bias from true ligand bias. This inaccuracy is even more severe if we consider that the Monte Carlo-based limit of statistical significance used here is far more conservative than the commonly used tests based on error propagation rules^{12} (e.g., see Supplementary Figure S1E).
It is worth discussing what major factor might underlie the propensity of the methods to generate erratic and false-positive results. Arguably, the limitations of these assays are similar to those previously discussed for the traditional model of agonist action^{22}. Specifically, the lack of chemical identity for the ‘intrinsic efficacy’ parameter (ε) and the pragmatic but abstract nature of the stimulus-effect relation limit both conceptual value and experimental verification of the main assumptions on which the classical theory is based. Thus, the idea that bioassay analysis can dissect molecular parameters of receptor-transducer interaction from agonist-independent components of physiological response might be fundamentally flawed.
However, as shown previously^{17}, there is a simple mathematical relationship between ε as defined by Furchgott^{15} and the macroscopic cooperativity that best describes in energetic units the allosteric effect of the agonist in stabilizing a transducer-bound receptor complex^{25,26}. Although serendipitous, this relationship predicts that, within a fairly wide range of agonism, the determination of relative efficacies for multiple ligands through such methods should assess the size of that allosteric effect, even if only as a relative value. Moreover, we do not find a tight relationship between the extent of model dependency in a method and its propensity to show false positive results. In fact, methods 1 and 2, which require estimating 3 mechanistic parameters of the operational model (τ, K_{A} and r_{max}), generate the same level of statistically significant noise as method 7, which only relies on two minimal assumptions: i.e., that equal stimuli produce equal effects and that the stimulus is the product of ε and the fraction of bound receptors. The above considerations suggest that neither the parameter definitions nor the dependence on model formulation is likely to be the culprit for the unreliable outcome of the methods under test.
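The two minimal assumptions just mentioned have a simple consequence that can be checked numerically. If the stimulus is ε·[A]/([A]+K) and equal stimuli give equal effects, then the reciprocals of equieffective concentrations of two agonists are linearly related, with slope (ε_A/K_A)/(ε_B/K_B). The sketch below verifies this for invented parameters. It illustrates the classical null-method relation only, not the implementation of method 7, and all names and values are hypothetical.

```python
def stimulus(conc, eps, k):
    """Stimulus = intrinsic efficacy x fractional receptor occupancy."""
    return eps * conc / (conc + k)


def equieffective_conc(target, eps, k):
    """Concentration of an agonist (eps, k) producing stimulus `target`.

    Obtained by inverting `stimulus`; only defined for target < eps.
    """
    if not 0 < target < eps:
        raise ValueError("target stimulus must lie between 0 and eps")
    return target * k / (eps - target)


def reciprocal_slope(eps_a, k_a, eps_b, k_b, conc_b1, conc_b2):
    """Slope of 1/[A] versus 1/[B] through two equieffective pairs."""
    a1 = equieffective_conc(stimulus(conc_b1, eps_b, k_b), eps_a, k_a)
    a2 = equieffective_conc(stimulus(conc_b2, eps_b, k_b), eps_a, k_a)
    return (1 / a2 - 1 / a1) / (1 / conc_b2 - 1 / conc_b1)


if __name__ == "__main__":
    # Invented parameters: A is the stronger agonist, B the weaker one.
    eps_a, k_a = 2.0, 1e-6
    eps_b, k_b = 1.0, 1e-7
    slope = reciprocal_slope(eps_a, k_a, eps_b, k_b, 1e-7, 1e-6)
    expected = (eps_a / k_a) / (eps_b / k_b)
    print(slope, expected)
```

Because the system-dependent stimulus-effect function never enters the calculation, the slope depends only on ligand parameters. This is the sense in which null methods are expected to cancel system bias.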
Interestingly, the method with the lowest score of false positive results (Fig. 5A) is the one using the fewest data points derived from the agonist CR curves. In fact, method 4 is identical to other null method-based strategies (5−7), except that the ratio of equieffective agonist concentrations is computed using only two crossing points at a single response level. Thus, we suspect that the performance of a method worsens as more information is extracted from the agonist CR curves. Conceivably, in the presence of system bias, even minor anomalies that do not significantly alter the “goodness-of-fit” in each individual curve might still be sufficient to give rise to nonzero bias factors in this type of analysis.
The main assumption in classical receptor theory is that functional responses are compared under steady-state conditions of the molecular events generating signalling. Thus, slow kinetics is one factor that might distort CR curves. Agonists with very slow response-eliciting dynamics exist and this anomalous behaviour can affect some bioassays far more than others. As experimentally shown recently^{27}, positive bias factors erroneously indicating differences in transducer efficacy can entirely result from abnormal kinetics. Many additional potential sources of error may cause the anomalies described here, such as nonlinear interactions among pathways that are not codified in the theory, including competition among transducers for the same receptor or feedback control mechanisms acting crosswise among parallel signalling cascades; a comprehensive discussion, however, goes far beyond the scope of this report. We may assert conclusively that the bias factors calculated with these methods should be treated with extreme caution.
To mitigate the dependence of bias analysis on theoretical models and CR curve analysis, we developed a strategy that is designed to detect changes in rank orders of intrinsic activity between signalling pathways arising from different transducers. The underlying idea is that the difference between the CR curves of a full agonist in two bioassays can subsume the variations of intrinsic activities in all other agonists with lesser efficacy. From this difference, the expected relationship between rank orders of all ligands in the two assays can be used as a guide for detecting deviations due to biased efficacy and for assessing statistical significance. We conceived two equivalent methods based on this approach, which differ in whether statistical testing is applied to a remapped rank ordering graph of the data or to the distance between observed points and the predicted trajectory.
As shown in the Results, these new methods are far less prone to generate false positive results in negative control assays, but maintain full capability to identify biased agonists. Although they still depend on CR curve analysis, this is limited to one reference agonist. For all the other test ligands, intrinsic activity is the only information needed. This type of information is far less costly and time-consuming to generate than full CR curves, which is an advantage, particularly in high-throughput screening efforts involving a very large number of substances.
However, these new methods also come with limitations. One is that their optimal diagnostic power is achieved when comparing “low-amplification” bioassays, in which the information about efficacy is more present in the intrinsic activity than in the EC_{50} of the ligands. Such test systems may not always be available. Moreover, method 8 requires a reasonably large number of unbiased ligands in the test panel, in order to maintain statistical resolution. More importantly, although the new methods may provide a measure of deviation from the unbiased trajectory line for rating biased agonism, they cannot quantify the extent of difference in efficacy. We do believe, however, that giving up the ability to quantify a theoretical parameter for an increase in reliability is still a convenient trade-off, particularly if the results of bioassays are intended to be used for expensive screening campaigns in drug discovery or for deciding which ligands should be co-crystallized with receptors in X-ray structural analysis.
Methods
Experimental procedures
Reagents and Drugs
Cell culture media, reagents, and foetal bovine serum (FBS) were from Invitrogen, Gibco or Biological Industries; restriction enzymes were from New England Biolabs. Coelenterazine, bisdeoxycoelenterazine (coelenterazine 400a) and luciferin (Na salt) were from Biotium Inc. or Nanolight Technology. Radiolabeled [^{35}S]GTPγS and [^{125}I]iodocyanopindolol were from Amersham. Adrenergic ligands were from Tocris, Bachem, Santa Cruz or Sigma-Aldrich. Pertussis toxin was from List Biologicals or Sigma-Aldrich. All other reagents were from Sigma-Aldrich, Merck or Fisher Scientific.
Plasmids
Retroviral expression vectors encoding human β_{2}AR tagged at the C-terminus with rLuc, and Gβ_{1} subunit or β-arrestin 2 tagged at the N-terminus with rGFP, were prepared as described previously^{28}. The plasmid encoding the luciferase-based intracellular cAMP probe GloSensor-22F was purchased from Promega. The retroviral vectors with various antibiotic resistance genes (the pQC series), the plasmid encoding the envelope protein VSV-G and the packaging cells GP2-293 were purchased from Clontech.
Cell cultures and transfections
HEK293 (Human Embryonic Kidney) cells were grown in DMEM supplemented with penicillin (100 U/ml), streptomycin (100 μg/ml) and 10% FBS in a humidified atmosphere of 5% CO_{2} at 37 °C. The 2B2 Gnas^{E2/E2} fibroblasts (kindly provided by Murat Baştepe, Harvard Medical School, Boston, MA, USA), in which the 2^{nd} exon of the Gnas gene was ablated, were grown in the same conditions except that DMEM:F12 (1:1) medium supplemented with 5% FBS was used. 2B2 cell clones that stably express rLuc-tagged β_{2}AR alone or together with rGFP-tagged Gβ_{1}, β-arrestin 2 or rGFP-tagged Gβ_{1} + G_{αsL} were obtained by retroviral transfection followed by selection with the appropriate antibiotics: G418 (500 μg/ml), hygromycin (100 μg/ml) and/or puromycin (5 μg/ml). Native or β_{2}AR-overexpressing HEK293 cells, and 2B2 cells (stably cotransfected with β_{2}AR + G_{αsL}), were stably transfected with GloSensor-22F using Lipofectamine 2000 reagent (Thermo Fisher Scientific) as instructed by the supplier.
Membrane preparations
Cells were detached with PBS-EDTA, pelleted at 200 × g for 5 min at room temperature, resuspended in homogenization buffer (5 mM Tris-HCl pH 7.4, protease inhibitor mixture, Roche Diagnostics), and homogenized by passing the suspension 10–15 times through a 26 G syringe needle on ice. The homogenate was centrifuged at 450 × g for 10 min at 4 °C, and the resulting supernatant at 100,000 × g for 30 min at 4 °C (Beckman Coulter Optima LE-80K). The pellet was suspended in 50 mM Tris-HCl (pH 7.4), 10 mM MgCl_{2}, protease inhibitor mixture, and 25% sucrose at a protein concentration of approximately 2 mg/ml, and stored at −80 °C. Membranes used in BRET assays were prepared and stored in exactly the same way, except that only distilled water + protease inhibitor mixture was used in all steps.
BRET assays for β_{2}AR interactions with Gs, Gi or arrestin
BRET signals were recorded and analyzed essentially as described previously^{29,30}. Briefly, receptor-Gβ_{1} interactions, which require the presence of a Gα interaction, were measured in 96-well white plastic plates (Packard Optiplate or Greiner) using membrane preparations in a total volume of 100 μl of PBS. Assays were started by pipetting coelenterazine into the plate wells containing membranes in PBS with 10 mM MgCl_{2}. The incubation lasted 8 minutes (the time necessary to attain a stable basal BRET ratio, which then remains constant for about 20 min). Next, ligands at varying concentrations were rapidly added using multichannel pipettes and incubated for an additional 3 minutes before counting the plate. The intensity of luminescence in the wells was counted sequentially at two wavelengths using either a Victor Light (equipped with two bandpass filters, 450/10 nm and 510/10 nm) or an Envision luminometer (bandpass filters, 470/30 and 515/30 nm) (both from Perkin Elmer). Kinetic measurements were used to verify that the change of BRET ratio induced by ligands was at plateau after 3 minutes of incubation. Two lines of 2B2 cells coexpressing the rLuc-β_{2}AR and rGFP-Gβ_{1} constructs, one without and the other with retrovirally expressed G_{αsL}, were used to distinguish β_{2}AR interactions with Gi/o and Gs. In membranes prepared from the first cell line the BRET signal is Gi-dependent. In membranes prepared from the Gs-reconstituted line previously treated with pertussis toxin (10 ng/ml, 18 h) the BRET signal is Gs-dependent (see also Supplementary Figure S1). Experiments were conducted independently in two different laboratories. Depending on the specific activities of the membrane preparations, 1–3 μg of membrane protein per well and 1–5 μM (final concentration) of coelenterazine were used. All these experiments were pooled and the average results are presented.
Receptor-β-arrestin 2 interactions were measured in intact cell monolayers using the luciferase substrate analogue bisdeoxycoelenterazine (bDOC, 5 μM) as described previously^{28}. Basal-subtracted concentration-response (CR) curves of BRET ratios (510 nm/450 nm or 515 nm/470 nm) for different agonists were normalized with respect to the epinephrine maximal response.
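The normalization just described (basal subtraction followed by scaling to the epinephrine maximum) can be sketched as follows. This is only an illustration: the function names and the numbers in the usage note are hypothetical, and the ratio is assumed to be acceptor counts divided by donor counts, as implied by the wavelength pairs quoted above.

```python
def bret_ratio(acceptor_counts, donor_counts):
    """BRET ratio: acceptor-channel counts over donor-channel counts."""
    return acceptor_counts / donor_counts


def normalize_cr_curve(ratios, basal_ratio, reference_max_net):
    """Basal-subtract each BRET ratio and scale it to the reference agonist.

    reference_max_net is the basal-subtracted maximal response of the
    reference agonist (epinephrine in the text), so the result is a
    fraction of the reference maximum.
    """
    return [(r - basal_ratio) / reference_max_net for r in ratios]
```

For instance, with a basal ratio of 0.50 and an epinephrine maximum of 0.70 (net 0.20), a test ligand reaching a ratio of 0.60 would score about 0.5 on the normalized scale.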
cAMP Measurements
GloSensor-22F-expressing cells were seeded in 96-well white plates at a density of 25 × 10^{3} cells/well 24 hours before the experiment. Two hours before the assay, cells were washed once with PBS, and further incubated for 105 minutes in PBS + 25 mM glucose in the presence of 3 mM luciferin in a total volume of 60 μl. The assay was started by adding agonists at varying concentrations and 1 mM (final) IBMX (3-isobutyl-1-methylxanthine) on the thermostatic plate of the luminometer set at 37 °C in a total volume of 100 μl/well. Luminescence from each well was then counted every 30 s with 0.5 s integration time for 20 minutes. Ligands and IBMX were diluted in PBS + 25 mM glucose. The cAMP response was determined as the peak of luminescence intensity, the area under the 20 min luminescence-time curve, or the initial rate of luminescence increase. All three methods gave similar results. The resulting CR curves for each ligand were corrected either with respect to the forskolin (10 μM) response obtained in parallel experiments, or with respect to the maximal responses measured at saturating concentrations of the ligands. All curves were then further normalized with respect to epinephrine’s maximum response. The same procedure was followed for cAMP measurement by RIA, except that the assay was started by adding ligands + IBMX at 37 °C, and terminated after 5 minutes by adding 100 μl 0.2 N HCl. The RIA procedure for determination of cAMP concentration in the cell extracts was described before^{31}. Data were first corrected for the number of living cells determined by MTT assay (Sigma-Aldrich) in parallel experiments, and then normalized with respect to the epinephrine responses.
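The three summary measures of the luminescence time course named above (peak, area under the curve, initial rate) can be computed along the following lines. This is a minimal sketch, not the authors' script: the trapezoidal rule for the area and a least-squares slope over the first few readings for the initial rate are assumptions, and all function names are hypothetical.

```python
def peak(signal):
    """Peak luminescence over the recording."""
    return max(signal)


def area_under_curve(times, signal):
    """Area under the luminescence-time curve (trapezoidal rule, assumed)."""
    pairs = list(zip(times, signal))
    return sum((t1 - t0) * (y0 + y1) / 2.0
               for (t0, y0), (t1, y1) in zip(pairs, pairs[1:]))


def initial_rate(times, signal, n_points=3):
    """Least-squares slope over the first n_points readings (assumed window)."""
    ts, ys = times[:n_points], signal[:n_points]
    mt, my = sum(ts) / len(ts), sum(ys) / len(ys)
    sxy = sum((t - mt) * (y - my) for t, y in zip(ts, ys))
    sxx = sum((t - mt) ** 2 for t in ts)
    return sxy / sxx
```

Each measure is computed per well and per agonist concentration, and the resulting values form the CR curve that is then normalized as described.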
Arrestin recruitment and inositol 1-phosphate assay of angiotensin 1 receptor agonists
Data for arrestin recruitment and inositol 1-phosphate signalling are from Rajagopal et al.^{14}, where full details for those assays are noted.
GTPγS binding in β_{2}AR-Gα_{s} fusion proteins
The preparation of cells expressing β_{2}AR-Gα_{s} fusion protein and the [^{35}S]GTPγS binding assay used to determine agonist responses have been described previously^{32}.
Radioligand binding assays
Receptor binding experiments were carried out in membrane preparations from pertussis toxin-treated (200 ng/ml, overnight) 2B2 cells that express rLuc-tagged β_{2}AR. Membranes (0.5–1 μg protein/well) were incubated with [^{125}I]iodocyanopindolol (15,000 dpm/well; ~20 pM) in the presence of varying concentrations of competitor ligands, GDP (100 μM), AlCl_{3} (20 μM) and NaF (10 mM) in a total volume of 250 μl buffer (100 mM KCl, 10 mM MgCl_{2}, 50 mM Tris-HCl pH 7.4) for 2 hours at 37 °C in 96-well plates. The reaction was stopped by rapid filtration through Whatman GF/B glass fiber filters using a cell harvester (Skatron Instruments). Radioactivity on the filters was counted using a scintillation counter (Wallac MicroBeta 1450 Trilux). K_{d} values for the competitor ligands were estimated by nonlinear regression using the numerical solution of the bimolecular binding equation in the presence of multiple ligands, including nonspecific binding parameters, implemented in a Visual Basic routine in MS Excel. For each ligand, the average of 2 independent binding curves obtained in quadruplicate was used for the regression analysis. In all regressions the K_{d} value for the radioligand was fixed at 40 pM, as determined by saturation binding assays. On average, the receptor density in these membranes was calculated to be 23 ± 3 pmol/mg protein (n = 29). Serial dilutions of salmeterol were made in the presence of 0.1% (w/v) bovine albumin for all experiments.
Analysis of the agonist CR curves
The curves obtained in various bioassays were fitted either with the operational model (methods 1, 2) or with a 3-parameter logistic function (methods 3–7). For both models we used the log-form of the original equations, which behave better than the original ones in the fitting procedures. For the operational model, as suggested by Kenakin et al.^{13}, we used:
For the 3parameter logistic, we used:
The estimated parameters are r_{max}, n (slope factor), lnτ (and lnK_{d}) in the case of the operational model, or E_{max}, n (slope factor) and lnEC_{50} in the case of the 3-parameter logistic model. Note that the slope factor in the operational model is not equivalent to that of the logistic model. For the estimation of regression summary statistics, including the 95% confidence band for the fitted curves, we used the standard asymptotic methods (see for example Bates and Watts^{33}). We used the average of at least 3 independent triplicate or quadruplicate CR curves in each regression analysis. Exact numbers of data points and replicates used in the analyses are indicated case by case in the figure legends. All computational procedures were programmed as scripts written in MS Excel Visual Basic and/or Matlab, whenever appropriate.
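As an illustration of what a log-form parameterization of the two models can look like, the sketch below writes both response functions in terms of lnEC_{50}, lnτ and lnK_{d}. The exact equations used in the paper are not reproduced in this text, so the forms below are assumptions: the logistic is the usual Hill form, and the operational model is the standard Black and Leff expression rearranged algebraically. Function names are hypothetical.

```python
import math


def logistic3(ln_conc, e_max, n, ln_ec50):
    """3-parameter logistic: E = E_max / (1 + (EC50/[L])^n), in log form."""
    return e_max / (1.0 + math.exp(n * (ln_ec50 - ln_conc)))


def operational(ln_conc, r_max, n, ln_tau, ln_kd):
    """Operational model in log form.

    Equivalent to r_max * tau^n [L]^n / ((K_d + [L])^n + tau^n [L]^n),
    rewritten as r_max / (1 + ((K_d + [L]) / (tau [L]))^n).
    """
    ln_sum = math.log(math.exp(ln_kd) + math.exp(ln_conc))
    return r_max / (1.0 + math.exp(n * (ln_sum - ln_tau - ln_conc)))
```

Estimating lnτ, lnK_{d} and lnEC_{50} rather than the untransformed parameters keeps them positive by construction and tends to make the error surface more symmetric, which is one common reason for preferring log forms in nonlinear regression.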
Analysis of biased agonism
In the present analyses we used nine different methods of bias computing (indicated below with numbers in the parentheses). The first seven comprise most of the currently used strategies, which can be classified in two groups: those based on the operational model and those based on the null method. Additionally, we introduce here two novel methods that represent a third group of modelindependent interrelated strategies of bias diagnostics.
Strategies based on the operational model
The operational model^{22} is a special case of Stephenson’s theory of receptor action^{16}, which is schematically summarized as follows:
A ligand (L) binds to the receptor according to a bimolecular reversible interaction:
where O is receptor occupancy, R_{t}, total receptors, and K_{d} the equilibrium dissociation constant for the binding reaction. The ligand-receptor complex is assumed to produce a “stimulus” (s) in the cell that is proportional to both O and the ligand’s intrinsic efficacy ε, i.e., s = ε O.
The observed biological response r is related to the stimulus by an unspecified but monotonic function, i.e., response, r = f(s). If we posit that f may be a 3-parameter hyperbolic function, where r_{max} is the global maximum, n a slope factor, and K_{E} a sensitivity parameter, we get the operational model. This is defined as:
Equation (2) implies that the maximum response of the ligand (E_{max}) is defined as
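For reference, the relationships described in the preceding paragraphs can be written out explicitly. The block below is a reconstruction from the definitions given in the text, using the standard Black and Leff form of the operational model; the definition τ = εR_{t}/K_{E} is an assumption, chosen because it is consistent with the statement that R_{t} and K_{E} cancel in τ ratios.

```latex
\begin{align*}
O &= \frac{R_t\,[L]}{[L] + K_d}, \qquad s = \varepsilon\, O \\[4pt]
r &= f(s) = \frac{r_{max}\, s^{n}}{K_E^{\,n} + s^{n}}
   = \frac{r_{max}\,\tau^{n}\,[L]^{n}}{\left(K_d + [L]\right)^{n} + \tau^{n}\,[L]^{n}},
   \qquad \tau = \frac{\varepsilon\, R_t}{K_E} \\[4pt]
E_{max} &= \lim_{[L]\to\infty} r = \frac{r_{max}\,\tau^{n}}{1 + \tau^{n}}
\end{align*}
```

The first line is the occupancy and stimulus, the second the operational model, and the third the ligand-specific maximal response obtained by letting [L] grow without bound.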
Estimates of τ, which contain information about ligand efficacy, might in principle be obtained by fitting the above equation to agonist CR curves. The ratios of τ values with respect to a reference agonist in a given response yield the relative ε values of the agonists, as the system-dependent parameters, R_{t} and K_{E}, cancel in the τ ratios. Thus, for any given ligand the log-ratio of relative efficacies at two different responses provides the bias factor^{34,35}.
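The bias factor described here is a difference of log-ratios, and the arithmetic can be sketched as below. The use of base-10 logarithms and the function names are assumptions; the same function applies to any activity index that is scaled to a reference agonist, such as τ or τ/K_{d}.

```python
import math


def log_relative_activity(test_value, reference_value):
    """log10 of a test ligand's activity index relative to the reference."""
    return math.log10(test_value / reference_value)


def bias_factor(test_a, ref_a, test_b, ref_b):
    """Log-ratio of relative activities between response A and response B.

    Positive values indicate that, relative to the reference agonist,
    the test ligand favours response A over response B.
    """
    return (log_relative_activity(test_a, ref_a)
            - log_relative_activity(test_b, ref_b))
```

With hypothetical τ values of 10 (test) and 100 (reference) in response A, and 1 (test) and 100 (reference) in response B, the relative efficacies are 0.1 and 0.01 and the bias factor is 1 log unit towards A.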
However, due to the identifiability problem in the regression of the operational model, K_{d} and τ cannot be estimated independently especially for strong agonists. This leaves only two viable options. Either K_{d} values are experimentally measured and used to constrain the regression, thus obtaining a reliable estimate of τ from the computer fit^{14,36}, or the two parameters are computed as unresolved τ/K_{d} ratios from the regression^{13}. In the latter case no relative efficacy of agonists can be measured from each individual response^{17}, but the bias factors are still computable as ratios of τ/K_{d} ratios across different responses. Selecting the best of these approaches is often not a free choice, because ligand binding constants cannot be determined in all receptor systems. Therefore, we used both strategies here. Specifically:
(1) The regression of the model was constrained using the K_{d} values of the agonists, which were obtained from radioligand binding experiments in the absence of receptor-transducer interaction; then the bias factors were computed as (log) ratios of relative efficacies (see Supplementary data S9 for the full set of competition binding curves and corresponding logK_{d} values).
(2) The K_{d} values were treated as unknown parameters and left free to vary in the model regression; then the bias factors were computed as (log) ratios of τ/K_{d} ratios (panels 1 and 2, Fig. 1).
(3) We also computed bias factors using a simpler approach, by determining the E_{max}/EC_{50} ratios^{37}. In fact, in the operational model the median effective concentration is defined as
Thus, in case of a slope factor very close to one, the ratio E_{max}/EC_{50} is:
Since r_{max} (the maximal possible response in each measured biological system) is a system-dependent parameter, and thus common to all agonists, it cancels when all E_{max}/EC_{50} ratios are scaled to that of a reference ligand. Hence, the (log) ratio of the relative values of E_{max}/EC_{50} at two different responses provides a bias factor estimate. This is equivalent to that obtained from the second strategy described above, if the slope factors are all equal or very close to unity. We estimated E_{max} and EC_{50} by fitting a 3-parameter logistic function to the CR curves (see panel 3 in Fig. 1).
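The link between E_{max}/EC_{50} and τ/K_{d} can be checked with a short derivation. The lines below are a reconstruction that assumes the standard operational model, r = r_{max}τ^{n}[L]^{n} / ((K_{d} + [L])^{n} + τ^{n}[L]^{n}), and simply solve r = E_{max}/2 for [L]; they are not copied from the original equations.

```latex
\begin{align*}
\frac{\tau^{n}[L]^{n}}{(K_d + [L])^{n} + \tau^{n}[L]^{n}}
  &= \frac{1}{2}\,\frac{\tau^{n}}{1 + \tau^{n}}
  \;\Longrightarrow\; (2 + \tau^{n})\,[L]^{n} = (K_d + [L])^{n} \\[4pt]
EC_{50} &= \frac{K_d}{\left(2 + \tau^{n}\right)^{1/n} - 1} \\[4pt]
n = 1:\quad EC_{50} &= \frac{K_d}{1 + \tau}, \qquad
E_{max} = \frac{r_{max}\,\tau}{1 + \tau}, \qquad
\frac{E_{max}}{EC_{50}} = r_{max}\,\frac{\tau}{K_d}
\end{align*}
```

For n = 1 the factor (1 + τ) cancels exactly, which is why E_{max}/EC_{50} scaled to a reference agonist reduces to the relative τ/K_{d}; for slope factors far from unity the cancellation is only approximate.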
Strategies based on the “null method” according to Stephenson-Furchgott theory
The “null hypothesis” in Stephenson-Furchgott (SF) theory states that equal stimuli should produce equal responses. Thus, two different ligands, L_{r} and L_{t} (with subscripts indicating reference and test agonist) will generate equal responses when ε_{r}O_{r} = ε_{t}O_{t}. Therefore, the ratio of occupancies, O_{r}/O_{t}, measured at equieffective concentrations of the agonists corresponds to the relative efficacy, ε_{t}/ε_{r}. Bias factors can be computed from the (log) ratios of the equieffective occupancy ratios measured for two different biological responses. There are again two options in applying this strategy, depending on whether K_{d} values can or cannot be measured experimentally.
In the first case, fractional occupancies at equieffective concentrations ([L]_{eq}) are calculated as [L]_{eq}/([L]_{eq} + K_{d}). By inverting the 3-parameter logistic functions fitted to the agonist CR curves, equieffective concentrations are computed. For derivation of equieffective occupancy ratios we used three possible variants of the approach:
(4) Selecting a single response level common to all agonists (panel 4, Fig. 1).
(5) Using multiple response levels within a range common to all agonists (panel 5, Fig. 1).
(6) Finally, using multiple response levels within a variable range that was optimized according to each pair of reference/test agonist under study (panel 6, Fig. 1).
Note that in methods (5) and (6), all the determinations made at the multiple response levels were averaged to compute occupancy ratios and relative efficacies, as all should converge to a unique theoretical estimate, with variations only due to experimental error.
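The occupancy-based null method can be sketched as follows: invert the fitted logistic to obtain equieffective concentrations, convert them to fractional occupancies, and average the log occupancy ratios over the chosen response levels. This is an illustration under stated assumptions, not the authors' implementation: the logistic is taken in Hill form, the averaging is done on log10 ratios, and all names are hypothetical. A single response level corresponds to method 4, several levels to methods 5 and 6.

```python
import math


def equieffective_conc(response, e_max, n, ec50):
    """Invert E = e_max / (1 + (ec50/c)^n) for the concentration c."""
    if not 0.0 < response < e_max:
        raise ValueError("response must lie strictly between 0 and e_max")
    return ec50 * (response / (e_max - response)) ** (1.0 / n)


def occupancy(conc, kd):
    """Fractional occupancy [L] / ([L] + K_d)."""
    return conc / (conc + kd)


def log_relative_efficacy(levels, ref_fit, ref_kd, test_fit, test_kd):
    """Mean log10(O_ref / O_test) over the given response levels.

    ref_fit and test_fit are (e_max, n, ec50) tuples from the logistic fit.
    By the null hypothesis, O_ref / O_test estimates eps_test / eps_ref.
    """
    logs = []
    for r in levels:
        o_ref = occupancy(equieffective_conc(r, *ref_fit), ref_kd)
        o_test = occupancy(equieffective_conc(r, *test_fit), test_kd)
        logs.append(math.log10(o_ref / o_test))
    return sum(logs) / len(logs)
```

The bias factor is then the difference between the values returned for two different responses. Response levels must lie below the maximal response of the weaker agonist, which is why the usable range depends on the ligand pair.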
In the second case (i.e., unknown K_{d}’s), SF theory suggests a strategy that does not require equieffective occupancy ratio calculations, hence, no information about the binding constant K_{d}. Briefly, at equieffective concentrations of reference (L_{r}) and test ligand (L_{t}), the theory predicts the following equality (with K_{r} and K_{t} being the respective binding constants of reference and test agonists):
(7) Equation (5) shows that there is a linear relationship between the reciprocal equieffective concentrations of reference and test agonist, with slope yielding the relative value of ε/K_{d} of the test agonist with respect to that of the reference agonist. Accordingly, we compute the bias factors from the (log) ratio of the slopes of the double-reciprocal lines determined in the two different assay systems. This procedure is schematized in panel 7, Fig. 1. A very similar approach, albeit used in a different context, has been proposed by Rajagopal et al.^{14}.
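Setting ε_{r}O_{r} = ε_{t}O_{t} and taking reciprocals gives, under the assumptions of SF theory, $1/[L_t] = \frac{\varepsilon_t K_r}{\varepsilon_r K_t}\,\frac{1}{[L_r]} + \frac{\varepsilon_t - \varepsilon_r}{\varepsilon_r K_t}$. This form is a reconstruction consistent with the description of the slope given above, and the choice of which reciprocal goes on which axis is an assumption. The sketch below estimates that slope by ordinary least squares from pairs of equieffective concentrations and turns two slopes into a bias factor; the function names are hypothetical.

```python
import math


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def double_reciprocal_slope(ref_concs, test_concs):
    """Slope of 1/[L_t] versus 1/[L_r] at equieffective concentrations.

    Under the null method this estimates (eps_t/K_t) / (eps_r/K_r).
    """
    x = [1.0 / c for c in ref_concs]
    y = [1.0 / c for c in test_concs]
    return ols_slope(x, y)


def bias_from_slopes(slope_a, slope_b):
    """log10 ratio of double-reciprocal slopes from two assay systems."""
    return math.log10(slope_a / slope_b)
```

Because only concentrations enter the regression, no binding constant is needed, at the cost of estimating the composite ε/K_{d} rather than ε itself.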
Model-independent strategies
We have developed two new semi-quantitative diagnostic tools to assess bias. Both are based on the evaluation of relative intrinsic activities (IA’s) of ligands observed in two signalling pathways (i.e., the maximal responses of all agonists normalized to that of a full agonist producing the highest level of response in both assays). The underlying empirical principle is that the rank order of agonist IA’s for two different responses that are mediated by the same receptor/transducer pair is invariant when there is no ligand bias. Although system bias may generate strong nonlinearity in the relationship between sets of IA’s measured at two different responses, it cannot change the relative rank of the agonists in these responses. This rank only depends on relative efficacies. However, in the presence of nonlinearity combined with experimental error, it can be virtually impossible to assess whether a true and significant change in rank ordering exists.
To solve the problem we took advantage of the idea that the CR curve of a full agonist in each bioassay spans all possible levels of intrinsic activity that all other agonists can show in that system. Thus, from the analysis of the best fitting full agonist CR curves generated in the two assays we can specify the exact shape of the theoretical trajectory that the relationship between IA’s of all unbiased agonists should follow. This “reference trajectory” becomes a ruler for testing if changes in rank ordering of intrinsic activity exist and are statistically significant. Two different strategies were used (schematized in Fig. 1, panels 8 and 9).
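One way to make the reference trajectory concrete is to eliminate concentration between the two normalized curves of the full agonist. The sketch below assumes each curve is a two-parameter logistic of the form Y = 1/(1 + (c/x)^b), with c the EC_{50} and b the slope factor; the equation actually shown in Fig. 2A is not reproduced in this text, so this form and the function name are assumptions.

```python
def reference_trajectory(y, c, b, c2, b2):
    """Expected IA in assay 2 for an unbiased ligand with IA = y in assay 1.

    (c, b) and (c2, b2) are the EC50 and slope factor of the full agonist
    in assay 1 and assay 2. The concentration x giving response y in
    assay 1 is recovered and substituted into the assay 2 curve.
    """
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    x = c * (y / (1.0 - y)) ** (1.0 / b)
    return 1.0 / (1.0 + (c2 / x) ** b2)
```

When the two assays have identical curves the trajectory is the identity line. If the second assay is ten times more sensitive (c2 = c/10, unit slopes), an unbiased ligand with IA = 0.5 in the first assay is expected near 0.91 in the second, which illustrates how amplification bends the trajectory without changing rank order.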
Analysis of changes in rank ordering (method 8)
This analysis is best suited to the situation in which a larger set of unbiased ligands comprising a wide range of IA’s coexists with a smaller subset of biased agonists. In the absence of agonist bias, the IA’s of all ligands should lie on the reference trajectory (Fig. 2, panel A). Deviations that are not justified by experimental error identify biased agonists. To perform the analysis, theoretical and experimental points are translated into a uniform rank ordering map. Experimental error perturbs this ordering in a peculiar way, depending on the shape of the reference trajectory and the error structure of experimental data (Fig. 2, panels B and C). Under the statistical null hypothesis (i.e., no agonist bias is present), a joint distribution for the random error perturbations can be built and used to assess the significance of potentially biased ligands. The whole procedure is described by the following sequence of steps:
(a) Construct the reference trajectory from the fitted CR curves of the full agonist in the two signalling pathways as shown in Fig. 2, panel A, by fitting a two-parameter logistic equation to the normalized CR curves of the reference agonist obtained at two different responses. Using the fitted values of the EC_{50}’s (i.e., c and c’), the slope factors (i.e., b and b’) and the equation shown in Fig. 2, panel A (i.e., Y vs. Y’), draw the reference trajectory on a plot whose coordinates are the ranges of IA for the two bioassays under test.
(b) With the two experimental IA values of each test ligand, generate the observed point and identify the corresponding “theoretical” point (Fig. 2, panel B), i.e., the point on the reference trajectory that minimizes the distance between the curve and the observed point. For distance calculation we used an anisotropic metric that takes into account the statistical uncertainty in both axes (see Supplemental method and figure SM2[A–C] for further details).
(c) Assign to each observed and projected point two integers representing the sequence numbers – i.e. rank orders – in the sorted list of intrinsic activity for each response. The rank orders of the theoretical points trace the identity line, since the reference trajectory is a monotonic curve (Fig. 2C).
(d) Perturb the positions of these theoretical points by adding to their x and y coordinates two random numbers generated from uncorrelated normally distributed random variables with zero means and standard deviations equal to those of the corresponding experimental points; re-sort the perturbed IA’s and reassign new rank orders as described above; record the resulting rank orders; repeat this step 500,000 times.
(e) From the recorded set of rank orders, calculate the joint frequencies of the ranks. Find the envelope that encloses 95% of the joint ranks (i.e., the 95% confidence contour of the joint distribution).
(f) Identify as biased agonists those ligands that lie outside the confidence contour by comparing experimentally determined ranks with the joint distributions of the ranks obtained under the statistical null hypothesis (schematics in Fig. 2D).
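Steps (c) to (f) can be sketched as a small Monte Carlo routine. This is a simplified illustration and not the authors' code: the joint rank frequencies are pooled over all ligands, the 95% “envelope” is approximated as the smallest set of rank pairs whose pooled frequency reaches 0.95, and far fewer iterations than 500,000 are used by default. All function names are hypothetical.

```python
import random
from collections import Counter


def ranks(values):
    """Rank order (1 = smallest) of each value in the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def joint_rank_frequencies(theo_x, theo_y, sd_x, sd_y, n_iter=20000, seed=0):
    """Pooled joint frequencies of (rank_x, rank_y) under the null hypothesis."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_iter):
        # Step (d): add uncorrelated normal errors to the theoretical points.
        px = [rng.gauss(m, s) for m, s in zip(theo_x, sd_x)]
        py = [rng.gauss(m, s) for m, s in zip(theo_y, sd_y)]
        counts.update(zip(ranks(px), ranks(py)))
    total = n_iter * len(theo_x)
    return {pair: c / total for pair, c in counts.items()}


def confidence_region(freqs, level=0.95):
    """Step (e), approximated: most frequent rank pairs summing to `level`."""
    region, cumulative = set(), 0.0
    for pair, f in sorted(freqs.items(), key=lambda kv: -kv[1]):
        region.add(pair)
        cumulative += f
        if cumulative >= level:
            break
    return region


def flag_biased(obs_x, obs_y, region):
    """Step (f): True for ligands whose observed rank pair is outside the region."""
    return [pair not in region for pair in zip(ranks(obs_x), ranks(obs_y))]
```

Here `theo_x` and `theo_y` hold the coordinates of the projected (theoretical) points from step (b), and `sd_x` and `sd_y` the experimental standard deviations of the corresponding observed IA's.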
Analysis of deviations from expected trajectory (method 9)
Unlike the rank ordering analysis described above, this second approach does not require coverage of a wide range of IA in test ligands; thus, it can be applied to a few or even a single test agonist. It consists in testing the statistical significance of the distance between any observed value of the IA plot (i.e., representing the experimentally measured IA’s of a given ligand in the two bioassays) and the corresponding “theoretical” point on the reference trajectory. Both the theoretical trajectory and the projections of observed IA’s on this reference trajectory are determined as described in steps (a) and (b) of the previous section. Next, we compute 95% confidence ellipses for the observed points and corresponding theoretical points on the trajectory (Fig. 2E). The ellipses for the observed points are constructed from the experimentally observed errors of the IA’s; those of the theoretical points are calculated from the asymptotic 95% confidence band of the nonlinear regression of the CR curves (blue or red dotted curves in Fig. 2A; see also Supplemental Figure SM2, panels D–E, for more details). An agonist is considered biased when the observed and the projected trajectory points have non-intersecting confidence ellipses. Excel templates and Matlab scripts that help to implement the above computations are available from the first author.
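The non-intersection criterion can be illustrated with a simple numerical test. The sketch assumes axis-aligned ellipses (uncorrelated errors on the two IA axes) whose semi-axes are the standard deviations scaled by the square root of 5.991, the 95th percentile of a chi-square distribution with 2 degrees of freedom, and it detects overlap approximately by sampling points on each boundary. How the ellipses were built in the paper, in particular those derived from the regression confidence band, may differ; names are hypothetical.

```python
import math

CHI2_95_2DF = 5.991  # 95th percentile of chi-square with 2 degrees of freedom


def inside(point, center, sd, scale=CHI2_95_2DF):
    """True if point lies within the confidence ellipse around center."""
    dx = (point[0] - center[0]) / sd[0]
    dy = (point[1] - center[1]) / sd[1]
    return dx * dx + dy * dy <= scale


def boundary(center, sd, n=720, scale=CHI2_95_2DF):
    """n points sampled on the boundary of the confidence ellipse."""
    r = math.sqrt(scale)
    return [(center[0] + r * sd[0] * math.cos(2.0 * math.pi * k / n),
             center[1] + r * sd[1] * math.sin(2.0 * math.pi * k / n))
            for k in range(n)]


def ellipses_intersect(c1, sd1, c2, sd2):
    """Approximate overlap test: any boundary point of one inside the other."""
    if inside(c1, c2, sd2) or inside(c2, c1, sd1):
        return True
    return (any(inside(p, c2, sd2) for p in boundary(c1, sd1))
            or any(inside(p, c1, sd1) for p in boundary(c2, sd2)))


def is_biased(observed, sd_obs, theoretical, sd_theo):
    """Biased when the two 95% confidence ellipses do not intersect."""
    return not ellipses_intersect(observed, sd_obs, theoretical, sd_theo)
```

Because the boundary is sampled at a finite number of points, a very thin overlap could be missed; an exact ellipse-ellipse intersection test would remove that limitation.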
Additional Information
How to cite this article: Onaran, H. O. et al. Systematic errors in detecting biased agonism: Analysis of current methods and development of a new model-free approach. Sci. Rep. 7, 44247; doi: 10.1038/srep44247 (2017).
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Hall, R. A., Premont, R. T. & Lefkowitz, R. J. Heptahelical receptor signaling: beyond the G protein paradigm. J Cell Biol 145, 927–32 (1999).
2. Urban, J. D. et al. Functional selectivity and classical concepts of quantitative pharmacology. J Pharmacol Exp Ther 320, 1–13 (2007).
3. Kenakin, T. Agonist-receptor efficacy. II. Agonist trafficking of receptor signals. Trends Pharmacol Sci 16, 232–8 (1995).
4. Berg, K. A. et al. Effector pathway-dependent relative efficacy at serotonin type 2A and 2C receptors: evidence for agonist-directed trafficking of receptor stimulus. Mol Pharmacol 54, 94–104 (1998).
5. DeWire, S. M. & Violin, J. D. Biased ligands for better cardiovascular drugs: dissecting G-protein-coupled receptor pharmacology. Circ Res 109, 205–16 (2011).
6. Luttrell, L. M., Maudsley, S. & Bohn, L. M. Fulfilling the Promise of “Biased” G Protein-Coupled Receptor Agonism. Mol Pharmacol 88, 579–88 (2015).
7. Mailman, R. B. GPCR functional selectivity has therapeutic impact. Trends Pharmacol Sci 28, 390–6 (2007).
8. Wisler, J. W., Xiao, K., Thomsen, A. R. & Lefkowitz, R. J. Recent developments in biased agonism. Curr Opin Cell Biol 27, 18–24 (2014).
9. Rajagopal, S., Rajagopal, K. & Lefkowitz, R. J. Teaching old receptors new tricks: biasing seven-transmembrane receptors. Nat Rev Drug Discov 9, 373–86 (2010).
10. Stallaert, W., Christopoulos, A. & Bouvier, M. Ligand functional selectivity and quantitative pharmacology at G protein-coupled receptors. Expert Opin Drug Discov 6, 811–25 (2011).
11. Figueroa, K. W., Griffin, M. T. & Ehlert, F. J. Selectivity of agonists for the active state of M1 to M4 muscarinic receptor subtypes. J Pharmacol Exp Ther 328, 331–42 (2009).
12. Kenakin, T. & Christopoulos, A. Signalling bias in new drug discovery: detection, quantification and therapeutic impact. Nat Rev Drug Discov 12, 205–16 (2013).
13. Kenakin, T., Watson, C., Muniz-Medina, V., Christopoulos, A. & Novick, S. A simple method for quantifying functional selectivity and agonist bias. ACS Chem Neurosci 3, 193–203 (2012).
14. Rajagopal, S. et al. Quantifying ligand bias at seven-transmembrane receptors. Mol Pharmacol 80, 367–77 (2011).
15. Furchgott, R. F. The use of beta-haloalkylamines in the differentiation of the receptors and in the determination of dissociation constants of receptor-agonist complexes. In Advances in Drug Research, Vol. 3 (eds. Harper, N. J. & Simmonds, A. B.) 21–55 (Academic Press, New York, 1966).
16. Stephenson, R. P. A modification of receptor theory. Br J Pharmacol Chemother 11, 379–93 (1956).
17. Onaran, H. O., Rajagopal, S. & Costa, T. What is biased efficacy? Defining the relationship between intrinsic efficacy and free energy coupling. Trends Pharmacol Sci 35, 639–47 (2014).
18. Brust, T. F., Hayes, M. P., Roman, D. L., Burris, K. D. & Watts, V. J. Bias analyses of preclinical and clinical D2 dopamine ligands: studies with immediate and complex signaling pathways. J Pharmacol Exp Ther 352, 480–93 (2015).
19. Binkowski, B. F. et al. A luminescent biosensor with increased dynamic range for intracellular cAMP. ACS Chem Biol 6, 1193–7 (2011).
20. Strachan, R. T. et al. Divergent transducer-specific molecular efficacies generate biased agonism at a G protein-coupled receptor (GPCR). J Biol Chem 289, 14211–24 (2014).
21. Bastepe, M. et al. Receptor-mediated adenylyl cyclase activation through XLalpha(s), the extra-large variant of the stimulatory G protein alpha-subunit. Mol Endocrinol 16, 1912–9 (2002).
22. Black, J. W. & Leff, P. Operational models of pharmacological agonism. Proc R Soc Lond B Biol Sci 220, 141–62 (1983).
23. Barak, L. S. & Peterson, S. Modeling of bias for the analysis of receptor signaling in biochemical systems. Biochemistry 51, 1114–25 (2012).
24. Furness, S. G. et al. Ligand-Dependent Modulation of G Protein Conformation Alters Drug Efficacy. Cell 167, 739–749.e11 (2016).
25. De Lean, A., Stadel, J. M. & Lefkowitz, R. J. A ternary complex model explains the agonist-specific binding properties of the adenylate cyclase-coupled beta-adrenergic receptor. J Biol Chem 255, 7108–17 (1980).
26. Onaran, H. O. & Costa, T. Where have all the active receptor states gone? Nat Chem Biol 8, 674–7 (2012).
27. Klein Herenbrink, C. et al. The role of kinetic context in apparent biased agonism at GPCRs. Nat Commun 7, 10842 (2016).
28. Casella, I., Ambrosio, C., Gro, M. C., Molinari, P. & Costa, T. Divergent agonist selectivity in activating beta1- and beta2-adrenoceptors for G-protein and arrestin coupling. Biochem J 438, 191–202 (2011).
29. Molinari, P., Casella, I. & Costa, T. Functional complementation of high-efficiency resonance energy transfer: a new tool for the study of protein binding interactions in living cells. Biochem J 409, 251–61 (2008).
30. Molinari, P. et al. Morphine-like opiates selectively antagonize receptor-arrestin interactions. J Biol Chem 285, 12522–35 (2010).
31. Kaya, A. I. et al. Cell contact-dependent functional selectivity of beta2-adrenergic receptor ligands in stimulating cAMP accumulation and extracellular signal-regulated kinase phosphorylation. J Biol Chem 287, 6362–74 (2012).
32. Ambrosio, C. et al. Different structural requirements for the constitutive and the agonist-induced activities of the beta2-adrenergic receptor. J Biol Chem 280, 23464–74 (2005).
33. Bates, D. M. & Watts, D. G. Nonlinear regression analysis and its applications. (John Wiley & Sons, New York, 1988).
34. Kenakin, T. P. & Beek, D. Is prenalterol (H133/80) really a selective beta 1 adrenoceptor agonist? Tissue selectivity resulting from differences in stimulus-response relationships. J Pharmacol Exp Ther 213, 406–13 (1980).
35. Kenakin, T. P. & Beek, D. In vitro studies on the cardiac activity of prenalterol with reference to use in congestive heart failure. J Pharmacol Exp Ther 220, 77–85 (1982).
36. Rajagopal, S. Quantifying biased agonism: understanding the links between affinity and efficacy. Nat Rev Drug Discov 12, 483 (2013).
37. Griffin, M. T., Figueroa, K. W., Liller, S. & Ehlert, F. J. Estimation of agonist activity at G protein-coupled receptors: analysis of M2 muscarinic receptor signaling through Gi/o, Gs, and G15. J Pharmacol Exp Ther 321, 1193–207 (2007).
Acknowledgements
The present study was supported in part by the following grants: Turkish Scientific and Technical Research Council (TUBITAK) 113S909 (to HOO, ÖÜ and EMK); Italian Ministry of Health, grant RF201102351158 (to T.C.). VV is supported by a Telethon fellowship (grant no. GGP13227). SR is supported by NIH K08H114643 and a Burroughs Wellcome Career Award for Medical Scientists.
Author information
Contributions
H.O.O. conceived and developed the new model-free bias computing methods, the rank-ordering statistics and the previously undescribed variants of the model-dependent methods described in the paper. H.O.O., S.R. and T.C. discussed and refined the theoretical basis of bias analysis summarized in the manuscript. C.A., M.C.G. and V.V. generated the transfected cells and performed the experiments (BRET analysis of transducer interactions, GTPγS binding, cAMP assays) at the Italian site. Ö.U., E.M.K. and H.O.O. performed radioreceptor binding assays, BRET assays of receptor–Gi interactions and cAMP assays (GloSensor vs. RIA) at the Turkish site. S.R. provided the assays of AT_{1}R activity on IP1 stimulation and arrestin recruitment at the USA site. All authors contributed to the final writing of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
About this article
Cite this article
Onaran, H., Ambrosio, C., Uğur, Ö. et al. Systematic errors in detecting biased agonism: Analysis of current methods and development of a new modelfree approach. Sci Rep 7, 44247 (2017). https://doi.org/10.1038/srep44247