BIGL: Biochemically Intuitive Generalized Loewe null model for prediction of the expected combined effect compatible with partial agonism and antagonism

Clinical efficacy regularly requires the combination of drugs. For an early estimation of the clinical value of (potentially many) combinations of pharmacologic compounds during discovery, the observed combination effect is typically compared to that expected under a null model. Mechanistic accuracy of that null model is not aspired to; to the contrary, combinations that deviate favorably from the model (and thereby disprove its accuracy) are prioritized. Arguably the most popular null model is the Loewe Additivity model, which conceptually maps any assay under study to a (virtual) single-step enzymatic reaction. It is easy-to-interpret and requires no other information than the concentration-response curves of the individual compounds. However, the original Loewe model cannot accommodate concentration-response curves with different maximal responses and, by consequence, combinations of an agonist with a partial or inverse agonist. We propose an extension, named Biochemically Intuitive Generalized Loewe (BIGL), that can address different maximal responses, while preserving the biochemical underpinning and interpretability of the original Loewe model. In addition, we formulate statistical tests for detecting synergy and antagonism, which allow for detecting statistically significant greater/lesser observed combined effects than expected from the null model. Finally, we demonstrate the novel method through application to several publicly available datasets.


Supplementary information BIGL paper 1 Supplementary methods
Two statistical tests were designed to measure whether a combination of different compounds results in a synergistic or antagonistic effect. The first test, referred to as MeanR, tests the overall fit of the data to the Generalized Loewe additivity model (further referred to as the null model). If the assay readout values deviate too strongly from this null model, then one may identify synergy or antagonism. The size of the deviation that is allowed under the null model depends on the biological variability present in the data.
A second test, MaxR, allows the identification of combinations of concentrations where synergy or antagonism is present by comparing the absolute deviation between observed and predicted readout. Both tests are tests of the null hypothesis that the null model holds true.
The following sections describe the derivation of the test statistics MeanR and MaxR, as well as their null distributions required for p-value calculations.

Derivation of the MeanR test statistic
The null hypothesis states that the Generalized Loewe additivity model holds true. The parameters in this model come from the Hill equations for the mono-therapies. Hence, only the observations from the mono-therapies are needed to fit the model. The compound concentrations for the mono-therapies are referred to as the on-axis points. The residual variance is denoted by . Based on the fit an estimate of the residual variance is computed as the mean squared error, which is denoted by . The fitted model is further used to predict the assay readouts at the concentrations of the compound combinations, which are referred to as off-axis points.
We use the index to refer to an off-axis point. We first describe the method for the setting where no replicate measurements are available at the n 1 off-axis points. Let denote readout at off-axis point , and let ^ denote the predicted readout at off-axis point according to the null model. We assume that all readouts are independently distributed. Under the null hypothesis the off-axis observations come from the same null model, with possibly a different residual variance, say . Our test is based on the residuals ^ .
In matrix notation we write ^ (i.e. vectors constructed by stacking , and ^ ). The covariance matrix of is then given by where is the covariance matrix of the predictions.
For a linear model, can be easily computed from the design matrices of the data used to fit the null model ( ) and of the off-axis points of the predictions ( ): , but for a nonlinear model has no simple expression. We propose to approximate by means of a bootstrap procedure (see further).
Under the assumption of normality, so far we have ∼ 0, .
Hence, a meaningful test statistic is the quadratic form .
However, the residual variances and are unknown. With and their corresponding consistent estimators, the test statistic

MSE MSE
is asymptotically distributed as under the null hypothesis, but the convergence may be slow.
An exact null distribution may be obtained by making the additional assumption that . This common variance is denoted by . We then get ∼ 0, , and the quadratic form becomes

∼ .
Suppose we have an estimator of , say (details follow later), then it is often possible to show (for a linear model, and under the assumption of normality of the model error terms)

∼ ,
for an appropriate . The test statistic may now be written as in which we recognize in the numerator a statistic with distribution / , and in the denominator a statistic with distribution / . If numerator and denominator are independent, then by definition this ratio is distributed as , . Hence we write Thus, if we find an estimator of for which (A) / ∼ and (B) which is independent of , we can use as a test statistic with null distribution , . In this paper we propose to use as the estimator; under the normality assumption (A) holds with (residual degrees of freedom from the fit of the null model using only the on-axis points), and since only makes use of the mono therapy data, it is definitely independent of the residuals that are calculated from the offaxis data. Thus the final form of the test statistic is given by Thus, if the residuals are normally distributed and if the residual variances at the off-axis and on-axis (mono-therapies) points are equal, it can be shown that has approximately an , null distribution. Where equals the degrees of freedom of MSE .
If no normality can be assumed, then the null distribution can be obtained through the parametric bootstrap (see further).

MeanR in the presence of replicates
The previous definition holds when no replicates are included in the study. When replicates are available at the off-axis points, we continue to work with the average residuals at the off-axis points. In particular, let denote the average readout of the replicates at off-axis point . We then define the (average) residual at off-axis point as

^ .
Let denote the vector with all the stacked. Then, under the null hypothesis of no lack-of-fit and assuming constant variance, with a diagonal matrix with at the th position 1/ .
The test statistic and its null distribution can now be obtained in a similar way as before. In particular, If no replicates are available, the matrix becomes the identity matrix and the method reduces to the one described in the previous section.
The matrix will be estimated using the bootstrap method, and to avoid relying on the normality assumption a parametric bootstrap method will be used for p-value calculation.

Bootstrap method to estimate the matrix
Since the BIGL null model is highly non-linear, we propose a parametric bootstrap procedure to approximate the matrix: 1. pool all residuals of the on-axis and off-axis points 2. for each on-axis point i, sample (with replacement) m i residuals completely at random from the pooled sample (step 1) 3. add residuals to corresponding predicted readout from fitted Hill equations (mono-therapies); this generates a bootstrap sample of readouts at the on-axis points 4. refit the Hill equations to the bootstrap sample data 5. use the fit from step 4 to generate predictions at the off-axis points 6. repeat steps 2-5 many times (say at least 100 times) and compute the covariance matrix between the off-axis predictions. This gives an approximation of the matrix.

Bootstrap method to estimate the null distribution of
If no normality can be assumed, then the null distribution of the test statistic can be obtained through a parametric bootstrap procedure. This procedure is very similar to the bootstrap procedure outlined in the previous section. In particular, steps 1-5 are identical, but the remaining steps are now: 6. Sample completely at random m i residuals from the pooled sample of residuals (step 1) for each off-axis point i. Add these sampled residuals to the predicted readout. This forms a bootstrap sample of readouts at the off-axis points (under the null hypothesis of additivity) 7. Compute the test statistic based on the bootstrap sample of step 6. 8. Repeat steps 2-7 many times. The set of test statistics computed in step 7 forms the bootstrap null distribution of the test statistic and can be used for p-value calculations.
Note that in the calculation of the test statistic (step 7) the matrix C p is required. Hence, the final procedure is a nested bootstrap method in which the bootstrap procedure of section 1.1.2 is included in step 7. However, from experience (simulation studies) we have learnt that the matrix C p is quite stable. Therefore, as an approximation, we suggest to compute C p only once and use this single C p approximation throughout the bootstrap procedure as outlined in this section.

Derivation of the MaxR test statistic
We start from the residuals defined for MeanR:

^ .
Let denote the vector with all the stacked. Then, assuming normality, equal residual variance at off and on-axis points, and under the null hypothesis of no lack-of-fit,  where , … , i. i. d .
If the null hypothesis is rejected at the alpha-level of significance with the test statistic, one can identify the off-axis points that were responsible for the rejection (i.e. the off-axis points at which the corresponding element in / / MSE exceeds the critical value of the test). These points correspond to the compounds for which the readout is different from what is expected under additivity. This procedure basically performs simultaneous hypothesis tests at the individual off-axis points. The procedure controls the familywise error rate (FWER) at the nominal significance level of the test.

Supplementary results
Monte-Carlo simulations were conducted to assess whether the MeanR method is able to control the type I error at a nominal level of 5%. Furthermore, we have assessed whether the MeanR method could truly detect synergy and whether the MaxR method is able to select the true synergistic off-axis points.
For each scenario, 3000 Monte-Carlo simulations were performed. All tests were conducted at the nominal 5% significance level. All p-values were computed with the parametric bootstrap (100 bootstrap runs).

Simulations under additivity
In each Monte-Carlo run, data were generated with parameters estimated from a real dataset, using the Generalized Loewe additivity model. Data of multiple replicates of on-and off-axis points were available in the dataset. The matrix is computed based on 50 bootstrap runs.
The results are presented in Figure S1. From the results we conclude that the test statistic follows indeed approximately an F-distribution and that the null distribution of the p-values is approximately uniform. The type I error rate is 6% which is close to the nominal level of 5%.

Simulations under synergy
To verify whether the method is indeed able to detect synergy, we have simulated data with a synergistic effect at three dose combinations. Data were first simulated under the null hypothesis using the estimated parameters obtained from the Loewe model. Afterwards, a small effect of 0.075 was subtracted from the predicted effect under additivity. This process was repeated 3000 times and the bootstrap p-values of each test were saved. Figure S2 shows that the MeanR method is indeed able to detect synergy at the -level of 5%. For the chosen effect of 0.075 we found that a synergy call was made in 63.9% of the Monte-Carlo runs (i.e. power of test equals 63.9% in this scenario). The power of the test increases with increasing effect size.

Shiny application
• An interactive shiny application (https://bigl.openanalytics.eu) visualizes the predicted response surface for a given set of marginal model parameters. Data points are simulated according to the BIGL null model from the provided marginal parameters. Marginal parameters are then reestimated from the data and the response surface is subsequently computed and depicted as a 3-dimensional plot. Detailed predicted response decomposition in the BIGL model is available within the application as well. This interactive web application comes with 3 pre-defined examples of compound pairs, namely two agonists, an agonist and a partial agonist as well as an agonist and an antagonist ( Figure S3).
Additionally one can use an alternative null model, being either the classical Loewe or the highest single agent (HSA) model. In the classical Loewe, both maximal responses of the marginals are restricted to be the same and the surface is computed. Expected response for the HSA, on the other hand, is constructed simply by taking either the minimum (if dose response curves are decreasing) or the maximum (if dose response curves are increasing) at a particular dose combination.

Figure S3. Example BIGL null surface for Agonist-Inverse Agonist combination as obtained by R Shiny app.
For additional interpretation of the BIGL null response surface, in the R shiny web application a corresponding two-dimensional visualization (isobologram using a colour scale) is made. As an example, in Figure S4 a dose response combination of a partial agonist and a neutral antagonist has been simulated using the BIGL model from the marginal dose response curves as shown below (doses are in linear scale). Figure S4. Example for partial Agonist-neutral Antagonist combination as obtained by R Shiny app. a. dose response curve partial Agonist b. dose response curve neutral Antagonist c. expected BIGL null response surface d. isobologram of the null model.