Network-based prediction of drug combinations

Drug combinations, offering increased therapeutic efficacy and reduced toxicity, play an important role in treating multiple complex diseases. Yet, our ability to identify and validate effective combinations is limited by a combinatorial explosion, driven by both the large number of drug pairs and the number of dosage combinations. Here we propose a network-based methodology to identify clinically efficacious drug combinations for specific diseases. By quantifying the network-based relationship between drug targets and disease proteins in the human protein–protein interactome, we show the existence of six distinct classes of drug–drug–disease combinations. Relying on approved drug combinations for hypertension and cancer, we find that only one of the six classes correlates with therapeutic effects: if the targets of the drugs both hit the disease module, but target separate neighborhoods. This finding allows us to identify and validate antihypertensive combinations, offering a generic, powerful network methodology to identify efficacious combination therapies in drug development.

I have a few concerns that I would invite the authors to address in their revised version:
1. The authors state that one of the key findings of the paper is that Overlapping Exposure, while having no statistically significant efficacy in treating the disease, has statistically significant adverse effects (last paragraph of page 8). But looking at figure 2, it seems to me that the only conclusion that can be drawn is that the separation measure between the drug target modules has no real influence on adverse effects. Only the z-score between each drug and the disease module seems to matter, as adverse effects appear independently of whether the separation measure between the two drugs is negative (figure 2a) or positive (figure 2b) as long as the z-scores for both drugs are smaller than 0. The same is confirmed in supplementary figure 9a,b for other adverse effects.
2. The authors base their key findings and test their results on just one disease, hypertension. While I understand the difficulty of finding the necessary data for other diseases, I am wondering whether one can draw conclusions from just one single disease. I should add here that my personal opinion is that what the authors found for hypertension will indeed generalize to other diseases: the results shown here do make perfect sense in the context of network medicine. However, it seems to me that it would be important that the authors discuss this issue and attempt to say something on how their finding may (or may not) generalize to other diseases. In light of this, I do feel that some statements in the abstract and discussion are a bit too general/strong (e.g. on page 13 "we demonstrated that a network-based methodology that identifies the relative network location of drug-target modules with respect to the disease module can help prioritize potentially efficacious pairwise drug combinations").
3. About the z-score measure: a) the authors write that they used the same procedure as in the Guney et al paper (and the url for the toolbox used for calculations is the same). However, I believe that in the Guney et al paper both sets of nodes (drug targets and disease proteins) were randomly shuffled to generate the reference distribution; while reading the supplementary note 1 of this manuscript, it seems to me that only the disease proteins were shuffled. Is there a reason for this? b) The point above becomes confusing when one reads the second half of page 5 of the manuscript. Here the authors state that: "each drug has only a small number of experimentally reported targets (on average 3, Supplementary Fig. 2). Therefore, the randomization procedure is not producing a Gaussian distribution as described in our previous study, limiting the applicability of the z-score". I am puzzled because, as I mentioned earlier, I believe that in the Guney et al paper both drug targets and disease proteins were randomly shuffled. At this point, I don't understand why, in Supplementary Fig 3, the z-score cannot discriminate FDA-approved pairwise combinations or clinically reported adverse drug interactions from random drug pairs.
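The question raised in point 3 (which node sets are randomized to build the reference distribution) can be made concrete with a minimal sketch. It is not the authors' toolbox: the function names (`bfs_distances`, `closest_distance`, `proximity_z`) and the `shuffle_targets` switch are hypothetical, the network is assumed to be connected, and random node sets are drawn uniformly rather than with the degree-preserving selection a real analysis would likely need.

```python
import random
import statistics
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def closest_distance(adj, targets, disease):
    """Mean, over drug targets, of the distance to the nearest disease protein.

    The measure is not symmetric: swapping the two sets averages over
    disease proteins instead of drug targets.
    """
    total = 0
    for t in targets:
        dist = bfs_distances(adj, t)
        total += min(dist[d] for d in disease if d in dist)
    return total / len(targets)


def proximity_z(adj, targets, disease, n_random=1000, shuffle_targets=True, seed=0):
    """z-score of the observed closest distance against random node sets.

    shuffle_targets=True randomizes both sets; shuffle_targets=False keeps
    the drug targets fixed and randomizes only the disease proteins.
    Assumption: uniform sampling, no degree matching.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    observed = closest_distance(adj, targets, disease)
    reference = []
    for _ in range(n_random):
        rand_disease = rng.sample(nodes, len(disease))
        rand_targets = rng.sample(nodes, len(targets)) if shuffle_targets else list(targets)
        reference.append(closest_distance(adj, rand_targets, rand_disease))
    mu = statistics.mean(reference)
    sigma = statistics.pstdev(reference)
    return (observed - mu) / sigma


# Toy network (hypothetical): a path of 10 nodes, 0 - 1 - ... - 9.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
z_both = proximity_z(adj, [0, 1], [1, 2], n_random=200)
z_disease_only = proximity_z(adj, [0, 1], [1, 2], n_random=200, shuffle_targets=False)
```

Comparing `z_both` with `z_disease_only` on the real interactome would show how much the choice of randomization changes the reference distribution when a drug has only a few targets.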
1) The first part of the result section on page 4 is a bit difficult to follow, and I believe it should be slightly expanded. Also, I note that the z-score is not symmetric (swapping the set of drug targets and the set of disease proteins would give a different result), and while it is clear that the order used by the authors is the only one that makes sense, it would help the reader if they could add a sentence explaining their choice.
2) The three adverse interactions presented in Supplementary figure 9 (arrhythmia, heart failure and myocardial infarction) appear rather abruptly on page 9; it would be easier for the reader if they had been introduced earlier.
3) There has been some recent work in predicting adverse effects for pairs of drugs. While I understand that the problem addressed in this manuscript is different, as diseases are included in the analysis, I believe it would help the reader to explain the relation with other network-based DDI prediction methods, e.g. Cami et al., PLoS ONE, 2013.
4) Important issues in drug combinations which are not discussed in the paper are dosage and combinations of more than 2 drugs. It would be interesting if the authors could say a few words on how (and if) this methodology could be extended to include these two aspects of the problem.
5) In figure 1d, it seems that two drugs with very negative separation score would have very low chemical similarity, which is counterintuitive. A similar observation can be made for cellular component similarity. Could the authors comment on this?
6) In figure 2a, I believe that the number of drug pairs for the real case should be 1 instead of 0, since in figure 3 there is one circle in a blue square (indicating P1).
7) I found the legend of Supplementary figure 4d difficult to follow.
Reviewer #2 (Remarks to the Author): In this manuscript the authors apply network-based approaches for the prediction of synergistic drug combinations, with a particular focus on hypertension. They determine that for successful drug synergism the targets of the two drugs must be separate from each other, and both must overlap with the disease module in a concept termed 'complementary exposure'. In contrast, the other five possible relationships between drug targets and disease modules are not enriched for synergistic combinations, and in some instances are instead enriched for antagonistic ones. Overall, the proposed concept is intriguing and will be of wide interest to the community. However, before publication, some mechanistic validation of the concept is desirable.
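The six relationships summarized above can be written as a small decision rule on the drug-drug separation and the two drug-disease z-scores. This is a hedged sketch rather than the authors' code. The thresholds at 0 follow Reviewer #1's description of Overlapping Exposure and Complementary Exposure. The assignment of the other four class names to the remaining sign patterns is an assumption, "Single Exposure" is a label that does not appear in these reports, and the function name `classify_pair` is hypothetical.

```python
def classify_pair(s_ab, z_a, z_b):
    """Assign a drug-drug-disease triple to one of six classes.

    s_ab < 0 : the two drug-target modules overlap in the network.
    z < 0    : a drug-target module overlaps with the disease module.
    Only the first two classes are confirmed by the reports; the mapping
    of the other four names is an assumption.
    """
    drugs_overlap = s_ab < 0
    n_exposed = (z_a < 0) + (z_b < 0)  # how many drugs hit the disease module
    if n_exposed == 2:
        return "Overlapping Exposure" if drugs_overlap else "Complementary Exposure"
    if n_exposed == 1:
        return "Indirect Exposure" if drugs_overlap else "Single Exposure"
    return "Non-exposure" if drugs_overlap else "Independent Action"


# Hypothetical values: separated drug modules, both overlapping the disease module.
label = classify_pair(0.4, -2.1, -1.3)
```

Written this way, the sensitivity raised in the main points below is easy to see: a small change in `s_ab` around 0, such as one newly reported shared target might cause, moves a pair between classes.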
Main points:
-The authors have selected only drugs with at least two targets. How many drugs were excluded because they have only a single reported target? Can the method be repurposed to include these drugs?
-More problematically, we know that all drugs engage multiple cellular targets, and these interactions often remain elusive from biochemical studies and many such interactions may thus have simply not yet been reported. On the other hand, drug-target databases also contain irrelevant connections that only occur at supra-physiological concentrations. Furthermore, a binary drug-target concept does not take into account the dose dependence of interactions and different interaction strengths. It appears that detection of a single shared target would change the relationship from "complementary exposure" to "overlapping exposure". Would a more quantitative measure of both drug-target relationships and network overlap be more appropriate? Can the method be repurposed to detect novel target modules for drugs?
-For several categories, e.g. "non-exposure" and "independent action", there appear to be very few drugs in the respective categories (less than 0.5 drug pairs in the random set). At least it should be discussed whether the analysis is sufficiently powered in these categories. Also, why are the numbers of drug pairs in the random category so different in the hypertension combinations and adverse drug interactions high blood pressure sets (Fig. 2)? Different drugs in the two categories? Randomization of targets not drugs? It should be possible to study the same combined drug set in both categories.
-Most drug-drug interactions, particularly those known from clinical data, occur through modulation of drug metabolism, thus altering drug plasma levels. It is unclear whether the same is true for the drug-drug interactions that were used for validating the different network models. If, as I expect, the majority of synergisms and antagonisms between these drugs occur by modulation of ADME properties, e.g. via cytochrome P450 enzymes, then how would the canonical targets of the anti-hypertensive disease module be of relevance? Or are these drug metabolic enzymes also part of the target and disease networks and major drivers of the predictions?
-While the authors show that the "overlapping exposure" category is associated with higher incidence of adverse combinations also in other diseases, no such validation is attempted for the positive effects of combining compounds in the "complementary exposure" category. It would be desirable to add such data, which could also function as a validation set following the use of hypertension data as a training set to develop the method.
Minor points:
-The authors find that FDA-approved drug combinations have a lower target network distance compared to random combinations. Possible reasons for this effect should be added to the discussion. This most likely is caused by selection bias, in that combinations of drugs that target related proteins in the same disease module are more likely to be tested in clinical combination trials. To detect this bias, the authors should compare approved drug combinations to all combinations entering clinical trials.
-The authors start with a long introduction on the z-score as distance measure, but then use s-scores instead. I propose they either consistently compare z-scores and s-scores throughout (e.g. in Fig d-j) or rewrite the beginning of the results section with less focus on z-scores.
-Why are there differences between the blue columns "Drug Combinations Hypertension" between Fig. 2 and Fig S10 specifically for the "Complementary exposure" and "indirect exposure" categories? These panels appear flipped.

REVIEWERS' COMMENTS:
Reviewer #1 (Remarks to the Author): The authors have done a great amount of work addressing our comments. Most of my concerns have been addressed. Only one of the points that I raised earlier is still unclear to me, and I would invite the authors to address it.
As I had mentioned in my earlier review, the authors state that one of the key findings of the paper is that Overlapping Exposure, while having no statistically significant efficacy in treating the disease, has statistically significant adverse effects. The new Figure 2 in the main paper seems to support this claim. However, in my opinion, this claim is not supported by Figure 11 in the supplementary material (which was Figure 2 of the main paper in the earlier version of the manuscript). As I had mentioned earlier, in my opinion, the only conclusion that can be drawn from Figure 11 in SM is that the separation measure between the drug target modules has no real influence on adverse effects. Only the z-scores between each drug and the disease module seem to matter, as adverse effects appear independently of whether the separation measure between the two drugs is negative (figure 11a) or positive (figure 11b) as long as the z-scores for both drugs are smaller than 0. This is somewhat acknowledged in the discussion, where the authors write that << Altogether, adverse effects can appear independently from the separation of the two drug target modules, occurring significantly in both Overlapping Exposure (Fig. 2a) and Complementary Exposure (Fig. 2b).>> However, I notice that for the "adverse drug interaction" column, the P value for Complementary Exposure (Fig. 2b) is not significant (while it is significant in Figure 11b  I would probably invite the authors to add a subsection in the Methods section (or in SM) where they clearly explain the different randomization procedures used in the paper and, for each, which datasets they used.
Also related to this: the legend of Figure 2 was not clear to me: << We randomly selected the same number of adverse drug-drug interactions on high-blood pressure from 1,512 clinically reported adverse interactions corresponding to the number of antihypertensive combinations using a bootstrapping algorithm in R software and this process was repeated 100 times. >> Also, the same legend refers to Supplementary Table 6, which contains the "top 30 network-predicted combinations for hydrochlorothiazide in treatment of hypertension". It is not clear to me how this data originated and was used.
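One possible reading of the quoted legend is sketched below, only to make the ambiguity concrete. It assumes that each of the 100 repeats draws, with replacement, as many adverse interactions as there are antihypertensive combinations, and then counts how many fall in a given class. The function name, the argument names and the use of Python instead of R are illustrative assumptions; the legend does not say whether sampling was with or without replacement.

```python
import random
import statistics


def bootstrap_class_counts(pool_labels, sample_size, target_label, n_repeats=100, seed=0):
    """Resample class labels and summarize the count of one class.

    pool_labels  : one class label per adverse interaction in the pool
                   (1,512 entries in the quoted legend).
    sample_size  : number drawn per repeat (the number of approved
                   antihypertensive combinations, under this reading).
    Returns the mean and sample standard deviation of the per-repeat count.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_repeats):
        sample = rng.choices(pool_labels, k=sample_size)  # with replacement
        counts.append(sum(1 for label in sample if label == target_label))
    return statistics.mean(counts), statistics.stdev(counts)
```

Under this reading, the mean and spread across the repeats would make up the "adverse drug interaction" column; a Methods subsection stating the actual procedure would settle whether this is what was done.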
OTHER POINTS:
1) Supplementary table 3 (referred to in page 8) contains data about Hypertension, but the datasets for cancer are missing.
2) As I mentioned in my earlier review, the first part of the result section is a bit difficult to follow, and I believe it should be slightly expanded. Also, I notice that the z-score is not symmetric (swapping the set of drug targets and the set of disease proteins would give a different result), and while it is clear to me that the order used by the authors is the only one that makes sense, it would help the reader if they could add a sentence explaining their choice. In point 5 of their rebuttal, the authors wrote that they have extended this explanation in line with my recommendation, but I could not find it.
3) Having now looked at cancer drug combinations, it would have been interesting to look at the statistical significance for the adverse drug reactions for cancer, as it was done for hypertension. I understand that this could be complicated by the fact that anti-cancer drugs tend to produce a myriad of adverse effects, but I would suggest that the authors at least discuss this point in the Discussion section.
4) I think it would be interesting to include the explanation provided in point 9 of the authors' rebuttal in the supplementary material.
Reviewer #2 (Remarks to the Author): In the revised version, the authors have now expanded the concept of network-based combination prediction beyond hypertension and show that also for cancer drug combinations the "complementary exposure" mode correlates with increased numbers of approved combinations. Thereby they provide an important second disease example suggesting the future exploration of the general applicability of the concept. The majority of my points have now been addressed, and I support publication of the manuscript.