Network medicine framework shows that proximity of polyphenol targets and disease proteins predicts therapeutic effects of polyphenols


Polyphenols, natural products present in plant-based foods, play a protective role against several complex diseases through their antioxidant activity and by diverse molecular mechanisms. Here we develop a network medicine framework to uncover mechanisms for the effects of polyphenols on health by considering the molecular interactions between polyphenol protein targets and proteins associated with diseases. We find that the protein targets of polyphenols cluster in specific neighbourhoods of the human interactome, whose network proximity to disease proteins is predictive of the molecule’s known therapeutic effects. The methodology recovers known associations, such as the effect of epigallocatechin-3-O-gallate on type 2 diabetes, and predicts that rosmarinic acid has a direct impact on platelet function, representing a novel mechanism through which it could affect cardiovascular health. We experimentally confirm that rosmarinic acid inhibits platelet aggregation and α-granule secretion through inhibition of protein tyrosine phosphorylation, offering direct support for the predicted molecular mechanism. Our framework represents a starting point for mechanistic interpretation of the health effects underlying food-related compounds, allowing us to integrate into a predictive framework knowledge on food metabolism, bioavailability and drug interaction.


Diet plays a defining role in human health. Indeed, while poor diet can substantially increase the risk for coronary heart disease and type 2 diabetes mellitus (T2D), a healthy diet can play a protective role, even mitigating genetic risk for coronary heart disease1. Polyphenols are a class of compounds present in plant-based foods, including fruits, vegetables, nuts, seeds, beans (for example coffee and cocoa), herbs, spices, tea and wine, that play a well-documented protective role as antioxidants, which affect several diseases, from cancer to T2D, cardiovascular and neurodegenerative diseases2,3. Previous efforts have profiled over 500 polyphenols in more than 400 foods4,5 and have documented the high diversity of polyphenols to which humans are exposed through their diet, ranging from flavonoids to phenolic acids, lignans and stilbenes.

The underlying molecular mechanisms through which specific polyphenols exert their beneficial effects on human health remain largely unexplored. From a mechanistic perspective, dietary polyphenols are not engaged in endogenous metabolic processes of anabolism and catabolism, but rather affect human health through their anti- or pro-oxidant activity6 by binding to proteins and modulating their activity7,8, interacting with digestive enzymes9 and modulating gut microbiota growth10,11. Yet the variety of experimental settings and the limited scope of studies that explore the molecular effects of polyphenols have, to date, offered a range of often conflicting evidence. For example, two clinical trials, both limited in terms of the number of subjects and the intervention periods, resulted in conflicting conclusions about the beneficial effects of resveratrol on glycemic control in T2D patients12,13. We therefore need a framework to interpret the evidence present in the literature and to offer in-depth mechanistic predictions of the molecular pathways responsible for the health implications of polyphenols present in the diet. Ultimately, these insights could help us provide evidence on causal diet–health associations as well as guidelines of food consumption for different individuals and help to develop novel diagnostic and therapeutic strategies that could lead to the synthesis of novel drugs.

Here, we address this challenge by developing a network medicine framework to capture the molecular interactions between polyphenols and their cellular binding targets, unveiling their relationship to complex diseases. The developed framework is based on the human interactome, a comprehensive subcellular network consisting of all known physical interactions between human proteins that has been validated previously as a platform for understanding disease mechanisms14,15, rational drug target identification and drug repurposing16,17.

We find that the proteins to which polyphenols bind form identifiable neighbourhoods in the human interactome, allowing us to demonstrate that the proximity between polyphenol targets and proteins associated with specific diseases is predictive of the known therapeutic effects of polyphenols. Finally, we unveil the potential therapeutic effects of rosmarinic acid (RA) on vascular diseases (VD), predicting that its mechanism of action is related to modulation of platelet function. We confirm this prediction by experiments that indicate that RA modulates platelet function in vitro by inhibiting tyrosine protein phosphorylation. Altogether, our results demonstrate that the network-based relationship between disease proteins and polyphenol targets offers a tool to systematically unveil the health effects of polyphenols.


Polyphenol targets cluster in specific functional neighbourhoods of the interactome

We mapped the targets of 65 polyphenols (Methods) to the human interactome, consisting of 17,651 proteins and 351,393 interactions (Fig. 1a,b). We find that 19 of the 65 polyphenols have only one protein target, while a few polyphenols have an exceptionally large number of targets (Fig. 1c). We computed the Jaccard index (JI) of the protein targets of each polyphenol pair, finding only a limited similarity of targets among different polyphenols (average JI = 0.0206) (Supplementary Fig. 1a). Even though the average JI is small, it is still significantly higher (Z = 147, Supplementary Fig. 1b) than the JI expected if the polyphenol targets were randomly assigned from the pool of all network proteins with degrees matching the original set. This finding suggests that while each polyphenol targets a specific set of proteins, their targets are confined to a common pool of proteins, likely determined by commonalities in the polyphenol-binding domains of the three-dimensional structure of the protein targets18. Gene ontology enrichment analysis recovers existing mechanisms8 and also helps identify new processes related to polyphenol protein targets, such as post-translational protein modifications, regulation and xenobiotic metabolism (Fig. 1d). The enriched gene ontology categories indicate that polyphenols modulate common regulatory processes, but the low similarity in their protein targets, illustrated by the low average JI, indicates that they target different proteins within the same process.
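The pairwise target similarity described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy target sets (`polyphenol_A`, `P1` and so on) are hypothetical placeholders, and the degree-matched null model used for the Z-score is not reproduced here.

```python
from itertools import combinations


def jaccard_index(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two protein target sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_pairwise_jaccard(targets_by_polyphenol):
    """Mean Jaccard index over all unordered pairs of polyphenols."""
    pairs = list(combinations(sorted(targets_by_polyphenol), 2))
    if not pairs:
        return 0.0
    total = sum(
        jaccard_index(targets_by_polyphenol[p], targets_by_polyphenol[q])
        for p, q in pairs
    )
    return total / len(pairs)


# Hypothetical toy data: names are placeholders, not real targets.
toy_targets = {
    "polyphenol_A": {"P1", "P2", "P3"},
    "polyphenol_B": {"P3", "P4"},
    "polyphenol_C": {"P5"},
}
average_ji = average_pairwise_jaccard(toy_targets)
```

In the toy data only one of the three pairs shares a target, so the average is pulled towards zero, mirroring the situation in which most polyphenol pairs have disjoint target sets.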

Fig. 1: Properties of polyphenol protein targets.

a, Schematic representation of the human interactome, highlighting regions where polyphenol targets and disease proteins are localized. b, Diagram showing the selection criteria for the polyphenols evaluated in this study. c, Distribution of the number of polyphenol protein targets mapped to the human interactome. d, Top (n = 15) enriched gene ontology terms (Biological Process) among all polyphenol protein targets. The x axis shows the proportion of targets mapped to each pathway. e, Size of the LCC formed by the targets of each polyphenol in the interactome and the corresponding significance (Z-score).

We next asked whether the polyphenol targets cluster in specific regions of the human interactome. We focused on polyphenols with more than two targets (n = 46, Fig. 2) and measured the size and significance of the largest connected component (LCC) formed by the targets of each polyphenol. We found that 25 of the 46 polyphenols have a larger LCC than expected by chance (Z-score > 1.95) (Fig. 1e and Fig. 2). In agreement with experimental evidence documenting the effect of polyphenols on multiple pathways19, we find that ten polyphenols have their targets organized in multiple connected components of size > 2.
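One way to compute the LCC size and its Z-score is sketched below. It is an illustration under stated assumptions: the interactome is represented as a plain dictionary mapping each protein to the set of its neighbours, the null model draws random proteins with exactly the same degree as each target (published pipelines often bin similar degrees together instead), and the 12-node ring is a hypothetical toy network.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def lcc_size(adj, nodes):
    """Size of the largest connected component induced by `nodes`."""
    nodes = set(nodes) & set(adj)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in nodes and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def lcc_z_score(adj, targets, n_random=1000, seed=0):
    """Observed LCC size and its Z-score against degree-matched random sets."""
    rng = random.Random(seed)
    targets = sorted(set(targets) & set(adj))
    by_degree = defaultdict(list)
    for node, nbrs in adj.items():
        by_degree[len(nbrs)].append(node)
    observed = lcc_size(adj, targets)
    null = []
    for _ in range(n_random):
        picked = set()
        for t in targets:
            # Assumption: exact degree matching, sampled without replacement.
            pool = [c for c in by_degree[len(adj[t])] if c not in picked]
            picked.add(rng.choice(pool))
        null.append(lcc_size(adj, picked))
    mu, sigma = mean(null), pstdev(null)
    z = (observed - mu) / sigma if sigma > 0 else float("nan")
    return observed, z


# Hypothetical toy network: a ring of 12 proteins, each with degree 2.
ring = {i: {(i - 1) % 12, (i + 1) % 12} for i in range(12)}
observed, z = lcc_z_score(ring, [0, 1, 2])
```

Three adjacent proteins on the ring form a component of size 3, which random degree-matched triples rarely achieve, so the Z-score is positive.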

Fig. 2: Protein–protein interactions of polyphenol targets.

The 23 polyphenols whose targets form connected components in the interactome and their respective subgraphs. For example, piceatannol targets form a unique connected component of 23 proteins, while quercetin targets form multiple connected components, the largest having 140 proteins. Polyphenol targets that are not connected to any other target are not shown in the figure. Colours distinguish connected components of different polyphenols.

These results indicate that the targets of polyphenols modulate specific well-localized neighbourhoods of the interactome (Fig. 2 and Supplementary Fig. 1c). This prompted us to explore whether the interactome regions targeted by the polyphenols reside within network neighbourhoods associated with specific diseases, thereby seeking a network-based framework to unveil the molecular mechanisms through which specific polyphenols modulate health.

Proximity between polyphenol targets and disease proteins reveals their therapeutic effects

Polyphenols can be viewed as drugs in that they bind to specific proteins, affecting their ability to perform their normal functions. We therefore hypothesized that we can apply the network-based framework used to predict the efficacy of drugs in specific diseases16,17 to also predict the therapeutic effects of polyphenols. The closer the targets of a polyphenol are to disease proteins, the more likely that the polyphenol will affect the disease phenotype. We therefore calculated the network proximity between polyphenol targets and proteins associated with 299 diseases using the closest measure, dc, representing the average shortest path length between each polyphenol target and the nearest disease protein (Methods). Consider, for example, (−)-epigallocatechin-3-O-gallate (EGCG), a polyphenol abundant in green tea. Epidemiological studies have found a positive relationship between green tea consumption and reduced risk of T2D20,21, and physiological and biochemical studies have shown that EGCG presents glucose-lowering effects in both in vitro and in vivo models22,23. We identified 54 experimentally validated EGCG protein targets and mapped them to the interactome, finding that the EGCG targets form an LCC of 17 proteins (Z-score = 7.61) (Fig. 3a). We also computed the network-based distance between EGCG targets and 83 proteins associated with T2D, finding that the two sets are significantly proximal to each other. We ranked all 299 diseases based on their network proximity to the EGCG targets to determine whether we could recover the 82 diseases in which EGCG has known therapeutic effects according to the comparative toxicogenomics database (CTD)24. By this analysis, we were able to recover 15 previously known therapeutic associations among the top 20 ranked diseases (Table 1), confirming that network proximity can discriminate between known and unknown disease associations for polyphenols, as previously confirmed for drugs16,17.
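The closest measure can be illustrated with a short breadth-first-search sketch. It follows the verbal definition given above (for each polyphenol target, take the shortest path length to the nearest disease protein, then average over targets). The adjacency-dictionary representation, the function names and the five-protein path are assumptions made for illustration only.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path lengths from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closest_distance(adj, targets, disease_proteins):
    """d_c: mean over targets of the distance to the nearest disease protein."""
    disease = set(disease_proteins) & set(adj)
    per_target = []
    for t in set(targets) & set(adj):
        dist = bfs_distances(adj, t)
        reachable = [dist[d] for d in disease if d in dist]
        if reachable:  # targets that reach no disease protein are skipped
            per_target.append(min(reachable))
    if not per_target:
        raise ValueError("no target can reach any disease protein")
    return sum(per_target) / len(per_target)


# Hypothetical toy network: a path A - B - C - D - E.
path = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
d_c = closest_distance(path, targets={"A", "C"}, disease_proteins={"D", "E"})
```

In the toy path, target A is three steps from the nearest disease protein (D) and target C is one step away, so d_c is their mean. A target that is itself a disease protein contributes a distance of zero.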

Fig. 3: Proximity between polyphenol targets and disease proteins is predictive of the therapeutic effects of the polyphenol.

a, Interactome neighbourhood showing the EGCG protein targets and their interactions with T2D-associated proteins. b, Distribution of AUC values of the predictions of therapeutic effects for 65 polyphenols. c, Comparison of the EGCG–disease associations for the CTD and the in-house database derived from the manual curation of the literature. d, Comparison of the prediction performance of the known EGCG–disease associations from the CTD, the in-house manually curated database or the combined datasets. CI, confidence interval; TPR, true positive rate, FPR, false positive rate; dashed line, random expectation.

Table 1 Top 20 predicted therapeutic associations between EGCG and human diseases

We expanded these methods to all polyphenol–disease pairs to predict diseases for which specific polyphenols might have therapeutic effects. For this analysis, we grouped all 19,435 polyphenol–disease associations between 65 polyphenols and 299 diseases into known (1,525) and unknown (17,910) associations. The known polyphenol–disease set was retrieved from the CTD, which is limited to manually curated associations for which there is literature-based evidence. For each polyphenol, we tested how well network proximity discriminates between the known and unknown sets by evaluating the area under the curve (AUC) of the receiver operating characteristic curve. For EGCG, network proximity offers good discriminative power (AUC = 0.78, CI = 0.70–0.86) between diseases with known and unknown therapeutic associations (Table 1). We find that network proximity (dc) offers predictive power with an AUC > 0.7 for 31 polyphenols (Fig. 3b). The methodology recovers many associations well-documented in the literature, such as the beneficial effects of umbelliferone on colorectal neoplasms25,26. In Table 2, we summarize the top 10 polyphenols for which the network medicine framework offers the best predictive power of therapeutic effects, limiting the entries to those with predictive performance of AUC > 0.6 and where the precision of the performance of the top predictions is greater than 0.6. Given the lack of data on true negative examples, we considered unknown associations as negative cases, observing the same trend when we used an alternative performance metric that does not require true negative labels (that is, AUC of the Precision–Recall curve) (Supplementary Fig. 2).
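The AUC used here has a simple rank interpretation: it is the probability that a randomly chosen disease with a known therapeutic association lies closer to the polyphenol targets than a randomly chosen disease without one. The sketch below computes it directly from that definition, with ties counted as one half. The function name, disease labels and distance values are hypothetical, and confidence intervals are not computed.

```python
def proximity_auc(distance_by_disease, known_diseases):
    """AUC for ranking diseases by proximity (smaller distance = stronger prediction).

    Unknown associations are treated as negatives, as in the text.
    """
    known = [d for name, d in distance_by_disease.items() if name in known_diseases]
    unknown = [d for name, d in distance_by_disease.items() if name not in known_diseases]
    if not known or not unknown:
        raise ValueError("need at least one known and one unknown disease")
    wins = 0.0
    for k in known:
        for u in unknown:
            if k < u:
                wins += 1.0   # known disease ranked ahead of unknown one
            elif k == u:
                wins += 0.5   # ties count as half
    return wins / (len(known) * len(unknown))


# Hypothetical d_c values for five diseases and one polyphenol.
toy_distances = {
    "disease_1": 0.5,
    "disease_2": 1.0,
    "disease_3": 1.5,
    "disease_4": 0.8,
    "disease_5": 2.0,
}
auc = proximity_auc(toy_distances, known_diseases={"disease_1", "disease_2"})
```

Of the six known–unknown pairs in the toy data, five are ordered correctly, so the AUC is 5/6. An AUC of 0.5 corresponds to the random expectation.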

Table 2 Top-ranked polyphenols

Finally, we performed multiple robustness checks to exclude the role of potential biases in the input data. To test whether the predictions are biased by the set of known associations retrieved from the CTD, we randomly selected 100 papers from PubMed containing medical subject headings (MeSH) terms that tag EGCG to diseases. We manually curated the evidence for EGCG’s therapeutic effects for the diseases discussed in the published papers, excluding reviews and non-English language publications. The dataset was processed to include implicit associations (Methods), resulting in a total of 113 diseases associated with EGCG, of which 58 overlap with the associations reported by the CTD (Fig. 3c). We observed that the predictive power of network proximity was unaffected by whether we considered the annotations from the CTD, the manually curated list or the union of both (Fig. 3d). To test the role of potential biases in the interactome, we repeated our analysis using only high-quality polyphenol–protein interactions retrieved from ligand–protein three-dimensional resolved structures (Supplementary Fig. 1d) and a subset of the interactome derived from an unbiased high-throughput screening (Supplementary Fig. 1f). We found that the predictive power was largely unchanged, indicating that the literature bias in the interactome does not affect our findings. Finally, we retested the predictive performance by considering not only the therapeutic polyphenol–disease associations, but also the marker/mechanism ones (another type of curated association available in the CTD) finding that the predictive power remains largely unchanged (Supplementary Notes and Supplementary Fig. 3).

Network proximity predicts gene expression perturbation induced by polyphenols

To validate that network proximity reflects the biological activity of polyphenols observed in experimental data, we retrieved expression perturbation signatures from the Connectivity Map database27 for the treatment of the breast cancer MCF7 cell line with 21 polyphenols (Supplementary Table 1 and Supplementary Fig. 4). We investigated the relationships between the extent to which polyphenols perturb the expression of disease genes, the network proximity between the polyphenol targets and disease proteins and their known therapeutic effects (Fig. 4a). For example, we observe different perturbation profiles for gene pools associated with different diseases: for treatment with genistein (1 µM, 6 hours) we observed 10 skin disease genes with perturbation scores greater than 2, while we observed only one highly perturbed cerebrovascular disorder gene (Fig. 4b). Indeed, network proximity indicates that skin disease is closer to the genistein targets than cerebrovascular disorder, suggesting a relationship between network proximity, gene expression perturbation and the therapeutic effects of the polyphenol (Fig. 4a). To test this hypothesis, we computed an enrichment score that measures the over-representation of disease genes among the most perturbed genes (Methods), finding 13 diseases that have their genes significantly enriched among the genes most deregulated by genistein, of which four have known therapeutic associations. We find that these four diseases are significantly closer to the genistein targets than the nine diseases with unknown therapeutic associations (Fig. 4c). We observed a similar trend for treatments with other polyphenols, whether we use the same concentration (1 µM, Fig. 4c) or different ones (100 nM to 10 µM, Supplementary Fig. 5). 
This result suggests that changes in gene expression caused by a polyphenol are indicative of its therapeutic effects, but only if the observed expression change is limited to proteins proximal to the polyphenol targets (Fig. 4a).

Fig. 4: Relationships among gene expression perturbation, network proximity and the therapeutic effects of polyphenols on diseases.

a, Schematic representation of the relationships between the extent to which a polyphenol perturbs disease gene expression, its proximity to the disease genes and its therapeutic effects. b, Interactome neighbourhood showing the modules of skin diseases, genistein and cerebrovascular disorders. The skin diseases module has 10 proteins with high perturbation scores (>2) in the treatment of the MCF7 cell line with 1 µM of genistein. Genes associated with skin disease are significantly enriched among the most differentially expressed genes, and the maximum perturbation score among disease genes is higher in skin disease than cerebrovascular disorders. c, Among the diseases in which genes are enriched with highly perturbed genes, those with therapeutic associations show smaller network distances to the polyphenol targets than those without. The same trend is observed in treatments of the polyphenols quercetin, resveratrol and myricetin. Boxplots show the median (horizontal line), 25th and 75th percentiles (lower and upper boundaries, respectively). Whiskers extend to data points that lie within 1.5 interquartile ranges of the 25th and 75th percentiles; observations that fall outside this range are displayed independently.

Consequently, network proximity should also be predictive of the overall gene expression perturbation caused by a polyphenol on the genes of a given disease. To test this hypothesis, in each experimental combination defined by the polyphenol type and its concentration, we evaluated the maximum perturbation among genes for each disease. We then compared the magnitude of the observed perturbation between diseases that were proximal (dc below the 25th percentile, \(Z_{d_{\mathrm{c}}}\) < −0.5) or distal (dc above the 75th percentile, \(Z_{d_{\mathrm{c}}}\) > −0.5) to the polyphenol targets. Figure 5a,b and Supplementary Fig. 6 show the results for the genistein treatment (1 µM, 6 hours), which indicate that diseases proximal to the polyphenol targets show higher maximum perturbation values than distal diseases. The same trend is observed for other polyphenols when we use different dc and \(Z_{d_{\mathrm{c}}}\) thresholds for defining proximal and distal diseases (Fig. 5b and Supplementary Figs. 6–9), confirming that the impact of a polyphenol on cellular signalling pathways is localized in the network space, being greater in the vicinity of the polyphenol targets than in neighbourhoods remote from these targets. We also considered gene expression perturbations in the network vicinity of the polyphenol targets, regardless of whether the proteins were disease proteins, and observed higher perturbation scores for proximal proteins in 12 out of 21 polyphenols tested at 10 µM (Supplementary Fig. 10). Finally, we found that the enrichment score of perturbed genes among disease genes was not as predictive of the polyphenol therapeutic effects as network proximity (Supplementary Fig. 11).
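The proximal/distal comparison can be sketched as follows. The thresholds come from the text (quartiles of dc combined with a Z-score cut-off of −0.5), but everything else is an assumption made for illustration: the quartiles are estimated with Python's `statistics.quantiles` defaults, the two-sample Kolmogorov–Smirnov statistic is computed directly from the empirical distribution functions without a P value, and the eight diseases with their dc, Z and maximum perturbation values are invented toy data.

```python
from statistics import quantiles


def split_proximal_distal(diseases, z_cut=-0.5):
    """Split diseases by proximity and return their maximum perturbation scores.

    `diseases` maps a disease name to a tuple (d_c, z_score, max_perturbation).
    """
    q1, _, q3 = quantiles([v[0] for v in diseases.values()], n=4)
    proximal = [p for d, z, p in diseases.values() if d < q1 and z < z_cut]
    distal = [p for d, z, p in diseases.values() if d > q3 and z > z_cut]
    return proximal, distal


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    def ecdf(sample, value):
        return sum(1 for s in sample if s <= value) / len(sample)

    points = sorted(set(x) | set(y))
    return max(abs(ecdf(x, p) - ecdf(y, p)) for p in points)


# Hypothetical toy data: (d_c, Z-score, maximum perturbation score).
toy_diseases = {
    "disease_1": (0.2, -2.0, 3.0),
    "disease_2": (0.4, -1.5, 2.5),
    "disease_3": (0.6, -0.8, 1.9),
    "disease_4": (0.8, -0.6, 1.7),
    "disease_5": (1.0, -0.2, 1.4),
    "disease_6": (1.2, 0.1, 1.2),
    "disease_7": (1.4, 0.3, 0.8),
    "disease_8": (1.6, 1.0, 1.1),
}
proximal, distal = split_proximal_distal(toy_diseases)
ks = ks_statistic(proximal, distal)
```

In the toy data the two groups do not overlap in their perturbation scores, so the KS statistic reaches its maximum of 1. With real data the statistic would be paired with a significance test.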

Fig. 5: Diseases proximal to polyphenol targets have higher gene expression perturbation profiles.

a, Proximal and distal diseases in relation to genistein targets. Each node represents a disease and the node size is proportional to the perturbation score after treatment with genistein (1 µM, 6 hours). Distance from the origin represents the network proximity (dc) to genistein targets. Purple nodes represent diseases for which the therapeutic association was previously known. b, Cumulative distribution function (CDF) of the maximum perturbation scores of genes from diseases that are distal or proximal to polyphenol targets for the polyphenols genistein, quercetin, resveratrol and myricetin (1 µM, 6 hours). Statistical significance was evaluated with the Kolmogorov–Smirnov test.

Altogether, these results indicate that network proximity offers a mechanistic interpretation for the gene expression perturbations induced by polyphenols on disease genes. They also show that network proximity can indicate when gene expression perturbations result in therapeutic effects, suggesting that future studies could integrate gene expression (whenever available) with network proximity as they aim to more accurately prioritize polyphenol–disease associations.

Experimental evidence confirms that RA modulates platelet function

To demonstrate how the network-based framework can facilitate the mechanistic interpretation of the therapeutic effects of selected polyphenols, we next focus on VD. Of 65 polyphenols evaluated in this study, we found 27 to have associations to VD, as their targets were within the VD network neighbourhood (Supplementary Table 3). We therefore inspected the targets of 15 of the 27 polyphenols with 10 or fewer targets. The network analysis identified direct links between biological processes related to vascular health and the targets of three polyphenols: gallic acid, RA and 1,4-naphthoquinone (Supplementary Fig. 12 and Supplementary Notes). The network neighbourhood containing the targets of these polyphenols suggests that gallic acid activity involves thrombus dissolution processes, RA acts on platelet activation and antioxidant pathways through FYN and its neighbours and 1,4-naphthoquinone acts on signalling pathways of vascular cells through MAP2K1 activity (Supplementary Fig. 12 and Supplementary Notes).

To validate the developed framework, we set out to obtain direct experimental evidence of the predicted mechanistic role of RA in VD. The RA targets are in close proximity to proteins related to platelet function, forming the RA/VD-platelet module: a connected component formed by the RA target FYN and the VD proteins associated with platelet function PDE4D, CD36 and APP (Fig. 6a). We therefore asked whether RA influenced platelet activation in vitro. As platelets can be stimulated through different activation pathways, RA effects can, in principle, occur in any of them. To test these different possibilities, we pretreated platelets with RA and then either (1) activated glycoprotein VI by collagen or collagen-related peptide (CRP/CRPXL); (2) activated protease-activated receptors-1,4 by thrombin receptor activator peptide-6 (TRAP-6); (3) activated prostanoid thromboxane receptor by the thromboxane A2 analogue (U46619) or (4) activated P2Y1/12 receptor by adenosine diphosphate (ADP)28. When we compared the network distance between each stimulant receptor and the RA/VD-platelet module (Fig. 6a), we observed that the receptors for CRP/CRPXL, TRAP-6 and U46619 are closer than would be expected for a random distribution, while the receptor for ADP is more distant (Fig. 6b). We expected that platelets would be most affected by RA when treated with stimulants whose receptors are most proximal to the RA/VD-platelet module, that is, CRP/CRPXL, TRAP-6 and U46619, and as a control, we expect no effect for the distant ADP receptor. The experiments confirm this prediction: RA inhibits collagen-mediated platelet aggregation (Fig. 6c) and impairs dense granule secretion induced by CRPXL, TRAP-6 and U46619 (Supplementary Fig. 13). RA-treated platelets also displayed dampened α-granule secretion (Fig. 6d) and integrin αIIbβ3 activation (Supplementary Fig. 13) in response to U46619. 
As expected, RA did not affect platelet function when we used an agonist whose receptor is distant from the RA/VD-platelet module (that is, ADP). These findings suggest that RA impairs basic hallmarks of platelet activation via strong network effects, supporting our hypothesis that the proximity between RA targets and the neighbourhood associated with platelet function (Fig. 6a) could in part explain RA’s impact on VD.

Fig. 6: RA modulates platelet function.

a, Interactome neighbourhood showing RA targets and the RA/VD-platelet module (that is, the connected component formed by the RA target FYN and the VD proteins associated with platelet function (PDE4D, CD36 and APP)) and the receptors for the platelet agonists used in our experiments (collagen/CRPXL, TRAP-6, U46619 and ADP). b, Average shortest path length between each platelet agonist receptor and the RA/VD-platelet module formed by the proteins FYN, PDE4D, CD36 and APP. Error bars represent standard deviation of that same measure over 1,000 iterations of random selection of nodes in a degree-preserving fashion. c–f, Platelet-rich plasma (PRP) or washed platelets were pretreated with RA for 1 hour before stimulation with either collagen (1 μg ml⁻¹), collagen-related peptide (CRPXL, 1 μg ml⁻¹), thrombin receptor activator peptide-6 (TRAP-6, 20 μM), U46619 (1 μM) or ADP (10 μM). Platelets were assessed for either aggregation (c) or α-granule secretion (d). Platelet lysates were also probed for either non-specific tyrosine phosphorylation (p-Tyr) of the whole cell lysate (e) or site-specific phosphorylation of Src family kinases (SFKs) and FYN at residue 416 (f). n = 3–6 separate blood donations. In c and d, points represent the mean and error bars represent ± s.e.m. In f, points represent values from biological replicates and error bars represent s.e.m. P values were determined by the Kruskal–Wallis test in c and d and by unpaired t-tests in f.

We next sought to clarify the molecular mechanisms involved in the impact of RA on platelets. Given that platelet activation is coordinated by several kinases, we hypothesized that RA inhibits platelet function by blocking agonist-induced protein tyrosine phosphorylation. We observed that RA-treated platelets demonstrated a dose-dependent reduction in total tyrosine phosphorylation in response to CRPXL, TRAP-6 and U46619 (Fig. 6e). Given that RA caused a substantial decrease in phosphorylation of proteins with molecular mass between 50 and 60 kDa (Fig. 6e), we hypothesized that RA may reduce phosphorylation of FYN (59 kDa) or other similarly sized members of the same protein family (that is, SFKs). To test this, we measured the level of phosphorylation within the activation domain (amino acid 416) of SFKs, finding that RA reduced collagen-induced phosphorylation of FYN as well as basal tyrosine phosphorylation of SFKs (Fig. 6f). This indicates that RA perturbs the phospho-signalling networks that regulate platelet response to extracellular stimuli.

Taken together, these findings support our prediction that RA modulates platelet activation and function. They also support the observation that its mechanism of action involves reduction of phosphorylation at the activation domain of the protein tyrosine kinase FYN (Fig. 6a) and the inhibition of general tyrosine phosphorylation. Finally, while polyphenols are usually associated with their antioxidant function, here we illustrate another mechanistic pathway through which they could benefit health.


Here, we propose a network-based framework to predict the therapeutic effects of dietary polyphenols in human diseases. We find that polyphenol protein targets cluster in specific functional neighbourhoods of the interactome, and we show that the network proximity between polyphenol targets and disease proteins is predictive of the therapeutic effects of polyphenols. We demonstrate that diseases whose proteins are proximal to polyphenol targets tend to have significant changes in gene expression in cell lines treated with the respective polyphenol, while such changes are absent for diseases whose proteins are distal to polyphenol targets. Finally, we find that the network neighbourhood around the RA targets and VD proteins is related to platelet function. We validate this mechanistic prediction by showing that RA modulates platelet function through inhibition of protein tyrosine phosphorylation. These observations suggest a role of RA in the prevention of VD by inhibiting platelet activation and aggregation.

The observed results also suggest multiple avenues through which our ability to understand the role of polyphenols could be improved. First, some of the known health benefits of polyphenols might be caused not only by the native molecules but also by their metabolic byproducts29,30. However, we lack data concerning colonic degradation, liver metabolism, bioavailability and interaction with proteins of specific polyphenols or their metabolic byproducts. Future experimental data on protein interactions with polyphenol byproducts and conjugates could be incorporated in the proposed framework, further improving the accuracy of our predictions. The lack of these data does not invalidate the findings presented here, since previous studies report the presence of unmetabolized polyphenols in blood31,32,33 and it has been hypothesized that, in some instances, deconjugation of liver metabolites occurs in specific tissues or cells34,35,36. Therefore, the lack of data concerning specific polyphenols and the fact that other mechanisms exist through which they can affect health (for example, antioxidant activity and microbiota regulation) explain why this methodology might still miss a few known relationships between polyphenols and diseases. Second, considering that several experimental studies of polyphenol bioefficacy have been conducted in in vitro and in vivo models, the proposed framework might help us interpret literature evidence, possibly even allowing us to exclude chemical candidates when considering the health benefits provided by a given food in epidemiological association studies.

Our assumption that network proximity recovers therapeutic associations is based on its predictive performance on a ground-truth dataset for observed therapeutic effects and also relies on previous observations about the effect of drugs on diseases16,17,37. While the proposed methodology offers a powerful prioritization tool to guide future research, the real effect of polyphenols on diseases might still be negative, given other unmet factors such as dosage, comorbidities and drug interactions, which can only be ruled out by preclinical and clinical studies. Gene expression perturbation profiles, such as the ones provided by the Connectivity Map, can also be integrated with network proximity to further highlight potential beneficial or harmful effects of chemical compounds38,39.

The low bioavailability of some polyphenols in food might still present challenges when considering the therapeutic utility of these molecules. However, 48 of the 65 polyphenols we explored here are predicted to have high gastrointestinal absorption (Supplementary Table 2) and different methodologies are available to increase bioavailability of natural compounds40,41. Additionally, in the same way that the polyphenol phlorizin led to the discovery of new strategies for disease treatment resulting in the development of new compounds with higher efficacy42, we believe that the present methodology can help us identify polyphenol-based candidates for drug development.

The methodology introduced here offers a foundation for the mechanistic interpretation of alternative pathways through which polyphenols can affect health, such as the combined effect of different polyphenols37,43 and their interactions with drugs44. To address such synergistic effects, we need ground-truth data on these aspects. The developed methodology can be applied to other food-related chemicals, providing a framework by which to understand their health effects. Future research may help us also account for the way food-related chemicals affect endogenous metabolic reactions, impacting not only signalling pathways but also catabolic and anabolic processes. Finally, the methodology provides a framework to interpret and find causal support for associations identified in observational studies. Taken together, the proposed network-based framework has the potential to systematically reveal the mechanism of action underlying the health benefits of polyphenols, offering a logical, rational strategy for mechanism-based drug development of food-based compounds.


Building the interactome

The human interactome was assembled from 16 databases containing six different types of protein–protein interactions (PPIs): (1) binary PPIs tested by high-throughput yeast two-hybrid (Y2H) experiments45; (2) kinase–substrate interactions from literature-derived low-throughput and high-throughput experiments from KinomeNetworkX46, Human Protein Reference Database (HPRD)47 and PhosphositePlus48; (3) carefully literature-curated PPIs identified by affinity purification followed by mass spectrometry (AP-MS) and from literature-derived low-throughput experiments from InWeb49, BioGRID50, PINA51, HPRD52, MINT53, IntAct53 and InnateDB54; (4) high-quality PPIs from three-dimensional protein structures reported in Instruct55, Interactome3D56 and INSIDER57; (5) signalling networks from literature-derived low-throughput experiments as annotated in SignaLink2.0 (ref. 58) and (6) protein complexes from BioPlex2.0 (ref. 59). The genes were mapped to their Entrez ID based on the National Center for Biotechnology Information (NCBI) database as well as their official gene symbols. The resulting interactome includes 351,444 PPIs connecting 17,706 unique proteins (Supplementary Data 1). Its largest connected component (LCC) has 351,393 PPIs and 17,651 proteins.
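As a rough illustration of the last step, the sketch below merges a list of PPI pairs into an undirected graph and extracts its largest connected component. It is a minimal pure-Python sketch rather than the pipeline used in the study: the function names, the toy Entrez-style IDs and the choice to merge duplicate pairs and drop self-loops are assumptions made for illustration.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if a != b:  # assumption: self-loops are ignored in this sketch
            adj[a].add(b)
            adj[b].add(a)
    return adj

def largest_connected_component(adj):
    """Return the node set of the largest connected component (BFS per component)."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in comp:
                    comp.add(nb)
                    queue.append(nb)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

def count_edges(adj, nodes):
    """Count undirected edges with both endpoints inside `nodes`."""
    return sum(len(adj[n] & nodes) for n in nodes) // 2

# Toy input: hypothetical IDs; (2, 1) duplicates (1, 2) as if reported by two databases
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (10, 11), (2, 1)]
adj = build_graph(edges)
lcc = largest_connected_component(adj)
lcc_edges = count_edges(adj, lcc)
```

Storing neighbours in sets means that an interaction reported by several databases is counted once, which is one simple way to obtain counts of unique PPIs.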

Polyphenols, polyphenol targets and disease proteins

We retrieved 759 polyphenols from the Phenol-Explorer database4. The database lists polyphenols with food composition data or that have been profiled in biofluids after interventions with polyphenol-rich diets. For our analysis, we only considered polyphenols that (1) could be mapped to PubChem IDs, (2) were listed in the Comparative Toxicogenomics Database (CTD)24 as having therapeutic effects on human diseases and (3) had protein-binding information present in the STITCH database60 with experimental evidence (Fig. 1a). After these steps, we considered a final list of 65 polyphenols, for which 598 protein targets were retrieved from STITCH (Supplementary Table 1). We considered 3,173 disease proteins corresponding to 299 diseases retrieved from Menche et al.15. Gene ontology enrichment analysis of protein targets was performed using the Bioconductor package clusterProfiler with a significance threshold of P < 0.05 and Benjamini–Hochberg multiple testing correction with Q < 0.05.

Polyphenol–disease associations

We retrieved the polyphenol–disease associations from the Comparative Toxicogenomics Database (CTD). We considered only manually curated associations labelled as therapeutic. By considering the hierarchical structure of diseases along the MeSH tree, we expanded explicit polyphenol–disease associations to also include implicit associations. This procedure was performed by propagating associations in the lower branches of the MeSH tree to consider diseases in the higher levels of the same tree branch. For example, a polyphenol associated with heart diseases would also be associated with the more general category of cardiovascular diseases. By performing this expansion, we obtained a final list of 1,525 known associations between the 65 polyphenols and the 299 diseases considered in this study.
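The propagation step can be sketched with MeSH tree numbers, in which every ancestor of a term is a dot-delimited prefix of its tree number. The snippet below is an illustrative sketch under assumptions: the function names, the `polyphenol_A` label and the small disease dictionary are hypothetical, and only ancestors that belong to the disease set under study are added.

```python
def ancestors(tree_number):
    """Proper ancestors of a MeSH tree number, nearest first.

    Example: 'C14.280.647' -> ['C14.280', 'C14'].
    """
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]

def expand_associations(explicit, disease_tree_numbers):
    """Add implicit associations with more general diseases on the same branch.

    explicit: set of (polyphenol, disease) pairs.
    disease_tree_numbers: dict mapping each disease in the study to its tree numbers.
    """
    by_tree_number = {}
    for disease, numbers in disease_tree_numbers.items():
        for tn in numbers:
            by_tree_number[tn] = disease
    expanded = set(explicit)
    for polyphenol, disease in explicit:
        for tn in disease_tree_numbers.get(disease, []):
            for anc in ancestors(tn):
                if anc in by_tree_number:  # keep only diseases in the study set
                    expanded.add((polyphenol, by_tree_number[anc]))
    return expanded

# Illustrative subset of a disease list with tree numbers
tree_numbers = {
    "Cardiovascular Diseases": ["C14"],
    "Heart Diseases": ["C14.280"],
    "Myocardial Ischemia": ["C14.280.647"],
    "Vascular Diseases": ["C14.907"],
}
explicit = {("polyphenol_A", "Heart Diseases")}
expanded = expand_associations(explicit, tree_numbers)
```

In this toy case the association with heart diseases is propagated upwards to cardiovascular diseases, but not downwards to myocardial ischemia or sideways to vascular diseases.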

Network proximity between polyphenol targets and disease proteins

The proximity between a disease and a polyphenol was evaluated using a distance metric that takes into account the shortest path lengths between polyphenol targets and disease proteins16. Given S, the set of disease proteins, T, the set of polyphenol targets and d(s,t), the shortest path length between nodes s and t in the network, we define:

$$d_{{\mathrm{c}}}\left( {S,T} \right) = \frac{1}{{\left\| T \right\|}}\mathop {\sum}\nolimits_{t \in T} {\min _{s \in S}d\left( {s,t} \right)}$$
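A small worked example may help make the definition concrete. The sketch below computes the closest distance on a toy path graph using breadth-first search; the function names and the graph are assumptions for illustration, and the code assumes every target can reach at least one disease protein (as is the case inside a connected component).

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Unweighted shortest path lengths from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def closest_distance(adj, disease_proteins, targets):
    """d_c(S, T): average over targets t of the distance to the nearest s in S."""
    total = 0
    for t in targets:
        dist = shortest_path_lengths(adj, t)
        total += min(dist[s] for s in disease_proteins if s in dist)
    return total / len(targets)

# Toy path graph A - B - C - D - E
adj = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
# Target B is 1 step from A; target C is 2 steps from both A and E
d = closest_distance(adj, {"A", "E"}, ["B", "C"])  # (1 + 2) / 2
```

A target that is itself a disease protein contributes a distance of zero, so overlapping sets pull the average down.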

We also calculated a relative distance metric (\(Z_{d_{{\mathrm{c}}}}\)) that compares the absolute distance \(d_{\mathrm{c}}(S,T)\) between a disease and a polyphenol with a reference distribution describing the random expectation. The reference distribution corresponds to the expected distances between two randomly selected groups of proteins matching the size and degrees of the original disease proteins and polyphenol targets in the network. It was generated by calculating the proximity between these two randomly selected groups across 1,000 iterations. The mean \(\mu_{d_{\mathrm{c}}(S,T)}\) and standard deviation \(\sigma_{d_{\mathrm{c}}(S,T)}\) of the reference distribution were used to convert the absolute distance \(d_{\mathrm{c}}\) into the relative distance \(Z_{d_{{\mathrm{c}}}}\), defined as:

$$Z_{d_{{\mathrm{c}}}} = \frac{{d_{{\mathrm{c}}}\left( {S,T} \right) - \mu _{d_{{\mathrm{c}}}\left( {S,T} \right)}}}{{\sigma _{d_{{\mathrm{c}}}\left( {S,T} \right)}}}$$

We performed a degree-preserving random selection. Because of the scale-free nature of the human interactome, we avoided repeatedly choosing the same (high-degree) nodes by using a binning approach in which nodes within a certain degree interval were grouped together such that there were at least 100 nodes in each bin. Supplementary Data 2 reports the proximity scores \(d_{\mathrm{c}}\) and \(Z_{d_{{\mathrm{c}}}}\) for all pairs of diseases and polyphenols.
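The binning and normalization can be sketched as follows. This is a minimal sketch under assumptions, not the published code: the function names are hypothetical, bins are built by merging consecutive degrees until the minimum size is reached, leftover high-degree nodes are appended to the last bin, and the population standard deviation is used. The distance function is passed in as `distance_fn`, so any implementation of the closest distance can be plugged in.

```python
import random
import statistics

def degree_bins(degrees, min_size=100):
    """Group nodes of similar degree so that each bin holds at least `min_size` nodes."""
    by_degree = {}
    for node, k in degrees.items():
        by_degree.setdefault(k, []).append(node)
    bins, current = [], []
    for k in sorted(by_degree):
        current.extend(by_degree[k])
        if len(current) >= min_size:
            bins.append(current)
            current = []
    if current:  # assumption: leftover high-degree nodes join the last bin
        if bins:
            bins[-1].extend(current)
        else:
            bins.append(current)
    return bins

def sample_degree_matched(nodes, bins, rng):
    """Draw a random node set with the same size and binned degrees as `nodes`."""
    bin_of = {n: i for i, b in enumerate(bins) for n in b}
    needed = {}
    for n in nodes:
        needed[bin_of[n]] = needed.get(bin_of[n], 0) + 1
    sample = []
    for i, count in needed.items():
        sample.extend(rng.sample(bins[i], count))  # without replacement within a bin
    return sample

def z_score(observed, disease, targets, bins, distance_fn, n_iter=1000, seed=0):
    """Relative distance: (observed - mean) / std of a degree-matched reference."""
    rng = random.Random(seed)
    null = [
        distance_fn(sample_degree_matched(disease, bins, rng),
                    sample_degree_matched(targets, bins, rng))
        for _ in range(n_iter)
    ]
    mu, sigma = statistics.mean(null), statistics.pstdev(null)
    return (observed - mu) / sigma if sigma > 0 else 0.0
```

Sampling from bins rather than from exact degrees keeps the reference sets degree-matched while still giving each hub many possible substitutes.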

Area under the receiver operating characteristic curve analysis

For each polyphenol, we used the area under the receiver operating characteristic curve (AUC) to evaluate how well the network proximity distinguishes diseases with known therapeutic associations from all the others in the set of 299 diseases. The known therapeutic associations retrieved from the CTD were used as positive instances and all unknown associations were defined as negative instances, and the AUC was computed using the implementation in the Scikit-learn Python package. Furthermore, we calculated 95% CIs using the bootstrap technique with 2,000 resamplings with sample sizes of 150 each. Considering that the AUC summarizes overall performance, we also searched for a metric to evaluate the top-ranking predictions. For this analysis, we calculated the precision of the top 10 predictions, considering only the polyphenol–disease associations with relative distance \(Z_{d_{{\mathrm{c}}}}\) < −0.5 (ref. 16).
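The three quantities described above can be sketched in plain Python. This is an illustrative stand-in and not the Scikit-learn implementation mentioned in the text: the AUC is computed from its pairwise definition, the bootstrap uses simple percentile intervals and skips single-class resamples, and all function names are hypothetical. Because a smaller distance indicates closer proximity, the sketch assumes that the negated distance is passed as the score.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, sample_size=150, alpha=0.05, seed=0):
    """Percentile confidence interval of the AUC from resampling with replacement."""
    if len(set(labels)) < 2:
        raise ValueError("both classes are required")
    rng = random.Random(seed)
    idx = range(len(labels))
    aucs = []
    while len(aucs) < n_boot:
        pick = [rng.choice(idx) for _ in range(sample_size)]
        ys = [labels[i] for i in pick]
        if 0 < sum(ys) < len(ys):  # assumption: single-class resamples are redrawn
            aucs.append(roc_auc(ys, [scores[i] for i in pick]))
    aucs.sort()
    lo = aucs[int((alpha / 2) * n_boot)]
    hi = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def precision_at_k(labels, z_scores, k=10, z_cutoff=-0.5):
    """Fraction of known associations among the k most proximal pairs with z < cutoff."""
    ranked = sorted((z, y) for y, z in zip(labels, z_scores) if z < z_cutoff)[:k]
    return sum(y for _, y in ranked) / len(ranked) if ranked else 0.0

# Toy data: 1 = known therapeutic association, 0 = unknown
labels = [1, 0, 1, 0]
z_values = [-2.0, -1.0, -0.6, 0.3]
auc = roc_auc(labels, [-z for z in z_values])       # negate: lower z ranks higher
top2 = precision_at_k(labels, z_values, k=2)
ci = bootstrap_ci(labels, [-z for z in z_values], n_boot=200, sample_size=50)
```

The demonstration uses a reduced number of resamples only to keep it fast; the defaults mirror the 2,000 resamplings of size 150 described in the text.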

Analysis of network proximity and gene expression deregulation

We retrieved perturbation signatures from the Connectivity Map database for the MCF7 cell line after treatment with 21 polyphenols. These signatures reflect the perturbation of the gene expression profile caused by the treatment with that particular polyphenol relative to a reference population, which comprises all other treatments in the same experimental plate27. For polyphenols having more than one experimental instance (such as time of exposure, cell line and dose), we selected the one with highest distil_cc_q75 value (75th quantile of pairwise Spearman correlations in landmark genes). We performed gene set enrichment analysis61 to evaluate the enrichment of disease genes among the top deregulated genes in the perturbation profiles. This analysis offers enrichment scores that have small values when genes are randomly distributed among the ordered list of expression values and high values when they are concentrated at the top or bottom of the list. The enrichment score significance is calculated by creating 1,000 random selections of gene sets with the same size as the original set and calculating an empirical P value by considering the proportion of random sets resulting in enrichment scores smaller than the original case. The P values were adjusted for multiple testing using the Benjamini–Hochberg method. The network proximities \(d_{\mathrm{c}}\) of disease proteins and polyphenol targets for diseases with significant enrichment scores were compared according to their therapeutic and unknown-therapeutic associations using the Student’s t-test. The relevant code for calculating the network proximity, AUCs and enrichment scores can be found at
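The enrichment and multiple-testing steps can be illustrated with a simplified sketch. It is not the gene set enrichment analysis implementation used in the study: the running sum is unweighted (every hit counts equally), the empirical P value counts random gene sets whose absolute score is at least as extreme as the observed one with an added pseudo-count, which is one common convention and an assumption here, and all function names are hypothetical.

```python
import random

def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum statistic: the largest deviation from zero."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for h in hits:
        running += 1.0 / n_hit if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

def empirical_p(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Share of same-size random gene sets scoring at least as extreme as observed."""
    rng = random.Random(seed)
    observed = abs(enrichment_score(ranked_genes, gene_set))
    size = sum(g in gene_set for g in ranked_genes)
    extreme = sum(
        abs(enrichment_score(ranked_genes, set(rng.sample(ranked_genes, size)))) >= observed
        for _ in range(n_perm)
    )
    return (extreme + 1) / (n_perm + 1)  # pseudo-count avoids P = 0

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy ranked list (most up-regulated first) and a gene set concentrated at the top
ranked = list("abcdefgh")
es_top = enrichment_score(ranked, {"a", "b"})
p_top = empirical_p(ranked, {"a", "b"}, n_perm=200)
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A gene set concentrated at the bottom of the list yields a score of the same magnitude with a negative sign, which is why the sketch compares absolute values.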

Platelet isolation

Human blood collection was performed as previously described in accordance with the Declaration of Helsinki and ethics regulations with Institutional Review Board approval from Brigham and Women’s Hospital (P001526). Healthy volunteers did not ingest known platelet inhibitors for at least 10 days prior. Citrated whole blood underwent centrifugation with a slow brake (177 × g, 20 minutes), and the PRP fraction was acquired for subsequent experiments. For washed platelets, PRP was incubated with 1 μM prostaglandin E1 (Sigma, P5515) and immediately underwent centrifugation with a slow brake (1,000 × g, 5 minutes). Platelet-poor plasma was aspirated, and pellets were resuspended in platelet resuspension buffer (PRB; 10 mM HEPES, 140 mM NaCl, 3 mM KCl, 0.5 mM MgCl2, 5 mM NaHCO3, 10 mM glucose, pH 7.4).

Platelet aggregometry

Platelet aggregation was measured by turbidimetric aggregometry as previously described62. Briefly, PRP was pretreated with RA for 1 hour before adding 250 μl to siliconized glass cuvettes containing magnetic stir bars. Samples were placed in Chrono-Log Model 700 Aggregometers before the addition of various platelet agonists. Platelet aggregation was monitored for 6 minutes at 37 °C with a stir speed of 1,000 r.p.m. and the maximum extent of aggregation was recorded using AGGRO/LINK8 software. In some cases, dense granule release was simultaneously recorded by supplementing samples with Chrono-Lume (Chrono-Log, 395) according to the manufacturer’s instructions.

Platelet α-granule secretion and integrin αIIbβ3 activation

Changes in platelet surface expression of P-selectin (CD62P) or binding of Alexa Fluor 488-conjugated fibrinogen were used to assess α-granule secretion and integrin αIIbβ3 activation, respectively. First, PRP was preincubated with RA for 1 hour, followed by stimulation with various platelet agonists under static conditions at 37 °C for 20 minutes. Samples were then incubated with APC-conjugated anti-human CD62P antibodies (BioLegend, 304910) and 100 μg ml−1 Alexa Fluor 488-Fibrinogen (Thermo Scientific, F13191) for 20 minutes before fixation in 2% (v/v) paraformaldehyde (Thermo Scientific, AAJ19945K2). For each sample, 50,000 platelets were processed using a Cytek Aurora spectral flow cytometer. Percent positive cells were determined by gating on fluorescence intensity compared to unstimulated samples.

Platelet cytotoxicity

Cytotoxicity was tested by measuring lactate dehydrogenase (LDH) release by permeabilized platelets into the supernatant63. Briefly, washed platelets were treated with various concentrations of RA for 1 hour, before isolating supernatants via centrifugation (15,000 × g, 5 min). A Pierce LDH Activity Kit (Thermo Scientific, 88953) was then used to assess supernatant levels of LDH.

Immunoprecipitation and western blot

Washed platelets were pretreated with RA for 1 hour, followed by a 15 minute treatment with Eptifibatide (50 μM). Platelets were then stimulated with various agonists for 5 minutes under stirring conditions (1,000 r.p.m., 37 °C). Platelets were lysed on ice with RIPA Lysis Buffer System (Santa Cruz, sc-24948) and supernatants clarified via centrifugation (15,000 × g, 10 min, 4 °C). For immunoprecipitation of FYN, lysates were first precleared of IgG by incubating with Protein A agarose beads (Cell Signaling Technologies, 9863S) for 30 minutes at 4 °C, before isolation of the supernatant via centrifugation (15,000 × g, 10 min, 4 °C). Supernatants were incubated with anti-FYN antibodies (Abcam, 2A10) overnight at 4 °C before incubation with Protein A beads for 1 hour. Beads were then washed five times with NP-40 lysis buffer (144 mM Tris, 518 mM NaCl, 6 mM EDTA, 12 mM Na2VO3, 33.3% (v/v) NP-40, Halt protease inhibitor cocktail (Thermo, 78429)).

For western blot analysis, total cell lysates or immunoprecipitated FYN were reduced with Laemmli Sample Buffer (Bio-Rad, 1610737) and proteins separated by molecular weight in PROTEAN TGX precast gels (Bio-Rad, 4561084). Proteins were transferred to PVDF membranes (Bio-Rad, 1620174) and probed with either 4G10 (Millipore, 05-321), a primary antibody clone that recognizes phosphorylated tyrosine residues, or primary antibodies that probe for the site-specific phosphorylation of SFKs (p-Tyr416) within their activation loop. Membranes were incubated with horseradish peroxidase-conjugated secondary antibodies (Cell Signaling Technologies, 7074S) to catalyse an electrochemiluminescent reaction (Thermo Scientific, PI32109). Membranes were visualized using a Bio-Rad ChemiDoc Imaging System, and densitometric analysis of protein lanes was conducted using ImageJ (NIH, Version 1.52a).

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

All data supporting the findings of this study are available at and within the paper and its Supplementary Information files.

Code availability

Computer code is available at


  1. Khera, A. V. et al. Genetic risk, adherence to a healthy lifestyle, and coronary disease. N. Engl. J. Med. 375, 2349–2358 (2016).

  2. Arts, I. C. W. & Hollman, P. C. H. Polyphenols and disease risk in epidemiologic studies. Am. J. Clin. Nutr. 81, 317S–325S (2005).

  3. Wang, X., Ouyang, Y. Y., Liu, J. & Zhao, G. Flavonoid intake and risk of CVD: a systematic review and meta-analysis of prospective cohort studies. Br. J. Nutr. 111, 1–11 (2014).

  4. Neveu, V. et al. Phenol-Explorer: an online comprehensive database on polyphenol contents in foods. Database 2010, bap024 (2010).

  5. Pérez-Jiménez, J., Neveu, V., Vos, F. & Scalbert, A. Systematic analysis of the content of 502 polyphenols in 452 foods and beverages: an application of the phenol-explorer database. J. Agric. Food Chem. 58, 4959–4969 (2010).

  6. Zhang, H. & Tsao, R. Dietary polyphenols, oxidative stress and antioxidant and anti-inflammatory effects. Curr. Opin. Food Sci. 8, 33–42 (2016).

  7. Boly, R. et al. Quercetin inhibits a large panel of kinases implicated in cancer cell biology. Int. J. Oncol. 38, 833–842 (2011).

  8. Lacroix, S. et al. A computationally driven analysis of the polyphenol-protein interactome. Sci. Rep. 8, 2232 (2018).

  9. Hanhineva, K. et al. Impact of dietary polyphenols on carbohydrate metabolism. Int. J. Mol. Sci. 11, 1365–1402 (2010).

  10. Hervert-Hernández, D. & Goñi, I. Dietary polyphenols and human gut microbiota: a review. Food Rev. Int. 27, 154–169 (2011).

  11. Zhang, S. et al. Dietary pomegranate extract and inulin affect gut microbiome differentially in mice fed an obesogenic diet. Anaerobe 48, 184–193 (2017).

  12. Thazhath, S. S. et al. Administration of resveratrol for 5 wk has no effect on glucagon-like peptide 1 secretion, gastric emptying, or glycemic control in type 2 diabetes: a randomized controlled trial. Am. J. Clin. Nutr. 103, 66–70 (2016).

  13. Bhatt, J. K., Thomas, S. & Nanjan, M. J. Resveratrol supplementation improves glycemic control in type 2 diabetes mellitus. Nutr. Res. 32, 537–541 (2012).

  14. Sharma, A. et al. A disease module in the interactome explains disease heterogeneity, drug response and captures novel pathways and genes in asthma. Hum. Mol. Genet. 24, 3005–3020 (2014).

  15. Menche, J. et al. Disease networks. Uncovering disease–disease relationships through the incomplete interactome. Science 347, 1257601 (2015).

  16. Guney, E., Menche, J., Vidal, M. & Barabási, A.-L. Network-based in silico drug efficacy screening. Nat. Commun. 7, 10331 (2016).

  17. Cheng, F. et al. Network-based approach to prediction and population-based validation of in silico drug repurposing. Nat. Commun. 9, 2691 (2018).

  18. Kovács, I. A. et al. Network-based prediction of protein interactions. Nat. Commun. 10, 1240 (2019).

  19. Sarkar, F. H., Li, Y., Wang, Z. & Kong, D. Cellular signaling perturbation by natural products. Cell. Signal. 21, 1541–1547 (2009).

  20. Iso, H. et al. The relationship between green tea and total caffeine intake and risk for self-reported type 2 diabetes among Japanese adults. Ann. Intern. Med. 144, 554–562 (2006).

  21. Song, Y., Manson, J. E., Buring, J. E., Sesso, H. D. & Liu, S. Associations of dietary flavonoids with risk of type 2 diabetes, and markers of insulin resistance and systemic inflammation in women: a prospective study and cross-sectional analysis. J. Am. Coll. Nutr. 24, 376–384 (2005).

  22. Keske, M. A. et al. Vascular and metabolic actions of the green tea polyphenol epigallocatechin gallate. Curr. Med. Chem. 22, 59–69 (2015).

  23. Wolfram, S. et al. Epigallocatechin gallate supplementation alleviates diabetes in rodents. J. Nutr. 136, 2512–2518 (2006).

  24. Davis, A. P. et al. The comparative toxicogenomics database: update 2019. Nucleic Acids Res. 47, D948–D954 (2019).

  25. Muthu, R., Selvaraj, N. & Vaiyapuri, M. Anti-inflammatory and proapoptotic effects of umbelliferone in colon carcinogenesis. Hum. Exp. Toxicol. 35, 1041–1054 (2016).

  26. Muthu, R. & Vaiyapuri, M. Synergistic and individual effects of umbelliferone with 5-fluorouracil on tumor markers and antioxidant status of rat treated with 1,2-dimethylhydrazine. Biomed. Aging Pathol. 3, 219–227 (2013).

  27. Subramanian, A. et al. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell 171, 1437–1452.e17 (2017).

  28. Grover, S. P., Bergmeier, W. & Mackman, N. Platelet signaling pathways and new inhibitors. Arterioscler. Thromb. Vasc. Biol. 38, e28–e35 (2018).

  29. Moco, S., Martin, F. P. J. & Rezzi, S. Metabolomics view on gut microbiome modulation by polyphenol-rich foods. J. Proteome Res. 11, 4781–4790 (2012).

  30. van Duynhoven, J. et al. Metabolic fate of polyphenols in the human superorganism. Proc. Natl Acad. Sci. USA 108, 4531–4538 (2011).

  31. Ottaviani, J. I., Heiss, C., Spencer, J. P. E., Kelm, M. & Schroeter, H. Recommending flavanols and procyanidins for cardiovascular health: revisited. Mol. Aspects Med. 61, 63–75 (2018).

  32. Stalmach, A., Troufflard, S., Serafini, M. & Crozier, A. Absorption, metabolism and excretion of Choladi green tea flavan-3-ols by humans. Mol. Nutr. Food Res. 53, S44–53 (2009).

  33. Meng, X. et al. Identification and characterization of methylated and ring-fission metabolites of tea catechins formed in humans, mice, and rats. Chem. Res. Toxicol. 15, 1042–1050 (2002).

  34. Perez-Vizcaino, F., Duarte, J. & Santos-Buelga, C. The flavonoid paradox: conjugation and deconjugation as key steps for the biological activity of flavonoids. J. Sci. Food Agric. 92, 1822–1825 (2012).

  35. Shimoi, K. & Nakayama, T. Glucuronidase deconjugation in inflammation. Methods Enzymol. 400, 263–272 (2005).

  36. Kaneko, A. et al. Glucuronides of phytoestrogen flavonoid enhance macrophage function via conversion to aglycones by β-glucuronidase in macrophages. Immun. Inflamm. Dis. 5, 265–279 (2017).

  37. Cheng, F., Kovács, I. A. & Barabási, A.-L. Network-based prediction of drug combinations. Nat. Commun. 10, 1197 (2019).

  38. Smalley, J. L., Gant, T. W. & Zhang, S.-D. Application of connectivity mapping in predictive toxicology based on gene-expression similarity. Toxicology 268, 143–146 (2010).

  39. Lamb, J. et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313, 1929–1935 (2006).

  40. Amanzadeh, E. et al. Quercetin conjugated with superparamagnetic iron oxide nanoparticles improves learning and memory better than free quercetin via interacting with proteins involved in LTP. Sci. Rep. 9, 6876 (2019).

  41. Shaikh, J., Ankola, D. D., Beniwal, V., Singh, D. & Kumar, M. N. V. R. Nanoparticle encapsulation improves oral bioavailability of curcumin by at least 9-fold when compared to curcumin administered with piperine as absorption enhancer. Eur. J. Pharm. Sci. 37, 223–230 (2009).

  42. Chao, E. C. & Henry, R. R. SGLT2 inhibition-A novel strategy for diabetes treatment. Nat. Rev. Drug Discov. 9, 551–559 (2010).

  43. Caldera, M. et al. Mapping the perturbome network of cellular perturbations. Nat. Commun. 10, 5140 (2019).

  44. Jensen, K., Ni, Y., Panagiotou, G. & Kouskoumvekaki, I. Developing a molecular roadmap of drug–food interactions. PLoS Comput. Biol. 11, e1004048 (2015).

  45. Rolland, T. et al. A proteome-scale map of the human interactome network. Cell 159, 1212–1226 (2014).

  46. Cheng, F., Jia, P., Wang, Q. & Zhao, Z. Quantitative network mapping of the human kinome interactome reveals new clues for rational kinase inhibitor discovery and individualized cancer therapy. Oncotarget 5, 3697–3710 (2014).

  47. Calçada, D. et al. The role of low-grade inflammation and metabolic flexibility in aging and nutritional modulation thereof: a systems biology approach. Mech. Ageing Dev. (2014).

  48. Hornbeck, P. V. et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 43, D512–D520 (2015).

  49. Li, T. et al. A scored human protein–protein interaction network to catalyze genomic interpretation. Nat. Methods 14, 61–64 (2016).

  50. Chatr-Aryamontri, A. et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 45, D369–D379 (2017).

  51. Cowley, M. J. et al. PINA v2.0: mining interactome modules. Nucleic Acids Res. 40, D862–D865 (2012).

  52. Peri, S. et al. Human protein reference database as a discovery resource for proteomics. Nucleic Acids Res. 32, D497–D501 (2004).

  53. Orchard, S. et al. The MIntAct project–IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Res. 42, D358–D363 (2014).

  54. Breuer, K. et al. InnateDB: systems biology of innate immunity and beyond–recent updates and continuing curation. Nucleic Acids Res. 41, D1228–D1233 (2013).

  55. Meyer, M. J., Das, J., Wang, X. & Yu, H. INstruct: a database of high-quality 3D structurally resolved protein interactome networks. Bioinformatics 29, 1577–1579 (2013).

  56. Mosca, R., Céol, A. & Aloy, P. Interactome3D: adding structural details to protein networks. Nat. Methods 10, 47–53 (2013).

  57. Meyer, M. J. et al. Interactome INSIDER: a structural interactome browser for genomic studies. Nat. Methods 15, 107–114 (2018).

  58. Fazekas, D. et al. SignaLink 2 – a signaling pathway resource with multi-layered regulatory networks. BMC Syst. Biol. (2013).

  59. Huttlin, E. L. et al. Architecture of the human interactome defines protein communities and disease networks. Nature 545, 505–509 (2017).

  60. Szklarczyk, D. et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43, D447–D452 (2015).

  61. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).

  62. Roweth, H. G. et al. Two novel, putative mechanisms of action for citalopram-induced platelet inhibition. Sci. Rep. 8, 16677 (2018).

  63. Roweth, H. G. et al. Citalopram inhibits platelet function independently of SERT-mediated 5-HT transport. Sci. Rep. 8, 3494 (2018).

  64. Nath, S., Bachani, M., Harshavardhana, D. & Steiner, J. P. Catechins protect neurons against mitochondrial toxins and HIV proteins via activation of the BDNF pathway. J. Neurovirol. 18, 445–455 (2012).

  65. Park, K.-S. et al. (−)-Epigallocatethin-3-O-gallate counteracts caffeine-induced hyperactivity: evidence of dopaminergic blockade. Behav. Pharmacol. 21, 572–575 (2010).

  66. Ramesh, E., Geraldine, P. & Thomas, P. A. Regulatory effect of epigallocatechin gallate on the expression of C-reactive protein and other inflammatory markers in an experimental model of atherosclerosis. Chem. Biol. Interact. 183, 125–132 (2010).

  67. Han, S. G., Han, S.-S., Toborek, M. & Hennig, B. EGCG protects endothelial cells against PCB 126-induced inflammation through inhibition of AhR and induction of Nrf2-regulated genes. Toxicol. Appl. Pharmacol. 261, 181–188 (2012).

  68. Sheng, R., Gu, Z.-L. & Xie, M.-L. Epigallocatechin gallate, the major component of polyphenols in green tea, inhibits telomere attrition mediated cardiomyocyte apoptosis in cardiac hypertrophy. Int. J. Cardiol. 162, 199–209 (2013).

  69. Devika, P. T. & Stanely Mainzen Prince, P. (−)-Epigallocatechin gallate protects the mitochondria against the deleterious effects of lipids, calcium and adenosine triphosphate in isoproterenol induced myocardial infarcted male Wistar rats. J. Appl. Toxicol. 28, 938–944 (2008).

  70. Yi, Q.-Y. et al. Chronic infusion of epigallocatechin-3-O-gallate into the hypothalamic paraventricular nucleus attenuates hypertension and sympathoexcitation by restoring neurotransmitters and cytokines. Toxicol. Lett. 262, 105–113 (2016).

  71. Devika, P. T. & Prince, P. S. M. Preventive effect of (−)-epigallocatechin-gallate (EGCG) on lysosomal enzymes in heart and subcellular fractions in isoproterenol-induced myocardial infarcted Wistar rats. Chem. Biol. Interact. 172, 245–252 (2008).

  72. Hushmendy, S. et al. Select phytochemicals suppress human T-lymphocytes and mouse splenocytes suggesting their use in autoimmunity and transplantation. Nutr. Res. 29, 568–578 (2009).

  73. Shen, K. et al. Epigallocatechin 3-gallate ameliorates bile duct ligation induced liver injury in mice by modulation of mitochondrial oxidative stress and inflammation. PLoS ONE 10, e0126278 (2015).

  74. Zhen, M. et al. Green tea polyphenol epigallocatechin-3-gallate inhibits oxidative damage and preventive effects on carbon tetrachloride–induced hepatic fibrosis. J. Nutr. Biochem. 18, 795–805 (2007).

  75. Yasuda, Y. et al. (−)-Epigallocatechin gallate prevents carbon tetrachloride-induced rat hepatic fibrosis by inhibiting the expression of the PDGFRβ and IGF-1R. Chem. Biol. Interact. 182, 159–164 (2009).

  76. Cao, W. et al. iTRAQ-based proteomic analysis of combination therapy with taurine, epigallocatechin gallate, and genistein on carbon tetrachloride-induced liver fibrosis in rats. Toxicol. Lett. 232, 233–245 (2015).

  77. Kitamura, M. et al. Epigallocatechin gallate suppresses peritoneal fibrosis in mice. Chem. Biol. Interact. 195, 95–104 (2012).

  78. Sakla, M. S. & Lorson, C. L. Induction of full-length survival motor neuron by polyphenol botanical compounds. Hum. Genet. 122, 635–643 (2008).

  79. Shimizu, M. et al. (−)-Epigallocatechin gallate inhibits growth and activation of the VEGF/VEGFR axis in human colorectal cancer cells. Chem. Biol. Interact. 185, 247–252 (2010).



This study was supported, in part, by NIH grants 1P01HL132825, HG007690, HL108630 and HL119145; American Heart Association grants 151708 and D700382 and ERC grant 810115-DYNASET. We would like to thank P. Ruppert, G. Menichetti and I. Kovacs for support in this study, F. Cheng for assembling the human interactome and A. Grishchenko for help with data visualization.

Author information




I.F.d.V. and A.-L.B. designed the study. I.F.d.V. performed all computational analyses. H.G.R., M.W.M., E.B. and J.L. designed and performed experimental validation. J.L. guided I.F.d.V. in validation case studies. S.M. and D.B. guided I.F.d.V. in data interpretation and curation of disease associations obtained from the literature. I.F.d.V. and A.-L.B. wrote the paper with input from all authors. All authors read and approved the manuscript.

Corresponding author

Correspondence to Albert-László Barabási.

Ethics declarations

Competing interests

J.L. and A.-L.B. are co-scientific founders of Scipher Medicine, Inc., which applies network medicine strategies to biomarker development and personalized drug selection; A.-L.B. is the founder of Datapolis Inc., which explores mobility patterns in urban planning, and Foodome, Inc., which applies data science to health. I.F.d.V. is a scientific consultant for Foodome, Inc.

Additional information

Peer review information Nature Food thanks Dariush Mozaffarian and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–13 and Notes.

Reporting Summary

Supplementary Table 1

Summary of polyphenols evaluated in this study.

Supplementary Table 2

Predicted gastrointestinal (GI) absorption and bioavailability.

Supplementary Table 3

Polyphenols proximal to vascular diseases.

Supplementary Data 1

Human interactome assembled in this study.

Supplementary Data 2

Network proximity calculations between 65 polyphenols and 299 diseases.


Cite this article

do Valle, I.F., Roweth, H.G., Malloy, M.W. et al. Network medicine framework shows that proximity of polyphenol targets and disease proteins predicts therapeutic effects of polyphenols. Nat Food 2, 143–155 (2021).
