Introduction

COVID-19 is the term coined for the pandemic caused by SARS-CoV-2. Unprecedented in the history of science, this pandemic has elicited a worldwide, collaborative response from the scientific community. In addition to the strong focus on the epidemiology of the virus1,2,3, experiments aimed at understanding mechanisms underlying the pathophysiology of the virus have led to new insights in a comparably short amount of time4,5,6,7.

In the field of computational biology, several initiatives have started generating disease maps that represent the current knowledge pertaining to COVID-19 mechanisms8,9,10,11. Such disease maps have proven valuable before in diverse areas of research such as cancer, Alzheimer's Disease, Parkinson’s Disease, and influenza12,13,14,15.

When taken together with related work including cause-and-effect modeling8, entity relationship graphs16, and pathways8, these disease maps represent a considerable number of highly curated “knowledge graphs” that focus primarily on COVID-19 biology. Here, we use the term “mechanism” to describe one or more cause-and-effect relationships (i.e. a subgraph), “pathways” to refer to a well-established series of interactions resulting in cellular change or a defined product, and “models” for describing a collection of experimental data or known interactions defined in the context of a particular biological process or pathology. As of July 2020, a collection consisting of 10 models representing core knowledge about the pathophysiology of SARS-CoV-2 and its primary target, the lung epithelium, was shared with the public.

With the rapidly increasing generation of data (e.g. transcriptome17, interactome18, and proteome19 data), we are now in the position to challenge and validate these COVID-19 pathophysiology knowledge graphs with experimental data. This is of particular interest as validation of these knowledge graphs bears the potential to identify those disease mechanisms that are highly relevant for targeting in drug repurposing approaches.

The concept of drug repurposing (the secondary use of already developed drugs for therapeutic uses other than those they were designed for) is not new. The major advantage of drug repurposing over conventional drug development is the massive decrease in time required for development as important steps in the drug discovery workflow have already been successfully passed for these compounds20, 21.

Our group and many others have already begun performing assays to screen for experimental compounds and approved drugs to serve as new therapeutics for COVID-19. Dedicated drug repurposing collections, such as the Broad Institute library22 and the comprehensive ReFRAME library23, were used to experimentally screen either against viral proteins as targets for functional inhibition24 or against virally infected cells in phenotypic assays25. In our own work, compounds were assessed for their inhibition of virus-induced cytotoxicity using the human cell line Caco-2 and a SARS-CoV-2 isolate26. A total of 63 compounds with IC50 < 20 µM were identified, of which 90% had not previously been reported as being active against SARS-CoV-2. Of the active compounds, 31 are approved drugs, 23 are in phases 1–3, and 9 are preclinical candidate molecules. The described mechanisms of action for the inhibitors included kinase signaling, PDE activity modulation, and long chain acyl transferase inhibition (e.g. “azole class antifungals”).

The approach presented here integrates experimental results, the output from other informatic pipelines, as well as proprietary and public data to provide a comprehensive overview on the therapeutic efficacy of candidate compounds, the mechanisms targeted by these candidate compounds, and a rational approach to test the drug-mechanism associations for their potential in combination therapy.

Methodology

Generation of the COVID-19 PHARMACOME

Disparate COVID-19 disease maps focus on different aspects of COVID-19 pathophysiology. Based on comparisons of the COVID-19 knowledge graphs, we found that not a single disease map covers all aspects relevant for the understanding of the virus, host interaction and the resulting pathophysiology. Thus, we optimized the representation of integral COVID-19 pathophysiology mechanisms by integrating several public and proprietary COVID-19 knowledge graphs, disease maps, and experimental data (Supplementary Table 1) into one unified knowledge graph, the COVID-19 Supergraph.

To this end, we converted all knowledge graphs and interactomes into OpenBEL27, a language that is both ideally suited to capture and represent “cause-and-effect” relationships in biomedicine, and is fully interoperable with major pathway databases28, 29. In order to ensure that molecular interactions were correctly normalized, individual pipelines were constructed for each model to convert the raw data to the OpenBEL format. For example, the COVID-19 Disease Map contained 16 separate files, each of which represented a specific biological focus of the virus. Each file was parsed individually and the entities and relationships that did not adhere to the OpenBEL grammar were mapped accordingly. Whilst most of the entities and relationships in the source disease maps could be readily translated into OpenBEL, a small number of triples from different source disease maps required a more in-depth transformation. When classic methods of naming objects in triples failed, the recently generated COVID-19 ontology30 as well as other available standard ontologies and vocabularies were used to normalize and reference these entities.

In addition to combining the listed models, we also performed a dedicated curation of the COVID-19 supergraph in order to annotate the mechanisms pertaining to selected targets and the biology around prioritized repurposing candidates. The resulting BEL graphs were quality controlled and subsequently loaded into a dedicated graph database system underlying the Biomedical Knowledge Miner (BiKMi), which allows for comparison and extension of biomedical knowledge graphs (see http://bikmi.covid19-knowledgespace.de).

Once the models were converted to OpenBEL and imported into the database, the resulting nodes from each mechanism-based model were compared (Fig. 1). Even when separated by data origin type, the COVID-19 knowledge graphs had very little overlap (3 shared nodes between all manually curated models and no shared nodes between all models derived from interaction databases), but by unifying the models, our COVID-19 supergraph improves the coverage of essential virus- and host-physiology mechanisms substantially.

Figure 1

Venn diagrams comparing major mechanistic models in the COVID-19 supergraph. Mechanism-based models were divided, and their entities compared within their resulting subgroups. Model abbreviations are defined in Supplementary Table 1. (a) Manual node comparison shows the overlap of entities in the models that are knowledge-based, manually curated relationships that have been directly encoded in OpenBEL. (b) Automated node comparison shows the overlap of entities in models re-encoded into OpenBEL from other formats (e.g. SBML models).
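The node comparison behind these Venn diagrams can be illustrated with plain set operations. The following is a minimal sketch, not the pipeline used in this work: the model names and node labels are hypothetical, and it assumes that each model has already been reduced to a set of normalized node identifiers.

```python
def shared_nodes(models):
    """Return the nodes present in every model (the centre of a Venn diagram)."""
    node_sets = [set(nodes) for nodes in models.values()]
    if not node_sets:
        return set()
    return set.intersection(*node_sets)


def coverage_gain(models):
    """Return (size of the unified node set, size of the largest single model)."""
    if not models:
        return 0, 0
    union = set().union(*models.values())
    largest = max(len(set(nodes)) for nodes in models.values())
    return len(union), largest


# Hypothetical toy models; the node labels are illustrative only.
toy_models = {
    "model_a": {"p(HGNC:ACE2)", "p(HGNC:TMPRSS2)", "p(HGNC:IL6)"},
    "model_b": {"p(HGNC:ACE2)", "p(HGNC:STAT1)"},
    "model_c": {"p(HGNC:ACE2)", "p(HGNC:IL6)", "p(HGNC:TNF)"},
}

print("shared by all models:", shared_nodes(toy_models))
print("union size vs. largest model:", coverage_gain(toy_models))
```

A small intersection combined with a large union is the pattern described above: the individual models barely overlap, so unifying them extends coverage.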

Additionally, by enriching the COVID-19 supergraph with drug-target information linked from highly curated drug-target databases (DrugBank, ChEMBL, PubChem), we created an initial version of the COVID-19 PHARMACOME, a comprehensive drug-target-mechanism graph representing COVID-19 pathophysiology mechanisms that includes both drug targets and their ligands (Fig. 2). In order to maximize its utility, this network includes both experimentally validated drug-target relationships as well as a wide distribution of biological entities and concepts (Supplementary Figure 1). The entire COVID-19 PHARMACOME was manually inspected and re-curated; this graph database is openly accessible to the scientific community at https://graphstore.scai.fraunhofer.de.

Figure 2

The COVID-19 supergraph integrates drug-target information to form the COVID-19 PHARMACOME. (a) An aggregate of 10 constituent COVID-19 computable models covering a wide spectrum of pathophysiological mechanisms associated with SARS-CoV-2 infection was harmonized to generate the mechanism-based COVID-19 supergraph. (b) The COVID-19 supergraph is annotated with drug-target information from a variety of curated sources to generate the COVID-19 PHARMACOME composed of 150,662 nodes (representing proteins, pathologies, and other biological entities/concepts) and 573,929 edges (indicating relationships or interactions between the pair of nodes they connect).

Systematic review and integration of information from phenotypic screening

At the time of writing, six phenotypic cellular screening experiments had been shared via archive servers and journal publications (Supplementary Table 2). Although only a limited number of these manuscripts have been officially accepted and published, we were able to extract their primary findings from the pre-publication archive servers. The significant number of reports on drug repurposing screenings in the COVID-19 context demonstrates how appealing the concept of drug repurposing is as a quick answer to the challenge of a global pandemic. Drug repurposing screenings were all performed with compounds for which a significant amount of information on safety in humans and primary mechanism of action is available. We generated a list of “hits” from cellular screening experiments, while results derived from publications that reported on in-silico screening were ignored. We thereby keep a strict focus on well-characterized, well-understood candidate molecules, since a pivotal advantage of this knowledge base is its use for drug repurposing.

Subgraph annotation

The COVID-19 PHARMACOME contains several subgraphs, three of which correspond to major views on the biology of SARS-CoV-2 as well as the clinical impact of COVID-19:

  • the viral life cycle subgraph focuses on the stages of viral infection, replication, and spreading.

  • the host response subgraph represents essential mechanisms active in host cells infected by the virus.

  • the clinical pathophysiology subgraph illustrates major pathophysiological processes of clinical relevance.

These subgraphs were annotated by identifying nodes within the COVID-19 PHARMACOME that represent specific biological processes or pathologies associated with each subgraph category and traversing out to their first-degree neighbors. For example, a biological process node representing “viral translation” would be classified as a starting node for the viral life cycle subgraph while a node defined as “defense response to virus” would be categorized as belonging to the host response subgraph. Though the viral life cycle and host response subgraphs contain a wide variety of node types, the pathophysiology subgraph is restricted to pathology nodes associated with either the SARS-CoV-2 virus or the COVID-19 pathology.
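The annotation step described above (seed nodes plus their first-degree neighbors) can be sketched as follows. This is an illustrative sketch only: the edge list, the node labels, and the choice to ignore edge direction are assumptions made for the example and do not reflect the graph database queries actually used.

```python
def annotate_subgraph(edges, seed_nodes):
    """Collect the seed nodes plus every node one edge away from a seed.

    Edges are treated as undirected here, which is an assumption of this sketch.
    """
    seeds = set(seed_nodes)
    members = set(seeds)
    kept_edges = []
    for source, target in edges:
        if source in seeds or target in seeds:
            members.update((source, target))
            kept_edges.append((source, target))
    return members, kept_edges


# Hypothetical toy graph; labels are illustrative only.
toy_edges = [
    ("bp(viral translation)", "p(NSP1)"),
    ("p(NSP1)", "p(EIF3)"),
    ("bp(defense response to virus)", "p(IFNB1)"),
]

life_cycle_nodes, life_cycle_edges = annotate_subgraph(
    toy_edges, ["bp(viral translation)"]
)
print(sorted(life_cycle_nodes))
```

Because the traversal stops after one step, the second-degree neighbor in the toy graph is not pulled into the viral life cycle subgraph.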

Mapping of gene expression data onto the COVID-19 PHARMACOME

Two single cell sequencing data sets representing infected and non-infected cells directly derived from human samples31 and cultured human bronchial epithelial cells32 (HBECs) were used to identify the areas of the COVID-19 PHARMACOME responding at gene expression level to SARS-CoV-2 infection. Details of the gene expression data processing and mapping are available in the supplementary material (see section “Gene expression data analysis”).

Pathway enrichment

Associated pathways for subgraphs and significant targets were identified using the Enrichr33 feature of the gseapy Python package34. Briefly, gene symbol lists were assembled from their respective subgraph or dataset and compared against multiple pathway gene set libraries including Reactome, KEGG, and WikiPathways. To account for multiple comparisons, p values were corrected using the Benjamini-Hochberg35 method and results with p values < 0.01 were considered significantly enriched.
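To make the multiple-testing step concrete, the following sketch implements the Benjamini-Hochberg adjustment in plain Python and applies the 0.01 cutoff to the adjusted values. It is a stand-alone illustration, not the gseapy code path; the pathway names and p values are invented for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_pathways(pathway_p, alpha=0.01):
    """Keep pathways whose adjusted p value is below alpha."""
    names = list(pathway_p)
    adjusted = benjamini_hochberg([pathway_p[name] for name in names])
    return {name: q for name, q in zip(names, adjusted) if q < alpha}


# Hypothetical raw p values for four pathways.
toy_p = {"pathway_1": 0.001, "pathway_2": 0.02, "pathway_3": 0.03, "pathway_4": 0.5}
print(significant_pathways(toy_p))
```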

Drug repurposing screening

We performed phenotypic assays to screen for repurposing drugs that inhibit the replication and the cytopathic effects of virus infection. A derivative of the Broad repurposing library was used to incubate Caco-2 cells before infecting them with an isolate of SARS-CoV-2 (FFM-1 isolate, see36). Survival of cells was assessed using a cell viability assay and measured by high-content imaging using the Operetta CLS platform (PerkinElmer). Details of the drug repurposing screening are described in the supplemental material.

Drug combinations assessment with anti-cytopathic effect measured in Caco-2 cells

As described in Ellinger et al.,37 we challenged four combinations of five different compounds with the SARS-CoV-2 virus in four 96-well plates containing two drugs each. Eight drug concentrations were chosen ranging from 20 to 0.01 µM, diluted by a factor of 3 and positioned orthogonally to each other in rows and columns. No pharmacological control was used; the only controls were cells with and without exposure to SARS-CoV-2 at an MOI of 0.01.
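The concentration series and the orthogonal layout can be reproduced arithmetically. The sketch below assumes that the series starts at 20 µM and that each step divides by exactly 3; the function names and the row/column assignment of the two drugs are illustrative.

```python
def dilution_series(top_um=20.0, factor=3.0, steps=8):
    """Return `steps` concentrations in µM, each `factor`-fold lower than the last."""
    return [top_um / factor ** i for i in range(steps)]


def checkerboard(drug_a_conc, drug_b_conc):
    """Build a grid in which rows vary drug A and columns vary drug B."""
    return [[(a, b) for b in drug_b_conc] for a in drug_a_conc]


series = dilution_series()
grid = checkerboard(series, series)

print([round(c, 3) for c in series])
print(len(grid), "x", len(grid[0]), "dose pairs")
```

Seven threefold dilutions of 20 µM give 20/2187 µM, roughly 0.009 µM, which is consistent with the reported lower bound of about 0.01 µM.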

In addition, recently published data from the work of Bobrowski et al.38 were mapped to the COVID-19 PHARMACOME and compared to the results of the combinatorial treatment experiments performed here.

Results

Comparative analysis of the hits from different repurposing screenings

Data from six published drug repurposing screenings were downloaded, and extensive mapping and curation was performed in order to harmonize chemical identifiers. The curated list of drug repurposing “hits” together with an annotation of the assay conditions is available under http://chembl.blogspot.com/2020/05/chembl27-sars-cov-2-release.html.

Initially, we analyzed the overlap between compounds identified in the reported drug repurposing screening experiments. Figure 3a shows no overlap between experiments, which is not surprising, as we are comparing highly specific candidate drug experiments with screenings based on large drug repositioning libraries. However, the overlap is still quite marginal for those screenings where large compound collections (Broad library, ReFRAME library) have been used.

Figure 3

Overlap of compound hits between different drug repurposing screening experiments. (a) Direct comparison of overlapping hits in drug repurposing screenings revealed no overlap between the experiments. These experiments were performed using different cell types (Vero E6 cells and Caco-2 cells). (b) Protein target space overlap between different COVID-19 drug repurposing screenings. Drug targets were identified by confidence level ≥ 8 and single protein targets according to the ChEMBL database. Comparison of experiments indicates over one hundred common protein targets.

Mapping of repurposing hits to target proteins

In order to identify which proteins are targeted by the repurposing hits, and to investigate the extent to which there are overlaps between repurposing experiments at the target/protein level, we mapped all the identified compounds from the drug repurposing experiments to their respective targets. As most drugs bind to more than one target, we increase the likelihood of overlaps between the drug repurposing experiments when we compare them at the protein/target space. Indeed, Fig. 3b shows an overlap of 112 targets between all the drug repurposing experiments, thereby creating a list of potential proteins for therapeutic intervention when the compound targets are considered rather than the compounds themselves.
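Why the overlap grows when moving from compounds to targets can be seen in a small example. The sketch below uses invented screen names, compound names, and target names; it assumes that a compound-to-target mapping (for instance one already filtered by confidence level) is available as a dictionary.

```python
def targets_per_screen(screen_hits, compound_targets):
    """Map each screen to the union of targets of its hit compounds."""
    return {
        screen: set().union(*(compound_targets.get(c, set()) for c in hits))
        for screen, hits in screen_hits.items()
    }


def common_to_all(sets_by_key):
    """Return the elements shared by every set in the mapping."""
    sets = list(sets_by_key.values())
    return set.intersection(*sets) if sets else set()


# Hypothetical data: two screens with no compound in common.
toy_hits = {"screen_1": {"cmpd_A", "cmpd_B"}, "screen_2": {"cmpd_C"}}
toy_targets = {
    "cmpd_A": {"T1", "T2"},
    "cmpd_B": {"T3"},
    "cmpd_C": {"T2", "T4"},
}

print("shared compounds:", common_to_all(toy_hits))
print("shared targets:", common_to_all(targets_per_screen(toy_hits, toy_targets)))
```

The two toy screens share no compound, yet they share one target because different compounds bind the same protein.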

The COVID-19 PHARMACOME associates pathways derived from drug repurposing targets with pathophysiology mechanisms

A non-redundant list of drug repurposing candidate molecules that display activity in phenotypic (cellular) assays was generated and mapped to the COVID-19 PHARMACOME. Figure 4 shows the distribution of repurposing drugs in the COVID-19 cause-and-effect graph, the “responsive part” of the graph that is characterized by changes in gene expression associated with SARS-CoV-2 infection and the overlap between the two subgraphs. This overlap analysis allows for the identification of repurposing drugs targeting mechanisms that are modulated by viral infection.

Figure 4

Identification of suitable targets for combination therapy by comparing subgraphs within the COVID-19 PHARMACOME. Incorporation of gene expression data into the COVID-19 PHARMACOME resulted in a subgraph characterized by the entities (genes/proteins) that respond to viral infection (a). Mapping of the filtered results obtained from drug repurposing screenings (IC50 < 10 µM) to the PHARMACOME resulted in a subgraph enriched for drug repurposing targets (b). The intersection between subgraphs presented in (a, b) is highly enriched for drug repurposing targets directly linked to the viral infection response (c).

A total of 870 mechanisms were identified as being targeted by most of the drug repurposing candidates (see section “Associated pathway identification” in supplementary materials). When compared to the annotated subgraphs in the COVID-19 PHARMACOME, 201 of the 227 associated pathways determined for the viral life cycle subgraph overlapped with those for the drug repurposing targets, while the host response subgraph shared 90 of its 105 pathways.

Mapping of drug repurposing signals to hypervariable regions of the COVID-19 PHARMACOME

One of the key questions arising from the network analysis is whether the repurposing drugs target mechanisms that are specifically activated during viral infection. In order to establish this link, we mapped differential gene expression analyses from two single-cell sequencing studies to our COVID-19 PHARMACOME. An overlay of differential gene expression data (adjusted p value ≤ 0.1 and abs(log fold-change) > 0.25) on the COVID-19 PHARMACOME reveals a distinct pattern characterized by the high responsiveness (expressed by variation of regulation of gene expression) to the viral infection (Fig. 4a).
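The two thresholds stated above translate into a simple filter. The following sketch assumes that the differential expression results are available as a mapping from gene symbol to an (adjusted p value, log fold-change) pair; the gene names and values are placeholders.

```python
def responsive_genes(de_results, max_adj_p=0.1, min_abs_lfc=0.25):
    """Keep genes with adjusted p <= max_adj_p and |log fold-change| > min_abs_lfc."""
    return {
        gene
        for gene, (adj_p, lfc) in de_results.items()
        if adj_p <= max_adj_p and abs(lfc) > min_abs_lfc
    }


# Hypothetical differential expression results: gene -> (adjusted p, log fold-change).
toy_de = {
    "GENE_A": (0.01, 1.20),   # passes both thresholds
    "GENE_B": (0.10, -0.30),  # passes: p is at the inclusive bound
    "GENE_C": (0.05, 0.25),   # fails: fold-change bound is strict
    "GENE_D": (0.20, 2.00),   # fails: p value too high
}

print(sorted(responsive_genes(toy_de)))
```

Note the asymmetry taken from the text: the p value bound is inclusive, whereas the fold-change bound is strict.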

Virus-response mechanisms are targets for repurposing drugs

In the next step, we analyzed which areas of the COVID-19 graph respond to SARS-CoV-2 infection (indicated by significant variance in gene expression) and are targets for repurposing drugs. To this end, we mapped signals from the drug repurposing screenings to the subgraph that showed responsiveness to SARS-CoV-2 infection (Fig. 4b). Figure 4c depicts the resulting subgraph that is characterized by the transcriptional response to SARS-CoV-2 infection and the presence of target proteins of compounds that have been identified in drug repurposing screening experiments.

The COVID-19 PHARMACOME supports rational targeting strategies for COVID-19 combination therapy

We mapped existing combinatorial therapy data to the COVID-19 PHARMACOME in order to evaluate its potential in guiding rational approaches towards combination therapy using repurposing drug candidates. In drug combination therapy, the interaction between compounds can be defined as either additive (the combined effect is the same given proportional doses of the individual drugs), synergistic (the combined effect is larger than the additive effect of each individual drug), or antagonistic (the combined effect is smaller than the additive effect of each individual drug)39, 40. Combinatorial treatment data obtained from the results published by Bobrowski41 and Ellinger et al.42 were mapped to the COVID-19 PHARMACOME. Figure 5 provides an overview of the mapped compounds, their protein targets, and the interaction mechanisms. Analysis of the overlaps with the drug repurposing screening data showed that four of the ten compounds reported in the synergistic treatment approach were represented in our initial non-redundant set of candidate repurposing drugs.
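The three interaction classes defined above are commonly quantified with the Loewe additivity index, which sums each drug's dose in the combination divided by the dose of that drug alone that reaches the same effect. The sketch below shows this standard formulation under stated assumptions: the tolerance band around 1 is an arbitrary choice for illustration, and the cited combination studies may have scored interactions differently.

```python
def loewe_index(dose_a, dose_b, alone_a, alone_b):
    """Loewe combination index for one effect level.

    dose_a, dose_b: doses used together to reach the effect.
    alone_a, alone_b: doses of each drug alone that reach the same effect.
    """
    if alone_a <= 0 or alone_b <= 0:
        raise ValueError("single-agent doses must be positive")
    return dose_a / alone_a + dose_b / alone_b


def classify_interaction(index, tolerance=0.1):
    """Label an index; the tolerance band is an assumption of this sketch."""
    if index < 1 - tolerance:
        return "synergistic"
    if index > 1 + tolerance:
        return "antagonistic"
    return "additive"


# Hypothetical doses in µM.
index = loewe_index(dose_a=1.0, dose_b=2.0, alone_a=4.0, alone_b=4.0)
print(index, classify_interaction(index))
```

An index of 1 corresponds to additivity, values below 1 mean that less drug was needed than expected (synergy), and values above 1 mean that more was needed (antagonism).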

Figure 5

Visualization of drug repurposing candidates (and their targets) used in combination treatment experiments. The subgraph depicts the drug repurposing candidate molecules in relation to each other and their targets. Shortest path lengths between drug combinations were calculated from this subgraph and are available in the supplementary material (Supplementary Table 5).

Based on the association between repurposing drug candidates and the areas of the COVID-19 PHARMACOME that respond to SARS-CoV-2 infection (Fig. 4), we hypothesized that the number of edges between a pair of drug nodes may be linked to the effectiveness of the drug combination (Supplementary Figure 2). In order to evaluate whether the determined outcome of a combination of drugs correlated with the distance between said drug nodes, we compared distances for combinations of drugs within the COVID-19 PHARMACOME for which the effect was known (Supplementary Tables 3 and 5). Of the 47 drug combinations we were able to check within the COVID-19 PHARMACOME, we found that the pairs of drugs known to have a synergistic effect in the treatment of SARS-CoV-2 had an average shortest path length of 2.43, while antagonistic combinations were found to be farther apart with an average shortest path length of 4.0 (Supplementary Table 7). Based on our calculations, we formulated three categories for predicting the outcome of new drug combinations on infection using the shortest path lengths between them within the COVID-19 PHARMACOME. A shortest path length of 2 was taken to indicate a synergistic relationship between the compounds, a length of 3 was considered inconclusive because our calculations did not justify a specific outcome, and a length of 4 or more was taken to predict an antagonistic relationship.
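The distance-based rule can be sketched with a breadth-first search over an edge list followed by a lookup of the three categories. This is a simplified illustration: the toy graph and node names are hypothetical, edges are treated as undirected (an assumption), and path lengths outside the categories named in the text are returned as "undefined" rather than guessed.

```python
from collections import deque


def shortest_path_length(edges, start, goal):
    """Number of edges on the shortest path, or None if the nodes are not connected."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    if start not in adjacency or goal not in adjacency:
        return None
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def predict_interaction(path_length):
    """Apply the three categories: 2 -> synergistic, 3 -> inconclusive, >= 4 -> antagonistic."""
    if path_length is None or path_length < 2:
        return "undefined"
    if path_length == 2:
        return "synergistic"
    if path_length == 3:
        return "inconclusive"
    return "antagonistic"


# Hypothetical drug-target-mechanism edges.
toy_edges = [
    ("drug_X", "target_1"),
    ("drug_Y", "target_1"),
    ("target_1", "protein_2"),
    ("protein_2", "target_3"),
    ("drug_Z", "target_3"),
]

for pair in [("drug_X", "drug_Y"), ("drug_X", "drug_Z")]:
    length = shortest_path_length(toy_edges, *pair)
    print(pair, length, predict_interaction(length))
```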

In order to test our ability to predict the outcome of novel drug combinations, we selected five compounds: Remdesivir (a virus replicase inhibitor), Nelfinavir (a virus protease inhibitor), Raloxifene (a selective estrogen receptor modulator), Thioguanosine (a chemotherapy compound interfering with cell growth), and Anisomycin (a pleiotropic compound with several pharmacological activities, including inhibition of protein synthesis and nucleotide synthesis). These compounds were used in four different combinations (Remdesivir/Thioguanosine, Remdesivir/Raloxifene, Remdesivir/Anisomycin and Nelfinavir/Raloxifene) to test the potency of these drug pairings in phenotypic, cellular assays. Figure 6 shows the results of these combinatorial treatments on the virus-induced cytopathic effect in Caco-2 cells.

Figure 6

Dose–response curves (DRC) depicting viral inhibition of SARS-CoV-2 by select drug combinations. (a) A threshold effect can be seen with the Remdesivir/Anisomycin combination when Anisomycin reaches 20 µM, well beyond Anisomycin’s IC50 alone. Remdesivir activity does not appear to be affected by Anisomycin, while Remdesivir seems to be equally affected (de-potentiated) by low to high concentrations of Raloxifene. (b) Viral inhibition for Remdesivir/Thioguanosine can be seen only at lower Thioguanosine concentrations; at higher concentrations, the clear curve shift of Remdesivir at lower concentration (effect beyond Loewe’s additivity formula) could not be appreciated. (c) Raloxifene had an antagonistic effect on Remdesivir’s viral replication inhibition activity. (d) A clear shift in Nelfinavir’s DRC can be observed when combined with Raloxifene, but the data also suggest a threshold effect when Raloxifene concentrations are higher than 2.2 µM.

Our results indicate that compound combinations acting on different viral mechanisms, such as Remdesivir and Thioguanosine (Fig. 6b) or Nelfinavir and Raloxifene (Fig. 6d), showed synergy, while compounds acting on host mechanisms, for instance Anisomycin or Raloxifene, when combined with Remdesivir (Fig. 6a, c, respectively), resulted in neither synergistic nor additive effects. Interestingly, our experiments revealed that the HIV protease inhibitor Nelfinavir, which already appeared to be active against viral post-entry fusion steps of both SARS-CoV43 and SARS-CoV-244, displayed synergistic effects when combined with high concentrations of Raloxifene. This result agrees with our predictions generated using the COVID-19 PHARMACOME, in which the drug combination with the shortest distance, Raloxifene and Nelfinavir (Supplementary Table 5), would have a synergistic effect on SARS-CoV-2 pathology.

Discussion

By combining a significant number of knowledge graphs which represent various aspects of COVID-19 pathophysiology and drug-target information we were able to generate the COVID-19 PHARMACOME, a unique resource that covers a wide spectrum of cause-and-effect knowledge about SARS-CoV-2 and its interactions with the human host. Based on a systematic review of the results derived from published drug repurposing screening experiments, as well as our own drug repurposing screening results, we were able to identify mechanisms targeted by a variety of compounds showing virus inhibition in phenotypic, cellular assays. With the COVID-19 PHARMACOME, we are now able to link repurposing drugs, their targets and the mechanisms modulated by said drugs within one computable data structure, thereby enabling us to target—in a combinatorial treatment approach—different, independent mechanisms. By challenging the COVID-19 PHARMACOME with gene expression data, we have identified subgraphs that are responsive (at gene expression level) to virus infection. Network analysis along with the overview on previous repurposing experiments provided us with the insights needed to select the optimal repurposing drug candidates for combination therapy. Experimental verification showed that this systematic approach is valid; we were able to identify two drug-target-mechanism combinations that demonstrated synergistic action of the repurposed drugs targeting different mechanisms in combinatorial treatments.

We are fully aware of the fact that the COVID-19 PHARMACOME combines experimental results generated in different assay conditions. In the course of our work, we accumulated evidence that assay responses recorded using Vero E6 cells in comparison to Caco-2 cells may only partially overlap. Comparative analysis of the results of both assay systems to virus infection by means of transcriptome-wide gene expression analysis is one of the experiments we plan to perform next. However, for the identification of meaningful combinations of repurposing drugs, the current model-driven information fusion approach was shown to work well despite the putative differences between drug repurposing screening assays.

Given the urgent need for treatments that work in an acute infection situation, our approach described here paves the way for systematic and rational approaches towards combination therapy of SARS-CoV-2 infections. We want to encourage all our colleagues to make use of the COVID-19 PHARMACOME, improve it, and add useful information about pharmacological findings (e.g. from candidate repurposing drug combination screenings). In addition to vaccination and antibody therapy, (combination) treatment with small molecules remains one of the key therapeutic options for combatting COVID-19. The COVID-19 PHARMACOME will therefore be continuously improved and expanded to serve integrative approaches in anti-SARS-CoV-2 drug discovery and development.