Responding quickly to unknown pathogens is crucial to stop uncontrolled spread of diseases that lead to epidemics, such as the novel coronavirus, and to keep protective measures at a level that causes as little social and economic harm as possible. This can be achieved through computational approaches that significantly speed up drug discovery. A powerful approach is to restrict the search to existing drugs through drug repurposing, which can vastly accelerate the usually long approval process. In this Review, we examine a representative set of currently used computational approaches to identify repurposable drugs for COVID-19, as well as their underlying data resources. Furthermore, we compare drug candidates predicted by computational methods to drugs being assessed by clinical trials. Finally, we discuss lessons learned from the reviewed research efforts, including how to successfully connect computational approaches with experimental studies, and propose a unified drug repurposing strategy for better preparedness in the case of future outbreaks.
The novel SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) pathogen has infected around 60 million people and caused more than a million deaths worldwide (https://covid19.who.int/; as of November 2020). As a result, there is a need to find treatments that can be applied immediately to reduce mortality or morbidity.
Repurposing existing drugs is a rapid and effective way to provide such treatments by identifying new uses for drugs that have well-established pharmacological and safety profiles1. Many drugs used to treat different diseases have already been successfully repurposed and approved for new indications2. While repurposing can be conducted at any point in drug development, its greatest potential lies in drugs that are already approved3. In the case of the COVID-19 pandemic, it is a fast and cost-efficient approach to identify novel treatments4.
Recent studies have increasingly employed computational methods to systematically predict new drug targets or drug repurposing candidates. In contrast to experimental high-throughput screening, in silico approaches are faster, lower-cost, and can serve as an initial filtering step for evaluating thousands of compounds. Thus, they are useful for prioritizing drugs that warrant further evaluation and experimental validation. This requires the application of suitable algorithmic approaches to identify mechanisms relevant or specific to the disease4.
This Review discusses current in silico drug repurposing efforts for COVID-19, followed by a discussion of the lessons learned from different perspectives (from data resources to the quality of predictions) and a proposed unified strategy to improve the response in potential future outbreaks. The covered studies employed standard drug repurposing workflows and data-driven algorithms.
As new studies are published almost every day, it is not possible to provide a broad and comprehensive overview of all repurposing studies. Hence, this Review focuses on the computational methods for drug repurposing, their application, availability and feasibility in a selection of studies (peer-reviewed and preprint) that were selected to cover a wide variety of different methods. It is worth noting that most of these studies are not considered successful clinically. Nevertheless, it is important to properly evaluate and improve the predictive power of in silico approaches that are capable of utilizing information from existing drugs as well as host and virus biology, even with limited availability of data on the novel emerging pathogen. This promotes a rapid and practical response to infection and therefore improves success in future pandemics, particularly in tackling the rise in infection cases at the early stages of the pandemic or ahead of vaccine development.
Besides experimental datasets, the rapid availability of resources that integrate different data types is crucial in a pandemic. Sharing data accelerates research, as computational methods depend on high-quality datasets, and experimental labs do not need to collect the information on their own. The large number of resources used in COVID-19 drug repurposing studies have shown that data can be quickly generated and gathered through strong community efforts. This section presents a selection of data resources used in the reviewed studies to describe the resource types that accelerated computational drug repurposing approaches: most of them are general data resources that were already established before the pandemic but that have been extended with COVID-19 or SARS-CoV-2-specific data. The resources used in the reviewed studies are listed in Supplementary Table 1. A list of COVID-19 specific data resources that were not used in the reviewed studies but may become relevant in the future is given in Supplementary Table 2.
Molecular data resources
All molecular data used in the reviewed publications were extracted from already established, general data resources that were quickly extended with SARS-CoV-2-specific data. Resources such as GenBank5, the GISAID initiative6, or UniProt7 provide genomic/proteomic sequence information about hosts and SARS-CoV-2. Structural resources collecting information about proteins, such as the Protein Data Bank (PDB)8, were extended by various SARS-CoV-2-specific proteins. Finally, transcriptome resources that collect gene expression data were used in several COVID-19 drug repurposing approaches. For instance, the Genotype-Tissue Expression (GTEx)9 program offers insights into tissue-specific gene expression. Expression in lung tissues is of high interest in COVID-19 drug repurposing research and was often integrated in computational models or studies. Other resources, such as the LINCS L1000 database10, profile gene expression changes under certain drug treatment conditions and were used to identify drugs with reverse expression profiles to the samples infected with SARS-CoV-2.
Network and interaction resources
Protein–protein interaction (PPI) networks enable visualization and analyses of the interactions between either host or virus proteins and other host proteins. Furthermore, PPI networks allow for particular adaptation and search strategies (for example, edge filtering) and can be connected to drug resources. Gordon et al.11 identified 332 high-confidence virus–host interactions between SARS-CoV-2 and human proteins. This dataset was the only newly created and exclusively SARS-CoV-2-related resource used in the publications reviewed here. VirHostNet12,13, a virus–host PPI resource that already existed before the 2019/2020 SARS-CoV-2 outbreak, was expanded with 167 new SARS-CoV-2 interactions. In contrast to virus–host PPIs, host PPIs are not virus specific. All host PPI resources that were used in the reviewed studies were already available before the pandemic but have since been widely used in COVID-19 drug repurposing approaches14,15. Besides molecular networks, knowledge graphs, such as the Global Network of Biomedical Relationships (GNBR)16, have demonstrated their utility for drug repurposing. These networks comprise various types of biological relationships assembled from literature and were integrated into COVID-19 drug repurposing approaches17.
Drug and trial resources
Drug databases that already existed before the pandemic and that are continuously extended with newly developed drugs were used to connect the results of different approaches to potential drugs. A widely used drug database is DrugBank18, with more than 13,000 drug entries of approved and in-trial drugs, including drug targets. On the other hand, ChEMBL19 and ZINC1520 contain millions of compounds that exhibit drug-like properties.
Drug repurposing approaches also benefited from trial databases as they can be used to validate whether the predicted drugs are already in trial or have not yet been evaluated. Examples of such resources are the EU Clinical Trials Register (https://www.clinicaltrialsregister.eu/) and ClinicalTrials.gov (https://clinicaltrials.gov/). The latter contains more than 350,000 research studies from 219 countries.
Drug repurposing studies
Various clinical, experimental and computational drug repurposing efforts have been rapidly mobilized to prioritize compounds and identify promising drug candidates for the SARS-CoV-2 pandemic. In this section, we examine a selection of studies representing the different computational approaches to identify potential new targets and repurposable drugs for COVID-19.
Virus-targeting approaches mostly rely on structure-based drug screening methods, which take the three-dimensional structures of target proteins to predict affinities or interaction energies of known chemical compounds to the proteins (Fig. 1). These methods were mainly used to identify candidate drugs that target viral proteins, so we refer to them as virus-targeting approaches, although they can also be applied to host proteins. Two main methodological workflows were applied, namely, structure-based21 and deep-learning (DL)-based drug screening. Here, we describe these methods and compare 23 COVID-19 drug repurposing studies22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44.
Structure-based drug screening
The first step for structure-based screening is the selection of the drug library and the target protein. For COVID-19, the intuitive candidates for targeting virus proteins were antivirals. Thus, many studies limited their search to these. The number of screened drugs ranged from 3 (ref. 37) to 123 antiviral drugs33. Broader studies, such as that by Chen et al.26, combined compounds from the KEGG (Kyoto Encyclopedia of Genes and Genomes) and DrugBank databases to screen 7,173 drugs.
The other crucial step is the selection of the target protein and its corresponding three-dimensional structure (experimental or predicted). Wu et al.40 performed screening on 19 encoded proteins of the virus. By comparison, most other studies focused on the 3CLpro, envelope (E), spike, RNA polymerase and methyltransferase proteins.
Virtual screening of the drug libraries utilized established software, such as Autodock45 and Glide46. Candidate drugs were selected using respective scoring methods, followed by validations with molecular dynamics simulations30,37.
Most drugs were predicted for 3CLpro (Supplementary Table 3), which was also the focus of most studies (17 studies), followed by RdRp and PLpro. For 3CLpro, the predictions ranged from 2 (ref. 29) to 27 (ref. 40) drugs per study. The 5 most frequently predicted drugs were ritonavir (8 studies), lopinavir (6 studies), nelfinavir, remdesivir and saquinavir (5 studies each). However, 99 of the candidate drugs were predicted in only 1 study, showing a high variability in the resulting candidate sets. Interestingly, the studies that screened full databases also predicted antiviral drugs as top scorers (Supplementary Table 4). Of the 23 studies, 10 have not yet been peer-reviewed, which we discuss in the section on ‘A unified drug repurposing strategy’.
DL-based repurposing strategies
DL models can predict binding affinities or docking scores and have shown advantages over conventional docking protocols. While standard docking protocols are limited to millions of compounds47, DL approaches can analyze billions of chemical compounds. This allows them to be applied to whole databases, which increases the diversity of the tested compounds and the likelihood of finding unconventional compounds47. Furthermore, they are capable of processing more (physico-)chemical features48 and can find features related to a non-favorable docking47. However, most of these methods require datasets for training, which often come from real docking simulations; thus, the performance of many DL-based approaches still relies on the accuracy of the docking software used for training.
Ton et al.42 developed DeepDocking47, which utilizes quantitative structure–activity relationship models trained to predict docking scores of compounds targeting the SARS-CoV-2 3CLpro protein. It requires fewer docking pipelines, since it performs docking only on subsets of compounds and can produce a reduced list of compounds, which is also enriched in potential top hits.
Nguyen et al.49 developed the method MathDL, which utilizes low-dimensional mathematical representations of the drug–target protein complex structures, which are then fed to DL algorithms to predict binding energies of drug–protein complexes. For SARS-CoV-2, the authors used experimental binding affinity data from SARS-CoV ligand–3CLpro complexes from PDBbind and SARS-CoV protease inhibitors as training data to predict binding energies of DrugBank compounds for SARS-CoV-2 3CLpro (ref. 50). The method therefore does not depend on docking software.
Beck et al.44 developed a DL-based drug-target interaction prediction model, named Molecule Transformer-Drug Target Interaction. It utilizes simplified molecular-input line-entry system (SMILES)51 representations for drugs and protein sequences as input for training and predicts affinities. For SARS-CoV-2, the model was trained on commercially available antiviral drugs and viral target proteins. Antiviral drugs already used against SARS-CoV-2 were found among the candidate drugs identified.
Host-targeting approaches involve identifying potential drugs that interfere with host mechanisms that contribute to viral pathogenesis, which also makes them less prone to drug resistance52,53. In addition, SARS-CoV-2 infections can trigger a hyper-reactive immune response characterized by the excessive release of pro-inflammatory cytokines and chemokines54. Thus, drugs that modulate the host immune response can benefit critically ill patients with COVID-19 by targeting specific dysregulated pathways54,55,56.
Signature-based approaches primarily utilize transcriptome datasets from samples infected with SARS-CoV-2 or closely related human coronaviruses to identify candidate drugs through connectivity mapping (Fig. 2), a well-established approach that relies on finding drug-induced expression signatures exhibiting reverse profiles to a disease signature57,58. Several studies adopted this as a primary method for identifying new therapeutics for COVID-19. Loganathan et al.59 performed differential expression analysis of virus-infected cells and extracted consistently dysregulated genes in infected conditions. They were used to query the Connectivity Map database58 for drug perturbation profiles exhibiting anti-correlated expression signatures. A modified approach was implemented by Jia et al.60, wherein expression data from infected and healthy individuals were used as input to a pathway-guided drug repurposing framework. They identified disease co-expression clusters and performed enrichment analyses prior to reverse signature matching60.
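The reverse-signature principle can be illustrated with a minimal sketch. It assumes that the disease and drug signatures are available as per-gene log fold changes and scores each drug by the Pearson correlation with the disease signature; the Connectivity Map itself uses a rank-based enrichment score, so the correlation is a simplified stand-in. All gene names, drug names and values are hypothetical toy data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def rank_reversing_drugs(disease_sig, drug_sigs):
    """Sort drugs from most anti-correlated to most correlated with the disease."""
    genes = sorted(disease_sig)
    disease = [disease_sig[g] for g in genes]
    scores = {
        drug: pearson(disease, [sig[g] for g in genes])
        for drug, sig in drug_sigs.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])

# Hypothetical log fold changes (infected versus control, treated versus untreated).
disease_signature = {"G1": 2.0, "G2": 1.5, "G3": -1.0, "G4": -2.0}
drug_signatures = {
    "drug_A": {"G1": -1.8, "G2": -1.2, "G3": 0.9, "G4": 2.1},   # reverses the signature
    "drug_B": {"G1": 1.9, "G2": 1.4, "G3": -1.1, "G4": -1.7},   # mimics the signature
    "drug_C": {"G1": 0.5, "G2": -0.5, "G3": 0.5, "G4": -0.5},   # largely unrelated
}

ranked = rank_reversing_drugs(disease_signature, drug_signatures)
for drug, score in ranked:
    print(f"{drug}: {score:+.2f}")
```

Drugs at the top of the list (strongly negative scores) would be the candidates passed on to further filtering.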
The general network-based approach applied in drug repurposing studies on COVID-19 integrates multiple data sources, including virus–host interactions, PPIs, co-expression networks, functional associations or drug–target interactions (Fig. 2). Network-based algorithms or topology measures are applied to the assembled networks to identify relevant host protein targets or regions of the host interactome that can be targeted.
Multiple studies implement random-walk-based algorithms as the primary method to identify new putative drug targets. Law et al.61 implemented several algorithms on a virus–host interactome to identify additional SARS-CoV-2 interactors. Similarly, but focusing on a specific context, Messina et al.63 explored the pathogenic mechanisms triggered by the coronavirus spike protein, which has been established as the primary mediator of viral entry into host cells62, using data from three closely related coronaviruses. They implemented a random walk algorithm on assembled molecular networks using the spike protein as seed to identify relevant targets for COVID-1963. In addition, CoVex64 implemented TrustRank65, a variant of the PageRank66 algorithm, to propagate scores from user-defined seeds to the other host proteins and rank host drug targets.
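As an illustration of seed-based score propagation, the following is a minimal sketch of a random walk with restart on an undirected toy network. It is not the implementation of any of the cited tools; the network, the node names and the restart probability of 0.3 are assumptions chosen for the example, and every node is assumed to have at least one neighbor.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that returns to the seeds.

    adj maps each node to the list of its neighbors (undirected network).
    """
    nodes = sorted(adj)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        updated = {n: restart * start[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for u in nodes:
            share = (1.0 - restart) * prob[u] / len(adj[u])
            for v in adj[u]:
                updated[v] += share
        change = sum(abs(updated[n] - prob[n]) for n in nodes)
        prob = updated
        if change < tol:
            break
    return prob

# Hypothetical network: seed S (for example, a viral protein) and host proteins A to D.
network = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

scores = random_walk_with_restart(network, seeds={"S"})
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

Host proteins close to the seed accumulate higher scores than distant ones, which is the basis for ranking candidate targets.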
Network proximity relies on the principle that a drug can be effective if it targets proteins within the neighborhood of disease-associated proteins in the interactome67. Zhou et al.68 utilized this concept to compute the network proximity measure between drug targets and coronavirus-associated proteins in the human interactome. They also used the ‘complementary exposure’ pattern, which is based on the shortest distance between targets of two drugs predicted by network proximity, to identify potential drug combinations to treat COVID-19 patients68.
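A minimal sketch of a proximity computation is given below. It assumes the 'closest' distance, that is, the average over drug targets of the shortest path length to the nearest disease-associated protein, on an unweighted and connected toy interactome. Proximity scores in the literature are usually further normalized against randomly sampled protein sets, which is omitted here; all node and drug names are hypothetical.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closest_distance(adj, disease_proteins, drug_targets):
    """Mean distance from each drug target to its nearest disease protein."""
    total = 0
    for target in drug_targets:
        dist = shortest_path_lengths(adj, target)
        # Assumes at least one disease protein is reachable from the target.
        total += min(dist[p] for p in disease_proteins if p in dist)
    return total / len(drug_targets)

# Hypothetical interactome: a chain P1-P2-P3-P4-P5 with a branch P3-P6.
interactome = {
    "P1": ["P2"],
    "P2": ["P1", "P3"],
    "P3": ["P2", "P4", "P6"],
    "P4": ["P3", "P5"],
    "P5": ["P4"],
    "P6": ["P3"],
}
disease_module = {"P1", "P2"}

near = closest_distance(interactome, disease_module, {"P2", "P3"})
far = closest_distance(interactome, disease_module, {"P5", "P6"})
print(near, far)
```

Under this measure, the first hypothetical drug targets the neighborhood of the disease module and the second does not.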
Several studies combined multiple network-based strategies to predict drug candidates. Gysi et al.69 characterized and extracted a COVID-19 disease module using experimentally determined SARS-CoV-2 interactors. They performed network-based analyses accounting for tissue specificity and potential disease comorbidities. They employed a multi-modal approach to the virus–host interactome integrating network proximity, diffusion state distance and graph convolutional networks (GCNs) to identify drugs that can perturb the activity of host proteins associated with the COVID-19 disease module. The final drug list was obtained by rank aggregation from the different pipelines69.
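Rank aggregation can be sketched with a simple Borda count, in which each pipeline awards points according to the position of a drug in its ranked list. This is an illustrative stand-in and not necessarily the aggregation scheme used in the cited study; the pipeline names and drug lists are hypothetical.

```python
def borda_aggregate(rankings):
    """Combine ranked lists (best first) into one consensus ranking.

    A drug at position i in a list earns n - i points, where n is the number
    of distinct drugs overall; drugs missing from a list earn no points there.
    """
    all_drugs = set().union(*rankings)
    n = len(all_drugs)
    points = dict.fromkeys(all_drugs, 0)
    for ranking in rankings:
        for position, drug in enumerate(ranking):
            points[drug] += n - position
    # Ties are broken alphabetically to keep the output deterministic.
    return sorted(all_drugs, key=lambda drug: (-points[drug], drug))

# Hypothetical ranked outputs of three pipelines.
proximity_ranking = ["drug_A", "drug_B", "drug_C"]
diffusion_ranking = ["drug_B", "drug_A", "drug_D"]
gcn_ranking = ["drug_B", "drug_C", "drug_A"]

consensus = borda_aggregate([proximity_ranking, diffusion_ranking, gcn_ranking])
print(consensus)
```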
CoVex64 is a web platform for exploring SARS-CoV and SARS-CoV-2 virus–host–drug interactomes64. Users can predict drug targets and drug candidates using several graph analysis methods that allow custom seed proteins as input. For instance, KeyPathwayMiner70 is a network enrichment tool that identifies condition-specific subnetworks by extracting a maximally connected subnetwork from the host interactome starting from the seeds. CoVex also implements a weighted multi-Steiner tree method that aggregates several non-unique approximations of Steiner trees, which are subnetworks of minimum cost connecting the set of seeds, into a single subnetwork.
Other studies additionally utilize machine learning to predict drug candidates against SARS-CoV-2. Belyaeva et al.71 implemented a hybrid approach between signature matching and network-based methods. Using autoencoders, they learned feature embeddings for drugs using drug-induced expression profiles to identify drugs exhibiting reverse profiles to the SARS-CoV-2 infection signature. Steiner tree and causal network discovery algorithms were then used to extract the mechanisms mediated by both SARS-CoV-2 and aging71. Ge et al.72 constructed a virus-related knowledge graph and employed a GCN algorithm. The list of drug candidates was further filtered for existing evidence of antiviral activities through text mining72. Similarly, Zeng et al.17 assembled a large-scale knowledge graph derived from PubMed articles. A GCN model was then applied to learn low-dimensional embeddings of the nodes and edges17.
In the following, we examine the quality and potential of the reviewed data resources and computational methods in order to improve the response in future pandemics.
The availability of molecular datasets is a precondition to develop drug repurposing methods quickly. Besides that, network-based resources were a large driver in drug repurposing. However, a large portion of the publications are based on only a few primary resources, which carries the risk of bias or measurement errors. In addition, the only type of molecular interaction network used was PPI. Still, high-confidence PPIs are needed since, for instance, none of the approaches included structure data. In the future, other network types, such as gene regulatory networks, should be considered. Other data resources, such as off-label data for drugs, should also be integrated in drug repurposing studies.
Finally, existing drug and trial resources were widely used for developing the drug repurposing pipelines. However, we observed no standardization in trial resources, making it hard to analyze trials for certain drugs due to different names, different spellings, or typing errors. Standardization is usually implemented for drug resources (for example, DrugBank), but some drugs undergoing trials could not be found in the databases. Keeping the resources up to date and interconnected should be a focus and will enhance accessibility.
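The naming problem can be made concrete with a small sketch that normalizes drug names before intersecting a predicted list with trial entries. The normalization rule, the synonym table and all example entries are assumptions for illustration; the sketch handles differences in case, spacing, punctuation and known abbreviations, but not typing errors, which would require fuzzy matching or curated identifiers.

```python
import re

def normalize(name):
    """Lower-case a drug name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def overlap_with_trials(predicted, trial_entries, synonyms=None):
    """Return canonical names of predicted drugs that also appear in trials.

    synonyms maps a normalized alias to a normalized canonical name.
    """
    synonyms = synonyms or {}

    def canonical(name):
        key = normalize(name)
        return synonyms.get(key, key)

    trial_keys = {canonical(entry) for entry in trial_entries}
    return sorted({canonical(drug) for drug in predicted} & trial_keys)

# Hypothetical inputs showing inconsistent spelling across resources.
predicted_drugs = ["Ritonavir", "Hydroxychloroquine", "Methotrexate"]
trial_interventions = ["ritonavir ", "Hydroxy-chloroquine", "LOPINAVIR", "HCQ"]
alias_table = {"hcq": "hydroxychloroquine"}  # illustrative abbreviation

shared = overlap_with_trials(predicted_drugs, trial_interventions, alias_table)
print(shared)
```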
Assessing the quality of predictions is challenging, since many studies are not peer-reviewed, do not perform experimental evaluation, or rely on clinical trial databases. We examined the quality of predictions by determining the overlap between the final candidate drug lists from the individual studies and the drugs undergoing clinical trials from ClinicalTrials.gov (https://clinicaltrials.gov/) and Biorender (https://biorender.com/covid-vaccine-tracker) databases. In addition, we provide supplementary in vitro screening data, such as IC50 values for viral targets and inhibition indices from cell culture studies for SARS-CoV-2 (Supplementary Data 1). Our effort to compile these data shows that a substantial number of predictions have not been experimentally tested.
Evaluating virus-targeting approaches
We identified 53 drugs predicted with docking simulations that are undergoing current trials (Supplementary Table 5). Wu et al.40 identified most of the drugs (36 drugs); however, these drugs were predicted for multiple viral proteins (for example, chlorhexidine for 11 and methotrexate for 6 different viral proteins). This indicates that their approach did not yield specific and feasible candidates. After excluding this study, the remaining drugs were only predicted for one specific protein each, except for chloroquine (3CLpro and PLpro) and remdesivir (3CLpro and RdRp). The top five drugs in clinical trials, which were predicted by docking simulations using the 3CLpro main protease, were predicted by 14.3% (darunavir), 19.0% (remdesivir), and 23.8% (lopinavir, nelfinavir, ritonavir) of the total number of included docking studies (Supplementary Table 6), showing that for each drug, the majority of studies were not able to predict them. Similar drugs were identified by the DL approach of Beck et al.44, who identified ritonavir, lopinavir and remdesivir, which are being tested in multiple clinical trials. However, these antiviral drugs have not yet shown well-defined results in patients. For ritonavir/lopinavir, only four trials are completed73,74,75,76 and preliminary results suggest no difference in the outcome after treatment77,78,79. Further investigation is required80. For remdesivir, some trials have been completed and the preliminary results in patients81,82,83 and human cell lines84 showed that it could be effective in treating SARS-CoV-2 infection.
Antiviral drugs are always the top hits among a large selection of drugs from databases, indicating high accuracy of the methods. These drugs are good candidates for experimental screening or clinical trials, independently of how reliable the computational predictions are. More interesting candidates are the additional drugs identified by these approaches; however, little experimental validation is available for these drugs and the majority of them do not enter clinical trials. A similar situation is observed in the emerging field of DL approaches, where most studies focused on demonstrating the accuracy of their predictions and developing benchmarking datasets85,86. DL and docking simulation-based approaches are promising tools to identify repurposable drugs given their capacity to deliver results in a short time. While a standard workflow is already established for docking simulations, DL-based approaches might robustly deliver testable candidate drugs. However, docking studies in particular were rarely peer reviewed, found very different candidate sets and partially used different scores for evaluation and ranking. This makes it necessary to validate these results by systematic comparisons of experiments.
Evaluating host-targeting approaches
Host-targeting approaches typically involve integration and analysis of multiple omics types and employ data-driven network-based methods; thus, a major limitation is the lack of gold-standard datasets and the scarcity of data from the MERS-CoV (Middle East respiratory syndrome coronavirus) and SARS-CoV outbreaks. Prior to the availability of sufficient SARS-CoV-2-specific data, earlier studies utilized preliminary data or augmented the analyses using data from closely related viruses. While the quality of the predictions is highly data-dependent, continued generation of SARS-CoV-2-specific omics data and pending results on clinical studies are expected to improve the predictions. Clinical expert knowledge remains crucial for filtering the drug predictions based on criteria such as toxicity and pharmacological properties. However, the efficacy of these candidate drugs in trial remains to be established and firm conclusions cannot be made because of the limited data availability.
The degree of overlap with drugs in clinical trials was generally low (Supplementary Tables 7 and 8), but more than half of the drugs (26 out of 41) predicted by an ensemble method primarily based on knowledge graphs17 are also undergoing clinical trials. While it should be noted that the drugs registered for clinical trials were also used as their validation set at the time of writing, more of their predicted drugs were registered for clinical trials later on. We also noted several drugs that were predicted by both signature-based and network-based approaches and thus warranted further examination (Supplementary Table 9). Ribavirin was predicted by four out of six studies17,60,69,71, thereby providing a mechanistic basis for its predicted efficacy. Methotrexate, which is indicated for rheumatoid arthritis, was also predicted by three studies17,68,69.
It is worth noting that several predicted compounds are currently used to treat critically ill COVID-19 patients. An example is dexamethasone (predicted by one signature-based60 and two network-based studies17,69), which was supported by the RECOVERY trial87. Hydrocortisone (predicted by three studies17,68,69) has also demonstrated efficacy for critically ill patients88. Dexamethasone and hydrocortisone are corticosteroids that act by modulating an overactive immune response, which is typically observed in severely ill COVID-19 patients.
Notably, drugs reaching advanced phases in clinical trials were not selected based on in silico predictions, but were repurposed based on clinical experience with the previous SARS or MERS outbreaks89 and selected based on known effects in alleviating disease symptoms. Furthermore, the predictions were not followed up by experimental validation in the majority of the studies reviewed. This translational gap between computational efforts for drug repurposing and clinical application is a major and widely recognized bottleneck in drug repurposing and medicine in general. Results from systematic validation efforts will also be important for identifying the algorithms and datasets that are specifically suitable for drug repurposing in the COVID-19 context. Given the urgency of identifying effective therapies in a pandemic, close collaboration between clinicians, experimental biologists and computational biologists is expected to address this gap.
A unified drug repurposing strategy
Although overlaps between computationally predicted drug repurposing and clinical trials exist, there are no indications that clinical trials were conducted based on computational predictions, despite their promising potential. For future pandemics, computational tools should be able to deliver promising sets of candidates, which could then be validated in trials or screenings. Therefore, a unified strategy is necessary. In the following, we identify important issues and discuss potential solutions to make computational drug repurposing more effective.
Availability of standardized data
Newly developed methods often rely on the same data types (Fig. 3a). The fast generation of different kinds of data in future disease outbreaks is a key initial step. Notable examples are the interaction data from Gordon et al.11 and the publication of the 3CLpro90 structure, which were both used by many subsequent studies. However, experimental replication of datasets obtained from different laboratories and the integration of different data types are crucial to increase robustness and require improvement.
Despite the large variety of computational tools and software, they have so far been of limited practical use to clinical researchers during the COVID-19 pandemic (Fig. 3b). For virus-targeting therapies, docking pipelines remain stable and a large amount of software has been developed; however, their corresponding outputs showed wide variability depending on the algorithm used, lowering comparability (standardization problem). For host-targeting therapies, the in silico pipelines are more methodologically diverse and several strategies were developed to target specific biological contexts. However, the general availability of computational tools and software in the context of the COVID-19 pandemic has been highly limited. Tool accessibility allows researchers to run custom analyses using the developed algorithms (for example, on newly available data). This will help non-computational scientists to use these tools and continue with validation routines, avoiding many preprint manuscripts that are never validated and consequently accelerating research.
Consolidation of predictions
Results from different approaches were not entirely integrated. In structure-based repurposing approaches, candidate drugs obtained from different docking tools or homology modeling methods could be consolidated to provide an ensemble of repurposable drugs (Fig. 3b). For host-targeting therapies, one study used rank aggregation to integrate results from different algorithms69. Another study derived the final predictions by combining the output of their model with results from gene set enrichment and expert knowledge68. While it should be noted that the drugs in clinical trials were used to develop the methods, these two studies predicted the highest proportion of overlaps with drugs being tested in clinical trials. The latter shows the potential of ensemble approaches, which are well known to output more robust results91,92. Consolidation of multiple approaches could significantly increase confidence for repurposing candidates and guide clinical researchers through the drug selection process. This requires a streamlined solution, considering tool accessibility and standardization, as in a standardized database that stores drug candidate predictions enabling meta-analyses.
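What a standardized store of predictions could enable is sketched below: each prediction is kept as a uniform record, and a simple meta-analysis counts how many independent studies support each drug. The record fields, the study identifiers and the threshold are hypothetical and do not describe an existing database.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Prediction:
    """One drug candidate reported by one study (hypothetical schema)."""
    study: str    # study identifier
    method: str   # for example "docking", "network" or "signature"
    drug: str     # drug name, ideally a database identifier
    target: str   # predicted target or disease module

def consensus_candidates(predictions, min_studies=2):
    """Map each drug to the studies predicting it, keeping well-supported drugs."""
    support = defaultdict(set)
    for p in predictions:
        support[p.drug.lower()].add(p.study)
    return {
        drug: sorted(studies)
        for drug, studies in support.items()
        if len(studies) >= min_studies
    }

# Hypothetical records pooled from three studies.
records = [
    Prediction("study_1", "docking", "drug_X", "3CLpro"),
    Prediction("study_2", "network", "Drug_X", "host module"),
    Prediction("study_2", "network", "drug_Y", "host module"),
    Prediction("study_3", "signature", "drug_X", "expression signature"),
    Prediction("study_3", "signature", "drug_Z", "expression signature"),
]

supported = consensus_candidates(records)
print(supported)
```

Candidates supported by several methodologically different studies could then be prioritized for experimental follow-up.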
Combinatorial treatment development
Computationally identifying synergistic drug combinations is an underexplored domain that could provide highly valuable information to augment clinical decision-making, since combinations have been demonstrated to be more effective than monotherapies91,92 (Fig. 3c). So far, targeting of viral and host proteins has been performed independently, and there is a lack of methods that aim to find complementary drug groups while considering side effects. Combining drugs from both the virus- and host-targeting categories is a promising strategy: it blocks the viral and host molecular machinery required for SARS-CoV-2 entry into cells and disrupts the host pathways involved in disease progression, in combination with inhibitors of viral replication. While thousands of compounds can be evaluated in vitro90, combinatorial validations are considerably more challenging. Predicted combinatorial treatments could drastically reduce the search space for subsequent in vitro validation. Existing screening databases such as the NIH OpenData portal93 or the ReFRAME library94 have been sparsely used, and their potential has not been exhausted. Extended with in silico predictions, they could link in silico and in vitro research and help identify promising combinatorial treatments. Furthermore, screening results help verify computational predictions. Especially for docking simulations, model predictions and parameters can be easily released in a standardized format that experimental researchers can evaluate. For host-targeting therapies, the study of Zhou et al.68 is an example of a combinatorial approach. Furthermore, several trials are registered for combination therapy that include candidate drugs from both categories; of these, ten drugs were included in the predictions from the reviewed studies (Supplementary Table 10). However, these trials are either in the recruitment phase or have reported only limited results; thus, data regarding their effectiveness remain inconclusive.
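As a rough illustration of how predictions could shrink the combinatorial search space, the sketch below pairs each virus-targeting candidate with each host-targeting candidate and drops pairs flagged for adverse interactions. The function name, the drug names, and the flagged pair are all hypothetical; a real pipeline would draw the flags from a curated drug interaction or side effect resource.

```python
from itertools import product
from math import comb


def candidate_pairs(virus_targeting, host_targeting, flagged_pairs):
    """Pair each virus-targeting drug with each host-targeting drug,
    skipping pairs flagged for known adverse interactions."""
    flagged = {frozenset(p) for p in flagged_pairs}
    return [
        (v, h)
        for v, h in product(virus_targeting, host_targeting)
        if v != h and frozenset((v, h)) not in flagged
    ]


# Placeholder names; these are assumptions, not predictions from any study.
virus_drugs = ["viral_inhibitor_1", "viral_inhibitor_2"]
host_drugs = ["host_modulator_1", "host_modulator_2", "host_modulator_3"]
flagged = [("viral_inhibitor_1", "host_modulator_2")]

pairs = candidate_pairs(virus_drugs, host_drugs, flagged)

# For comparison: number of unordered pairs if all five drugs were combined freely.
exhaustive = comb(len(virus_drugs) + len(host_drugs), 2)
print(len(pairs), "of", exhaustive, "possible pairs retained")
```

Restricting the pairs to one drug per category and removing flagged pairs leaves a shorter list for in vitro testing; the saving grows quickly as the candidate lists get longer.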
Limited understanding of the complex biological mechanisms underlying COVID-19 has required expert knowledge or manual curation in certain stages of the workflow, either at protein or pathway selection or at filtering of drug predictions (Fig. 3d). Expert vetting is mainly intended to uncover inconsistent or contradictory results while still allowing the identification of new predictions, and it can be crucial for filtering candidate drug lists for possible adverse side effects. To illustrate this, the antimalarial drug (hydroxy)chloroquine raised concerns regarding its potential toxicity. Chlorhexidine was identified by a docking-based study40 as a potential drug targeting SARS-CoV-2 proteins; however, chlorhexidine is a widely used disinfectant whose mechanism of action is not SARS-CoV-2-specific, and it is approved for topical or dental application only95. Consequently, the use of expert knowledge for careful evaluation of potential repurposable drugs would have helped to allocate limited experimental and computational resources to safe and effective drugs that have greater potential for widespread application. Close collaboration between computational and clinical researchers is therefore crucial, because computational approaches are still limited in side effect data and annotations for drug actions on the targets.
Drug repurposing studies usually validate the computational models by constructing their own ‘ground truth’; these can include data from in vitro screening of predicted compounds, in vivo experiments using animal models, ongoing clinical trials, electronic health records, literature mining or expert knowledge96 (Fig. 3e). Thus, there is considerable heterogeneity in the sources of these standards, but efforts are ongoing to address this. For instance, newly released databases, such as the NIH’s OpenData portal93, collect and continuously update SARS-CoV-2 in vitro screening data for thousands of compounds and other SARS-CoV-2-related assays. We encourage future studies to utilize such resources for further validation or filtering of in silico predictions. However, except for one study69, no direct follow-up experimental validation has been performed in the drug repurposing efforts for COVID-19. In the reviewed studies, validation was implemented through several strategies. Some studies performed signature matching of drug profiles or gene set enrichment analysis17 to provide evidence of the potential effectiveness69,72. Others evaluated the performance of their pipelines using the drugs undergoing clinical trials for COVID-1917,69 or experimental results from in vitro drug screening69. However, an extensive list of candidate drugs remains experimentally unvalidated; thus, systematic validation of candidate drugs would be required to provide an overview of the accuracy of the methods. Since this is infeasible in practice, combining the predictions with expert knowledge becomes even more important.
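One of the validation strategies mentioned above, comparing predictions with drugs in clinical trials, can be sketched as a simple overlap count with a hypergeometric enrichment test. This is an illustrative calculation under stated assumptions, not the evaluation procedure of any reviewed study: the function name, the drug identifiers, and the library size are hypothetical, and the trial drugs are assumed to be part of the screened library.

```python
from math import comb


def overlap_enrichment(predicted, in_trials, library_size):
    """Return (overlap, precision, p_value) for predicted vs. trial drugs.

    The p-value is the hypergeometric upper tail: the probability of seeing
    at least this many trial drugs when drawing len(predicted) drugs at
    random from a library of library_size drugs.
    """
    predicted, in_trials = set(predicted), set(in_trials)
    k = len(predicted & in_trials)
    n, K, N = len(predicted), len(in_trials), library_size
    p_value = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)
    ) / comb(N, n)
    precision = k / n if n else 0.0
    return k, precision, p_value


# Placeholder identifiers; not real predictions or trial registrations.
predicted = ["d1", "d2", "d3", "d4"]
in_trials = ["d2", "d4", "d9"]

overlap, precision, p_value = overlap_enrichment(predicted, in_trials, library_size=20)
print(overlap, precision, round(p_value, 4))
```

Such an overlap should be interpreted with care: as noted earlier, when drugs in clinical trials were used to develop a method, a high overlap is partly expected and does not by itself demonstrate predictive accuracy.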
The proposed strategy in this work has the potential to address the gaps of previous studies and is intended to serve as a guideline on computational drug repurposing to accelerate research, promote standardization, and react faster and more precisely in the case of future pandemics.
Pushpakom, S. et al. Drug repurposing: progress, challenges and recommendations. Nat. Rev. Drug Discov. 18, 41–58 (2019).
Paranjpe, M. D., Taubes, A. & Sirota, M. Insights into computational drug repurposing for neurodegenerative disease. Trends Pharmacol. Sci. 40, 565–576 (2019).
Sanseau, P. & Koehler, J. Computational methods for drug repurposing. Brief. Bioinform. 12, 301–302 (2011).
Ciliberto, G. & Cardone, L. Boosting the arsenal against COVID-19 through computational drug repurposing. Drug Discov. Today 25, 946–948 (2020).
Benson, D. A. et al. GenBank. Nucleic Acids Res. 41, D36–D42 (2013).
Shu, Y. & McCauley, J. GISAID: Global initiative on sharing all influenza data – from vision to reality. Euro Surveill. 22, (2017).
The UniProt Consortium. The Universal Protein Resource (UniProt). Nucleic Acids Res. 35, D193–D197 (2007).
Berman, H. M. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).
GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Duan, Q. et al. LINCS Canvas Browser: interactive web app to query, browse and interrogate LINCS L1000 gene expression signatures. Nucleic Acids Res. 42, W449–W460 (2014).
Gordon, D. E. et al. A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature 583, 459–468 (2020).
Navratil, V. et al. VirHostNet: a knowledge base for the management and the analysis of proteome-wide virus-host interaction networks. Nucleic Acids Res. 37, D661–D668 (2009).
Guirimand, T., Delmotte, S. & Navratil, V. VirHostNet 2.0: surfing on the web of virus/host molecular interactions data. Nucleic Acids Res. 43, D583–D587 (2015).
Kotlyar, M., Pastrello, C., Sheahan, N. & Jurisica, I. Integrated interactions database: tissue-specific view of the human and model organism interactomes. Nucleic Acids Res. 44, D536–D541 (2016).
Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
Percha, B. & Altman, R. B. A global network of biomedical relationships derived from text. Bioinformatics 34, 2614–2624 (2018).
Zeng, X. et al. Repurpose open data to discover therapeutics for COVID-19 using deep learning. J. Proteome Res. 19, 4624–4636 (2020).
Wishart, D. S. et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 36, D901–D906 (2008).
Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012).
Sterling, T. & Irwin, J. J. ZINC 15 – ligand discovery for everyone. J. Chem. Inf. Model. 55, 2324–2337 (2015).
Yoshino, R., Yasuo, N. & Sekijima, M. Identification of key interactions between SARS-CoV-2 main protease and inhibitor drug candidates. Sci. Rep. 10, 12493 (2020).
Al-Khafaji, K., Al-Duhaidahawi, D. & Taskin Tok, T. Using integrated computational approaches to identify safe and rapid treatment for SARS-CoV-2. J. Biomol. Struct. Dyn. https://doi.org/10.1080/07391102.2020.1764392 (2020).
Alamri, M. A. et al. Pharmacoinformatics and molecular dynamic simulation studies reveal potential inhibitors of SARS-CoV-2 main protease 3CLpro. J. Biomol. Struct. Dyn. https://doi.org/10.1080/07391102.2020.1782768 (2020).
Arya, R., Das, A., Prashar, V. & Kumar, M. Potential inhibitors against papain-like protease of novel coronavirus (SARS-CoV-2) from FDA approved drugs. Preprint at https://doi.org/10.26434/chemrxiv.11860011.v2 (2020).
Chang, Y.-C. et al. Potential therapeutic agents for COVID-19 based on the analysis of protease and RNA polymerase docking. Preprint at https://doi.org/10.20944/preprints202002.0242.v1 (2020).
Chen, Y. W., Yiu, C.-P. B. & Wong, K.-Y. Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CLpro) structure: virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates. F1000Res. 9, 129 (2020).
Elfiky, A. A. Ribavirin, Remdesivir, Sofosbuvir, Galidesivir, and Tenofovir against SARS-CoV-2 RNA dependent RNA polymerase (RdRp): A molecular docking study. Life Sci. 253, 117592 (2020).
Elfiky, A. & Ibrahim, N. S. Anti-SARS and anti-HCV drugs repurposing against the Papain-like protease of the newly emerged coronavirus (2019-nCoV). Preprint at https://doi.org/10.21203/rs.2.23280/v1 (2020).
Gao, K., Nguyen, D. D., Wang, R. & Wei, G.-W. Machine intelligence design of 2019-nCoV drugs. Preprint at https://doi.org/10.1101/2020.01.30.927889 (2020).
Gupta, M. K. et al. In-silico approaches to detect inhibitors of the human severe acute respiratory syndrome coronavirus envelope protein ion channel. J. Biomol. Struct. Dyn. https://doi.org/10.1080/07391102.2020.1751300 (2020).
Hall, D. C. & Ji, H.-F. A search for medications to treat COVID-19 via in silico molecular docking models of the SARS-CoV-2 spike glycoprotein and 3CL protease. Travel Med. Infect. Dis. 35, 101646 (2020).
Hosseini, F. S. & Amanlou, M. Simeprevir, potential candidate to repurpose for coronavirus infection: virtual screening and molecular docking study. Life Sci. 258, 118205 (2020).
Khan, R. J. et al. Targeting SARS-CoV-2: a systematic drug repurposing approach to identify promising inhibitors against 3C-like proteinase and 2′-O-ribose methyltransferase. J. Biomol. Struct. Dyn. https://doi.org/10.1080/07391102.2020.1753577 (2020).
Khan, S. A., Zia, K., Ashraf, S., Uddin, R. & Ul-Haq, Z. Identification of chymotrypsin-like protease inhibitors of SARS-CoV-2 via integrated computational approach. J. Biomol. Struct. Dyn. https://doi.org/10.1080/07391102.2020.1751298 (2020).
Li, Y. et al. Therapeutic drugs targeting 2019-nCoV main protease by high-throughput screening. Preprint at https://doi.org/10.1101/2020.01.28.922922 (2020).
Lin, S., Shen, R., He, J., Li, X. & Guo, X. Molecular modeling evaluation of the binding effect of ritonavir, lopinavir and darunavir to severe acute respiratory syndrome coronavirus 2 proteases. Preprint at https://doi.org/10.1101/2020.01.31.929695 (2020).
Muralidharan, N., Sakthivel, R., Velmurugan, D. & Gromiha, M. M. Computational studies of drug repurposing and synergism of lopinavir, oseltamivir and ritonavir binding with SARS-CoV-2 protease against COVID-19. J. Biomol. Struct. Dyn. https://doi.org/10.1080/07391102.2020.1752802 (2020).
Smith, M. & Smith, J. C. Repurposing therapeutics for COVID-19: supercomputer-based docking to the SARS-CoV-2 viral spike protein and viral spike protein-human ACE2 interface. Preprint at https://doi.org/10.26434/chemrxiv.11871402.v4 (2020).
Wang, J. Fast identification of possible drug treatment of coronavirus disease-19 (COVID-19) through computational drug repurposing study. J. Chem. Inf. Model. 60, 3277–3286 (2020).
Wu, C. et al. Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods. Acta Pharm. Sin. B 10, 766–788 (2020).
Xu, Z. et al. Nelfinavir was predicted to be a potential inhibitor of 2019-nCov main protease by an integrative approach combining homology modelling, molecular docking and binding free energy calculation. Preprint at https://doi.org/10.1101/2020.01.27.921627 (2020).
Ton, A.-T., Gentile, F., Hsing, M., Ban, F. & Cherkasov, A. Rapid identification of potential inhibitors of SARS-CoV-2 main protease by deep docking of 1.3 billion compounds. Mol. Inform. 39, e2000028 (2020).
Talluri, S. Virtual high throughput screening based prediction of potential drugs for COVID-19. Preprint at https://doi.org/10.20944/preprints202002.0418.v1 (2020).
Beck, B. R., Shin, B., Choi, Y., Park, S. & Kang, K. Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model. Comput. Struct. Biotechnol. J. 18, 784–790 (2020).
Forli, S. et al. Computational protein-ligand docking and virtual drug screening with the AutoDock suite. Nat. Protoc. 11, 905–919 (2016).
Halgren, T. A. et al. Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. J. Med. Chem. 47, 1750–1759 (2004).
Gentile, F. et al. Deep Docking: a deep learning platform for augmentation of structure based drug discovery. ACS Cent. Sci. 6, 939–949 (2020).
Torres, P. H. M., Sodero, A. C. R., Jofily, P. & Silva, F. P. Jr Key topics in molecular docking for drug design. Int. J. Mol. Sci. 20, 4574 (2019).
Nguyen, D. D., Gao, K., Wang, M. & Wei, G.-W. MathDL: mathematical deep learning for D3R Grand Challenge 4. J. Comput. Aided Mol. Des. 34, 131–147 (2020).
Nguyen, D. D., Gao, K., Chen, J., Wang, R. & Wei, G.-W. Potentially highly potent drugs for 2019-nCoV. Preprint at https://doi.org/10.1101/2020.02.05.936013 (2020).
Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988).
Lee, S. M.-Y. & Yen, H.-L. Targeting the host or the virus: current and novel concepts for antiviral approaches against influenza virus infection. Antiviral Res. 96, 391–404 (2012).
Min, J.-Y. & Subbarao, K. Cellular targets for influenza drugs. Nat. Biotechnol. 28, 239–240 (2010).
Catanzaro, M. et al. Immune response in COVID-19: addressing a pharmacological challenge by targeting pathways triggered by SARS-CoV-2. Signal Transduct. Target. Ther. 5, 84 (2020).
Liao, J., Way, G. & Madahar, V. Target virus or target ourselves for COVID-19 drugs discovery?―Lessons learned from anti-influenza virus therapies. Med. Drug Discov. 5, 100037 (2020).
Chen, L. et al. Clinical characteristics of pregnant women with Covid-19 in Wuhan, China. N. Engl. J. Med. 382, e100 (2020).
Lamb, J. et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313, 1929–1935 (2006).
Subramanian, A. et al. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell 171, 1437–1452.e17 (2017).
Loganathan, T., Ramachandran, S., Shankaran, P., Nagarajan, D. & Mohan, S. S. Host transcriptome-guided drug repurposing for COVID-19 treatment: a meta-analysis based approach. PeerJ 8, e9357 (2020).
Jia, Z., Song, X., Shi, J., Wang, W. & He, K. Transcriptome-based drug repositioning for coronavirus disease 2019 (COVID-19). Pathog. Dis. 78, ftaa036 (2020).
Law, J. N. et al. Identifying human interactors of SARS-CoV-2 proteins and drug targets for COVID-19 using network-based label propagation. Preprint at https://arxiv.org/abs/2006.01968 (2020).
Hoffmann, M. et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell 181, 271–280.e8 (2020).
Messina, F. et al. COVID-19: viral–host interactome analyzed by network based-approach model to study pathogenesis of SARS-CoV-2 infection. J. Transl. Med. 18, 233 (2020).
Sadegh, S. et al. Exploring the SARS-CoV-2 virus-host-drug interactome for drug repurposing. Nat. Commun. 11, 3518 (2020).
Gyöngyi, Z., Garcia-Molina, H. & Pedersen, J. Combating web spam with TrustRank. In Proc. 2004 VLDB Conference (eds. Nascimento, M. A. et al.) 576–587 (Morgan Kaufmann, 2004).
Brin, S. & Page, L. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst. 30, 107–117 (1998).
Cheng, F. et al. Network-based approach to prediction and population-based validation of in silico drug repurposing. Nat. Commun. 9, 2691 (2018).
Zhou, Y. et al. Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2. Cell Discov. 6, 14 (2020).
Gysi, D. M. et al. Network medicine framework for identifying drug repurposing opportunities for COVID-19. Preprint at https://arxiv.org/abs/2004.07229 (2020).
List, M. et al. KeyPathwayMinerWeb: online multi-omics network enrichment. Nucleic Acids Res. 44, W98–W104 (2016).
Belyaeva, A. et al. Causal network models of SARS-CoV-2 expression and aging to identify candidates for drug repurposing. Preprint at https://arxiv.org/abs/2006.03735 (2020).
Ge, Y. et al. A data-driven drug repositioning framework discovered a potential therapeutic agent targeting COVID-19. Preprint at https://doi.org/10.1101/2020.03.11.986836 (2020).
Favipiravir plus hydroxychloroquine and lopinavir/ritonavir plus hydroxychloroquine in COVID-19. ClinicalTrials.gov https://clinicaltrials.gov/ct2/show/NCT04376814 (2020).
Baricitinib therapy in COVID-19. ClinicalTrials.gov https://clinicaltrials.gov/ct2/show/NCT04358614 (2020).
Lopinavir/ritonavir, ribavirin and IFN-beta combination for nCoV treatment. ClinicalTrials.gov https://clinicaltrials.gov/ct2/show/NCT04276688 (2020).
An investigation into beneficial effects of interferon beta 1a, compared to interferon beta 1b and the base therapeutic regiment in moderate to severe COVID-19: a randomized clinical trial. ClinicalTrials.gov https://clinicaltrials.gov/ct2/show/NCT04343768 (2020).
Cao, B. et al. A trial of lopinavir–ritonavir in adults hospitalized with severe Covid-19. N. Engl. J. Med. 382, 1787–1799 (2020).
Lopinavir-Ritonavir results. RECOVERY trial (2020); https://www.recoverytrial.net/results/lopinavar-results
‘Solidarity’ clinical trial for COVID-19 treatments. WHO (accessed November 2020); https://www.who.int/emergencies/diseases/novel-coronavirus-2019/global-research-on-novel-coronavirus-2019-ncov/solidarity-clinical-trial-for-covid-19-treatments
Trial of treatments for COVID-19 in hospitalized adults. ClinicalTrials.gov (2020); https://clinicaltrials.gov/ct2/show/NCT04315948
Beigel, J. H. et al. Remdesivir for the treatment of Covid-19. N. Engl. J. Med. 383, 1813–1826 (2020).
Grein, J. et al. Compassionate use of remdesivir for patients with severe Covid-19. N. Engl. J. Med. 382, 2327–2336 (2020).
Goldman, J. D. et al. Remdesivir for 5 or 10 days in patients with severe Covid-19. N. Engl. J. Med. 383, 1827–1837 (2020).
Wang, M. et al. Remdesivir and chloroquine effectively inhibit the recently emerged novel coronavirus (2019-nCoV) in vitro. Cell Res. 30, 269–271 (2020).
Huang, N., Shoichet, B. K. & Irwin, J. J. Benchmarking sets for molecular docking. J. Med. Chem. 49, 6789–6801 (2006).
Wang, R., Fang, X., Lu, Y. & Wang, S. The PDBbind database: collection of binding affinities for protein-ligand complexes with known three-dimensional structures. J. Med. Chem. 47, 2977–2980 (2004).
The RECOVERY Collaborative Group. Dexamethasone in hospitalized patients with Covid-19. N. Engl. J. Med. https://doi.org/10.1056/nejmoa2021436 (2020).
WHO Rapid Evidence Appraisal for COVID-19 Therapies (REACT) Working Group. et al. Association between administration of systemic corticosteroids and mortality among critically ill patients with COVID-19: a meta-analysis. JAMA 324, 1330–1341 (2020).
Zhang, Y., Xu, Q., Sun, Z. & Zhou, L. Current targeted therapeutics against COVID-19: Based on first-line experience in China. Pharmacol. Res. 157, 104854 (2020).
Jin, Z. et al. Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors. Nature 582, 289–293 (2020).
Sun, W., Sanderson, P. E. & Zheng, W. Drug combination therapy increases successful drug repositioning. Drug Discov. Today 21, 1189–1195 (2016).
Liu, H. et al. Predicting effective drug combinations using gradient tree boosting based on features extracted from drug-protein heterogeneous network. BMC Bioinform. 20, 645 (2019).
Brimacombe, K. R. et al. An OpenData portal to share COVID-19 drug repurposing data in real time. Preprint at https://doi.org/10.1101/2020.06.04.135046 (2020).
Janes, J. et al. The ReFRAME library as a comprehensive drug repurposing library and its application to the treatment of cryptosporidiosis. Proc. Natl Acad. Sci. USA 115, 10750–10755 (2018).
Syed Shihaab, S. & Pradeep. Chlorhexidine: its properties and effects. Res. J. Pharm. Technol. 9, 1755–1760 (2016).
Jarada, T. N., Rokne, J. G. & Alhajj, R. A review of computational drug repositioning: strategies, approaches, opportunities, challenges, and directions. J. Cheminform. 12, 46 (2020).
J.B. was partially funded by his VILLUM young investigator grant no. 13154. M.S.A. received PhD fellowship funding from CONACYT (CVU659273) and the German Academic Exchange Service, DAAD (ref. 91693321). This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement nos 777111 and 826078. This publication reflects the authors’ views only and the European Commission is not responsible for any use that may be made of the information it contains. This project is funded by the Bavarian State Ministry of Science and the Arts in the framework of the Bavarian Research Institute for Digital Transformation (bidt).
The authors declare no competing interests.
Peer review information Fernando Chirigati was the primary editor on this Review and managed its editorial process and peer review in collaboration with the rest of the editorial team. Nature Computational Science thanks Arnab Chatterjee, Brian Shoichet, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Galindez, G., Matschinske, J., Rose, T.D. et al. Lessons from the COVID-19 pandemic for advancing computational drug repurposing strategies. Nat Comput Sci 1, 33–41 (2021). https://doi.org/10.1038/s43588-020-00007-6