ASGARD is A Single-cell Guided Pipeline to Aid Repurposing of Drugs

He, Bing; Xiao, Yao; Liang, Haodong; Huang, Qianhui; Du, Yuheng; Li, Yijun; Garmire, David; Sun, Duxin; Garmire, Lana X.

doi:10.1038/s41467-023-36637-3

Article
Open access
Published: 22 February 2023

Nature Communications volume 14, Article number: 993 (2023)

Abstract

Single-cell RNA sequencing technology has enabled in-depth analysis of intercellular heterogeneity in various diseases. However, its full potential for precision medicine has yet to be reached. Towards this, we propose A Single-cell Guided Pipeline to Aid Repurposing of Drugs (ASGARD) that defines a drug score to recommend drugs by considering all cell clusters to address the intercellular heterogeneity within each patient. ASGARD shows significantly better average accuracy on single-drug therapy compared to two bulk-cell-based drug repurposing methods. We also demonstrated that it performs considerably better than other cell cluster-level predicting methods. In addition, we validate ASGARD using the drug response prediction method TRANSACT with Triple-Negative-Breast-Cancer patient samples. We find that many top-ranked drugs are either approved by the Food and Drug Administration or in clinical trials treating corresponding diseases. In conclusion, ASGARD is a promising drug repurposing recommendation tool guided by single-cell RNA-seq for personalized medicine. ASGARD is free for educational use at https://github.com/lanagarmire/ASGARD.

Introduction

Heterogeneity, or more specifically, the diverse cell populations within the diseased tissue, is the leading cause of treatment failure for many complex diseases, such as cancers¹, Alzheimer’s disease², stroke³, and coronavirus disease 2019 (COVID-19)⁴, etc., as well as a major obstacle to successful precision medicine^5,6,7. Recent significant advances in single-cell technologies, especially the single-cell RNA sequencing (scRNA-seq) technology, have enabled the analysis of intercellular heterogeneity at a very fine resolution^8,9 and helped us to have many breakthroughs in understanding disease mechanisms¹⁰, such as breast cancer¹¹, liver cancer¹² and COVID-19¹³. However, its full potential for precision medicine has not been fulfilled^14,15.

Drug repurposing (also known as drug repositioning, reprofiling, or re-tasking) is a strategy to identify new uses for a drug outside the scope of its original medical approval or investigation¹⁶. So far, few drug repurposing methods have been developed to utilize the highly valuable information contained in scRNA-seq data. The pipeline by Alakwaa identifies differentially expressed genes (DEGs) for a specific group of cells, then predicts candidate drugs for the DEGs using the Connectivity Map Linked User Environment (CLUE) platform, followed by prioritizing these drugs using a comprehensive ranking score system¹⁷. This pipeline identified didanosine as a potential treatment for COVID-19 using scRNA-seq data¹⁷. Another pipeline by Guo et al. uses a simple combination of Seurat¹⁸, a tool for scRNA-seq analysis, and CLUE to identify 281 FDA-approved drugs that can potentially be effective for treating COVID-19¹⁹. In general, the above pipelines predict drugs for each cell cluster within the patient. However, in heterogeneous diseases caused by multiple types of cells, effective drugs should be able to address multiple cell clusters²⁰. Neither of the pipelines mentioned above can predict drugs for multiple cell clusters, limiting their utility in the era of precision medicine.

Here we propose A Single-cell Guided Pipeline to Aid Repurposing of Drugs (ASGARD) to overcome the issue above. ASGARD defines a drug score to predict drugs for multiple diseased cell clusters within each patient. The benchmarking results show that ASGARD is more accurate and robust on single drugs than other pipelines handling bulk and single-cell RNA-Seq data. We tested ASGARD on multiple cancer scRNA-Seq datasets, including Patient-Derived Xenograft (PDX) models for advanced metastatic breast cancers, Pre-T acute lymphoblastic leukemia patients, and primary tumors of Triple-Negative-Breast-Cancer (TNBC) patients. Additionally, with the ongoing worldwide COVID-19 pandemic, we applied ASGARD to scRNA-seq data from severe COVID-19 patients and predicted potential therapies to reduce deaths among severe COVID-19 patients.

Results

Summary of a Single-cell Guided Pipeline to Aid Repurposing of Drugs

Using scRNA-seq data, ASGARD repurposes drugs for disease by fully accounting for the cellular heterogeneity of patients (Fig. 1, Formula 1 in “Methods” section). In ASGARD, every cell cluster in the diseased sample is paired to that in the normal (or control) sample, according to “anchor” genes that are consistently expressed between diseased and normal cells. It then imports differentially expressed (DE) genes (adjusted P-value < 0.05) between the paired diseased and normal clusters in the scRNA-seq data, as determined by a DE detection method of the user’s choice. These individual clusters can be optionally annotated to specific cell types. To identify drugs for each single cluster (cell type), ASGARD then uses these consistently differentially expressed genes as inputs to identify drugs that can significantly (single-cluster FDR < 0.05) reverse their expression levels in the L1000 drug response dataset²¹. To identify drugs for multiple clusters, ASGARD defines a drug score (Formula 1 in “Methods” section) to evaluate the drug efficacy across multiple cell clusters selected by the user. The drug score estimates drug efficacy by taking into account the cell type proportion, the significance of reversing the differential gene expression pattern (single-cluster FDR) by the drug treatment in each selected cell cluster, and the ratio of significantly deregulated genes (adjusted P-value < 0.05) that the drug treatment can reverse in each selected cell cluster. Finally, ASGARD uses the drug score to rank and choose drugs for the disease.
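The three ingredients of the drug score listed above can be illustrated with a short Python sketch. Formula 1 itself is given in the “Methods” section and is not reproduced here, so the combination used below (cell proportion times −log10 of the single-cluster FDR times the reversed-gene ratio, summed over the selected clusters) is only an assumed form for illustration. The names `ClusterEvidence` and `drug_score` are hypothetical and are not part of the ASGARD package.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterEvidence:
    """Per-cluster evidence for one drug (hypothetical container)."""
    n_cells: int           # cells in this diseased cluster
    fdr: float             # single-cluster FDR of the drug in this cluster
    n_de_genes: int        # significantly deregulated genes in the cluster
    n_reversed_genes: int  # of those, how many the drug reverses


def drug_score(clusters, fdr_floor=1e-300):
    """Assumed form: sum of proportion * -log10(FDR) * reversed-gene ratio.

    Proportions are computed over the selected clusters only, which is an
    assumption of this sketch.
    """
    total_cells = sum(c.n_cells for c in clusters)
    if total_cells == 0:
        return 0.0
    score = 0.0
    for c in clusters:
        if c.n_de_genes == 0:
            continue  # no deregulated genes, so nothing to reverse
        proportion = c.n_cells / total_cells
        significance = -math.log10(max(c.fdr, fdr_floor))  # floor avoids log(0)
        reversal_ratio = c.n_reversed_genes / c.n_de_genes
        score += proportion * significance * reversal_ratio
    return score


# Toy numbers, not taken from the paper.
example = [
    ClusterEvidence(n_cells=800, fdr=0.001, n_de_genes=200, n_reversed_genes=100),
    ClusterEvidence(n_cells=200, fdr=0.01, n_de_genes=50, n_reversed_genes=10),
]
print(drug_score(example))
```

Under this assumed form, a drug scores highly only when it is significant in large clusters and reverses a large share of their deregulated genes; a drug that is highly significant in a very small cluster contributes little.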

**Fig. 1: The workflow of the ASGARD drug repurposing pipeline.**

We evaluated the power of the drug score by comparing ASGARD with traditional bulk-cell-based repurposing methods and single-cell-based repurposing methods using multiple independent scRNA-seq datasets, including PDX models from advanced metastatic triple-negative breast cancer (TNBC)¹¹, an acute lymphoblastic leukemia dataset²², and coronavirus disease 2019 (COVID-19) datasets^13,23 (see “Methods” section).

Comparing ASGARD to bulk-cell based repurposing methods

Before comparing ASGARD to bulk-cell-based repurposing methods, we first evaluated several external differential expression (DE) methods on three datasets from three diseases: advanced metastatic breast cancer^11,24, acute lymphoblastic leukemia²², and coronavirus disease 2019^13,23 (see “Methods” section). We selected Limma²⁵, Seurat¹⁸, DESeq2²⁶, and edgeR²⁷ as DE methods, given that they were top-ranked methods in a benchmark study of confronting false discoveries in single-cell differential expression²⁸. We conducted a systematic comparison of these methods under different modes (Fig. 2). For Limma, we compared three modes: empirical Bayes without trend (Bayes), empirical Bayes with a prior trend (trend), and precision weights (voom)²⁵. For Seurat, we compared three different DE tests: Wilcoxon rank-sum test (Wilcox), t-test, and logistic regression (LR). For DESeq2, we compared the Wald test (Wald) and the likelihood ratio test (LRT). For edgeR, we compared the likelihood ratio test (LRT) and the quasi-likelihood F-test (QLF). We identified DE genes using the above methods for each cell cluster as the inputs of ASGARD for drug repurposing. The subsequent drug prediction accuracies of ASGARD are assessed by the receiver operating characteristic (ROC) curves and the areas under the ROC curves (AUCs), using FDA-approved drugs and candidate drugs validated in advanced clinical trials as positive data (see “Methods” section).
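As an illustration of the AUC evaluation described above, the following Python sketch computes the area under the ROC curve from a ranked drug list and a set of known positives. It uses the standard rank-based interpretation of AUC (the probability that a randomly chosen positive drug scores higher than a randomly chosen negative one, with ties counted as one half). The function name `roc_auc` and the toy drug names are hypothetical; the exact evaluation code used in the study is described in the “Methods” section and may differ.

```python
def roc_auc(scores, positives):
    """AUC as P(score of a positive > score of a negative); ties count 0.5.

    `scores` maps drug name -> predicted score (higher means more promising).
    `positives` is the set of drugs treated as true positives.
    """
    pos = [s for name, s in scores.items() if name in positives]
    neg = [s for name, s in scores.items() if name not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative drug")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example, not data from the paper.
toy_scores = {"drug_a": 0.9, "drug_b": 0.7, "drug_c": 0.4, "drug_d": 0.2}
toy_positives = {"drug_a", "drug_c"}
print(roc_auc(toy_scores, toy_positives))
```

An AUC of 0.5 corresponds to a random ranking, and 1.0 means every positive drug is ranked above every negative one.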

The systematic comparison is shown in Supplementary Fig. 1, and the results from each method under the best-performing mode are shown in Fig. 2a. The Limma (Bayes) method yields the best AUCs in all three datasets (0.90–0.92), significantly (P-value < 0.05, Student’s t test) better than the other DE methods. Seurat (Wilcox test) and edgeR perform similarly overall, with edgeR having a slightly higher AUC (0.83–0.86) than Seurat (0.80–0.86). DESeq2, on the other hand, tends to generate some of the lowest AUCs in the comparison. Therefore, we used DE results from the Limma (Bayes) method for the following analysis, while keeping the other DE methods as options for the inputs to ASGARD.

To compare ASGARD with those drug repurposing methods using bulk RNA-Seq samples, we summarized scRNA-seq data into pseudo-bulk RNA-Seq data. We then applied bulk methods CLUE²⁹ and DrInsight³⁰ to the pseudo-bulk RNA-Seq query data and compared their results with ASGARD’s on predicting both drugs and compounds (Fig. 2b). We used the same scRNA-seq data from the same three datasets above. Since CLUE and DrInsight predict both drugs and compounds, we added compounds validated in animal models to the true positive dataset for the AUC evaluation of drug/compound predictions. As a result, the AUCs obtained from ASGARD on drugs and compounds (Fig. 2b) are slightly different from those on drugs only (Fig. 2a). On the breast cancer dataset, ASGARD yields an overall AUC of 0.92, much better than CLUE and DrInsight, with values of 0.74 and 0.81, respectively. On precursor T cell acute lymphoblastic leukemia data, ASGARD yields an AUC of 0.95 in drug/compound repurposing for leukemia patients, while CLUE and DrInsight achieve worse average AUCs of 0.82 and 0.73, respectively. For the COVID-19 datasets, ASGARD shows an AUC of 0.88 in drug/compound repurposing, while CLUE and DrInsight have lower AUCs of 0.85 and 0.73, respectively, for the same patients (Fig. 2b). In summary, by paying attention to heterogeneity at single-cell levels, ASGARD shows much better drug repurposing predictability than methods that rely on bulk samples.
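The pseudo-bulk step mentioned above can be sketched as follows. The text does not state how the single-cell profiles were summarized, so this Python example assumes one common choice, summing raw counts per gene over all cells of a sample. The function name `pseudo_bulk` and the toy gene and sample names are hypothetical.

```python
from collections import defaultdict


def pseudo_bulk(cell_counts, cell_to_sample):
    """Sum per-cell gene counts into one expression profile per sample.

    `cell_counts` maps cell id -> {gene: count}.
    `cell_to_sample` maps cell id -> sample id.
    Summation (rather than averaging) is an assumption of this sketch.
    """
    bulk = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in cell_counts.items():
        sample = cell_to_sample[cell]
        for gene, count in gene_counts.items():
            bulk[sample][gene] += count
    # Convert nested defaultdicts to plain dicts for a clean return value.
    return {sample: dict(genes) for sample, genes in bulk.items()}


# Toy example, not data from the paper.
toy_cells = {
    "cell1": {"GENE_A": 2, "GENE_B": 0},
    "cell2": {"GENE_A": 1, "GENE_B": 5},
    "cell3": {"GENE_A": 4},
}
toy_samples = {"cell1": "patient1", "cell2": "patient1", "cell3": "control1"}
profiles = pseudo_bulk(toy_cells, toy_samples)
print(profiles)
```

Collapsing cells this way discards cluster membership, which is exactly the information the bulk methods cannot use.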

Comparing ASGARD to other single-cell-based repurposing methods

We also compared single drug prediction using ASGARD with two other pipelines developed by Alakwaa et al.^17,19 and Guo et al.^17,19, which were reported to handle scRNA-Seq data. Note that ASGARD offers more functionalities than those two methods. Alakwaa’s and Guo’s pipelines can only repurpose drugs/compounds for individual clusters, not at the multi-cluster level. ASGARD, on the other hand, can compute both the single-cluster-level drug significance and the multi-cluster drug score (Formula 1 in “Methods” section). The above section shows that the ASGARD multi-cluster drug score achieves AUCs of 0.92, 0.95, and 0.88 for breast cancer, leukemia, and COVID-19, respectively (Fig. 2b). For a fair comparison, we further tested the single-cluster-level drug prediction accuracies of these three methods (Fig. 3 and Supplementary Fig. 2). Even at the single-cluster level, ASGARD still shows the best AUCs on every individual cluster from the breast cancer, leukemia, and COVID-19 datasets (Fig. 3). On the 8 clusters of the breast cancer dataset, ASGARD yields an average AUC of 0.83 (0.80–0.86), significantly better (P-value = 0.0028, Student’s t test) than Alakwaa’s and Guo’s pipelines, with average AUC values of 0.76 (0.68–0.83) and 0.54 (0.47–0.56), respectively (Fig. 3a and Supplementary Fig. 2). On the 4 clusters of precursor T cell acute lymphoblastic leukemia data, ASGARD has an average AUC of 0.81 (0.76–0.85), again significantly better (P-value < 0.001, Student’s t test) than Alakwaa’s and Guo’s pipelines, with average AUC values of 0.51 (0.49–0.56) and 0.52 (0.49–0.55), respectively (Fig. 3b and Supplementary Fig. 2). Similar trends exist in the neutrophils, NK cells, T cells, and monocytes, which have increased cell proportions in the deceased severe vs. cured severe COVID-19 patients. While ASGARD achieves an average AUC of 0.82 (0.77–0.88), Alakwaa’s and Guo’s methods have lower average AUCs of 0.72 (0.63–0.80) and 0.58 (0.55–0.62), respectively (Fig. 3c and Supplementary Fig. 2). These results support the conclusion that ASGARD predicts drugs more accurately than Alakwaa’s and Guo’s pipelines.

Additionally, given that sample size, cell population similarity, and proportion of disease cells significantly affect differential gene expression analysis³¹, we further performed robustness assessments of the three pipelines across different sizes of single-cell populations, different differential levels of single-cell populations, and different proportions of diseased cells, using simulation data based on the “GSE123926” and “GSE113197” datasets (see “Methods” section). The AUCs of the three single-cell drug repurposing pipelines on the simulation data show that ASGARD and the other two pipelines all perform robustly across different sizes of single-cell populations (Supplementary Fig. 3A), different degrees of DE between disease and normal conditions (Supplementary Fig. 3B), and different proportions of disease cells among the scRNA-Seq data (Supplementary Fig. 3C).

We demonstrate that ASGARD is a promising drug recommendation pipeline through computational and clinical validation. In the following sections, we further illustrate the results of ASGARD applied to breast cancer, leukemia, and COVID-19, respectively.

Drug repurposing for breast cancers

We collected scRNA-seq data from 24,741 epithelial cells of advanced metastatic breast cancer Patient-Derived Xenograft (PDX) models¹¹ and 16,998 epithelial cells from normal breast tissues²⁴. After preprocessing, all cancer cells and 16,954 normal cells were paired and clustered into 8 populations (Supplementary Fig. 4A). Cluster 1 (C1) is the largest one covering 33.68% of cells, while cluster 8 (C8) is the smallest one accounting for only 1.8% of cells (Supplementary Fig. 4A). The differentially expressed genes (adjusted P-value < 0.05, cancer vs normal) in the clusters are significantly enriched in 10 well-known breast cancer-related pathways, including apoptosis, cell cycle, estrogen signaling, IL-17 signaling, neurotrophin signaling, NF-kappa B signaling, NOD-like receptor signaling, p53 signaling, PI3K-Akt signaling and TNF signaling pathways (Supplementary Fig. 4B). Cluster 7 (C7) has the largest number of significant pathways (7), while C1 and C6 each have only 1 significant pathway.
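The passage above does not say which statistical test was used for pathway enrichment. As a hedged illustration only, a common choice for this kind of over-representation analysis is the upper-tail hypergeometric test, sketched below in Python. The function name `enrichment_p_value` and all numbers are hypothetical, and the adjustment for multiple testing that yields an adjusted P-value is deliberately left out.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_de, n_overlap):
    """Upper-tail hypergeometric P(overlap >= n_overlap).

    n_background: genes considered in total
    n_pathway:    background genes annotated to the pathway
    n_de:         differentially expressed genes in the cluster
    n_overlap:    DE genes that fall in the pathway
    """
    upper = min(n_pathway, n_de)
    # Count the ways to draw n_de genes with at least n_overlap pathway genes.
    favourable = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_background, n_de)


# Toy numbers, not taken from the paper.
print(enrichment_p_value(n_background=20000, n_pathway=100, n_de=500, n_overlap=10))
```

A small P-value means the cluster's DE genes overlap the pathway more than expected by chance; in practice the P-values from all tested pathways would then be corrected for multiple testing.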

We first applied ASGARD for multi-cluster drug repurposing prediction and predicted 11 drugs (FDR < 0.05 and overall drug score > 0.99 quantile) for advanced metastatic breast cancer (Supplementary Fig. 4C and Supplementary Data 1). Fostamatinib is the top-ranked drug candidate (Supplementary Fig. 4C). It is a tyrosine kinase inhibitor medication approved for the treatment of chronic immune thrombocytopenia³². Colchicine, the second-best candidate, is an alkaloid approved for treating the inflammatory symptoms of familial Mediterranean fever³³. Both fostamatinib and colchicine have shown antitumor and anti-metastasis effects in animal models of breast cancer^34,35. Moreover, the 4th candidate, fulvestrant, and the 7th candidate, neratinib, have been approved by the Food and Drug Administration (FDA) for breast cancer treatment^36,37.
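The selection rule quoted above (FDR < 0.05 and drug score above the 0.99 quantile) can be illustrated with a small Python sketch. How the quantile is computed in the actual pipeline is not stated in this section, so the use of `statistics.quantiles` with the inclusive method, taken over all scored drugs, is an assumption. The function name `select_candidates` and the toy drug table are hypothetical.

```python
import statistics


def select_candidates(drugs, fdr_cutoff=0.05, percentile=99):
    """Keep drugs with FDR below the cutoff and score above the percentile.

    `drugs` maps a drug name to a (score, fdr) pair. The percentile is taken
    over all scores in `drugs`; that choice is an assumption of this sketch.
    """
    scores = [score for score, _ in drugs.values()]
    # With n=100, statistics.quantiles returns the 99 percentile cut points.
    cut_points = statistics.quantiles(scores, n=100, method="inclusive")
    threshold = cut_points[percentile - 1]
    selected = [
        name
        for name, (score, fdr) in drugs.items()
        if fdr < fdr_cutoff and score > threshold
    ]
    # Highest-scoring candidates first.
    return sorted(selected, key=lambda name: drugs[name][0], reverse=True)


# Toy table of 200 drugs; drug_200 has a high score but a non-significant FDR.
toy_drugs = {f"drug_{i}": (float(i), 0.01) for i in range(1, 201)}
toy_drugs["drug_200"] = (200.0, 0.20)
print(select_candidates(toy_drugs))
```

Requiring both conditions means a drug must be statistically significant and also rank among the very best by drug score.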

To explore the potential molecular mechanisms of the top 2 candidates, we next investigated the target genes and pathways of fostamatinib and colchicine across the eight cell clusters (Supplementary Fig. 4D). Fostamatinib and colchicine both target all the significant pathways in each cluster, and they are complementary in the genes they target within these pathways. Among the 143 target genes from these significant pathways, only 29 are shared by fostamatinib and colchicine (Supplementary Fig. 4D). Fostamatinib and colchicine also show biologically synergistic targeting of multiple genes on the same significant pathways. For example, fostamatinib inhibits Cyclin D1 (CCND1) to produce G1 arrest in the p53 signaling pathway, while colchicine inhibits Cyclin-dependent kinase 1 (CDK1) to produce G2 arrest in the p53 signaling pathway and cell cycle pathway³⁸ (Supplementary Fig. 4D). Additionally, the drug scores of top drug candidates vary from one PDX model to another (Supplementary Fig. 4D), demonstrating that ASGARD is a forward-looking in silico precision medicine strategy.

To evaluate the reliability of ASGARD on breast cancer patient data, we downloaded four Triple Negative Breast Cancer (TNBC) samples along with four controls from the “GSE161529” dataset³⁹, in order to compare with the drug prediction results from the PDX models of TNBC described earlier. After preprocessing with Seurat, the TNBC samples contain an average of 5580 cells. We aligned all 8 samples, paired the cases vs. controls, and clustered them into 6 groups: B cell, endothelial cell, epithelial cell, macrophage, T cell, and tissue stem cell (Fig. 4a). Epithelial cells are the largest group, covering 45.98% of cells on average, while endothelial cells are the smallest group as expected, accounting for only 1.152% of total cells (Fig. 4a). ASGARD predicted 13 drugs with significant FDR values in at least one of the four TNBC patients (Fig. 4b). For comparison, we also performed drug predictions on the 2 PDX models of TNBC patients, using the same procedures (Supplementary Fig. 4C). Of great interest, four of the most significant drugs from TNBC patients overlap with those predicted from the two PDX samples. These drugs are mebendazole, crizotinib, neratinib, and vinblastine (Fig. 4b). Both neratinib and vinblastine have been approved by the FDA for the treatment of breast cancer^40,41. Mebendazole is a well-known anti-helminthic drug with wide clinical use. It has been reported to have anti-cancer properties in preclinical studies and has been in many clinical trials for treating various cancers, including liver cancer, lung cancer, and glioma⁴². Crizotinib is a receptor tyrosine kinase inhibitor showing tumor-reducing effects in vitro and in vivo^43,44. It is now in a phase 2 clinical trial for treating patients with TNBC (ClinicalTrials.gov Identifier: NCT03620643).

**Fig. 4: Drug repurposing in triple-negative breast cancer (TNBC) patient samples.**

To show quantitatively that ASGARD prediction on the TNBC samples is valuable, we next conducted two additional sets of analyses. First, we compared its results with those of TRANSACT⁴⁵, another computational method to calculate drug sensitivity. Since TRANSACT can only predict drugs existing in the GDSC dataset⁴⁶, we can only compare the subset of drugs predicted by ASGARD that are in the GDSC dataset. As shown in Fig. 4c, ASGARD and TRANSACT results are well correlated: as ASGARD’s drug score increases, the drug sensitivity in TRANSACT also increases. Second, we investigated the effect of the tumor microenvironment on the drug scores by performing an in silico drop-one-out experiment, which excluded one cell type at a time. Among all cell types in the tumor microenvironment, excluding T cells leads to the most drastic drug score changes, as well as the most variable drug score changes among different drugs (Fig. 4d). Moreover, the drug score changes also differ among the four TNBC patients, showing the sensitivity of ASGARD in personalized drug prediction.
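The drop-one-out experiment described above can be sketched in Python as follows. The drug score here uses the same assumed form as elsewhere in this illustration (cell proportion times −log10 of the single-cluster FDR times the reversed-gene ratio, summed over cell types), which is not necessarily identical to Formula 1 in the “Methods” section. Recomputing cell proportions over the remaining cell types after each exclusion is a further assumption. The names `drug_score` and `drop_one_out` and all numbers are hypothetical.

```python
import math


def drug_score(clusters):
    """Assumed drug score: sum of proportion * -log10(FDR) * reversal ratio."""
    total = sum(c["n_cells"] for c in clusters.values())
    if total == 0:
        return 0.0
    score = 0.0
    for c in clusters.values():
        proportion = c["n_cells"] / total
        score += proportion * (-math.log10(c["fdr"])) * c["reversal_ratio"]
    return score


def drop_one_out(clusters):
    """Change in drug score when each cell type is excluded in turn."""
    full = drug_score(clusters)
    changes = {}
    for name in clusters:
        rest = {k: v for k, v in clusters.items() if k != name}
        changes[name] = drug_score(rest) - full
    return changes


# Toy evidence for one drug in one patient, not data from the paper.
toy_clusters = {
    "T cell": {"n_cells": 300, "fdr": 1e-4, "reversal_ratio": 0.5},
    "epithelial cell": {"n_cells": 600, "fdr": 1e-2, "reversal_ratio": 0.25},
    "macrophage": {"n_cells": 100, "fdr": 1e-1, "reversal_ratio": 0.1},
}
for cell_type, delta in drop_one_out(toy_clusters).items():
    print(cell_type, round(delta, 3))
```

Repeating this for every drug and every patient gives a table of score changes per excluded cell type, which is the kind of summary the drop-one-out analysis reports.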

To explore the potential molecular mechanisms of the top drug candidate, we next investigated the target genes and pathways of mebendazole across the six cell clusters (Fig. 4e, f). Mebendazole targets many important genes and pathways in TNBC, such as signal transducer and activator of transcription 1 (STAT1) in the Toll-like receptor signaling pathway, Vascular Cell Adhesion Molecule 1 (VCAM1) in the NF-kappa B signaling pathway, Matrix Metallopeptidase 14 (MMP14) in the TNF signaling pathway, signal transducer and activator of transcription 2 (STAT2) in the NOD-like receptor signaling pathway, and cyclin-dependent kinase inhibitor 1A (CDKN1A) in the PI3K-Akt signaling pathway. These targeted genes and pathways are essential for the proliferation, migration, and invasion of TNBC cells^47,48, and were suggested as therapeutic drug targets for TNBC in previous studies^49,50.

Drug repurposing for precursor T cell acute lymphoblastic leukemia (Pre-T ALL)

We further applied ASGARD to the collected scRNA-seq data from two Pre-T ALL patients and three normal healthy controls²². ASGARD identifies eight types of cells (Fig. 5a), in which T cells are further clustered into four sub-populations (Fig. 5b). Cluster 1 (C1) is the largest one, covering 47.29% of cells, while cluster 4 (C4) is the smallest, accounting for only 2.11% of cells (Fig. 5b). The differentially expressed genes (adjusted P-value < 0.05, Pre-T ALL vs. normal) in the T cell clusters are significantly enriched in 6 pathways, including apoptosis, cell cycle, cGMP-PKG signaling, NF-kappa B signaling, p53 signaling, and T cell receptor signaling pathways (Fig. 5c).

Among the drugs predicted by ASGARD, the first candidate, tretinoin, has been approved for the treatment of leukemia⁵¹ (Fig. 5d and Supplementary Data 2). Tretinoin is a vitamin A derivative. We further explored the potential molecular mechanisms of this FDA-approved top-ranked candidate. Tretinoin targets many leukemia-related genes and all the significant pathways in the four T cell clusters, including: the regulator MDM4 in the p53 signaling pathway, cyclin D3 (CCND3) in the cell cycle and p53 signaling pathways, G protein subunit alpha q (GNAQ) and phospholipase C beta 1 (PLCB1) in the cGMP-PKG signaling pathway, Fos proto-oncogene (FOS) and p21 (RAC1) activated kinase 2 (PAK2) in the T cell receptor signaling pathway, spectrin alpha non-erythrocytic 1 (SPTAN1) in the apoptosis pathway, and zeta chain of T cell receptor-associated protein kinase 70 (ZAP70) in the apoptosis and NF-kappa B signaling pathways (Fig. 5e). All these genes and pathways were previously shown to have significance in the pathogenesis of Pre-T ALL^52,53,54. The drug target genes and pathways in the T cell clusters explain why ASGARD predicts tretinoin for leukemia and how tretinoin treats leukemia.

Drug repurposing for severe patients with coronavirus disease 2019 (COVID-19)

The immune response activated by SARS-CoV-2 infection is a double-edged sword. It protects the human body from viral infection, but the deregulated immune response in severe COVID-19 patients damages the alveoli and causes respiratory failure that kills the patients^55,56. To find drugs that may help to reduce the mortality of severe COVID-19, we collected scRNA-seq data from the bronchoalveolar lavage fluid (BALF) of 15 severe COVID-19 patients^13,23. Among them, 11 patients were cured (cured severe patients), while four died afterward (deceased severe patients). To identify immune cells that correlate with the death of severe patients, we compared the scRNA-seq data between deceased severe and cured severe patients. In total, there are seven types of cells, including six types of immune cells and epithelial cells (Fig. 6a), in the BALF samples collected from severe COVID-19 patients. Monocytes are the largest cell population in both deceased and cured severe COVID-19 patients (Fig. 6b). The populations of neutrophils, NK cells, T cells, and monocytes increased in deceased severe COVID-19 patients compared to the cured ones, suggesting the important role of these four types of cells in COVID-19-related death^57,58,59,60 (Fig. 6b). The differentially expressed genes (adjusted P-value < 0.05, deceased severe vs cured severe) in the four types of cells are significantly enriched (adjusted P-value < 0.05) in 8 pathways, including chemokine signaling, coronavirus disease-COVID-19, IL-17 signaling, JAK-STAT signaling, NF-kappa B signaling, T cell receptor signaling, TNF signaling and Toll-like receptor signaling pathways (Fig. 6c). The coronavirus disease-COVID-19 pathway is the most significant pathway in these cells, as expected. Chemokine signaling, NF-kappa B signaling, TNF signaling, and Toll-like receptor signaling pathways are the most widely enriched pathways in all four types of cells. The T cell receptor signaling pathway is only enriched in T cells.

**Fig. 6: Drug repurposing for reducing mortality of severe COVID-19 patients.**

We identified the differential gene expression profiles of the four cell types, including neutrophil, NK cell, T cell, and monocyte, by comparing deceased severe patients to cured severe ones. We then input the differential gene expression profiles into ASGARD to identify drug candidates using the multi-cluster drug score. Among the predicted drugs, rescinnamine (2nd) and enalapril (4th) caught our attention (Fig. 6d, Supplementary Data 3). Both rescinnamine and enalapril are angiotensin-converting enzyme (ACE) inhibitors. Angiotensin-converting enzyme 2 (ACE2) mediates SARS-CoV-2 cell entry. It is interesting that rescinnamine and enalapril are predicted by ASGARD for treating severe COVID-19, so we further explored their target genes and pathways in the four cell types. Rescinnamine and enalapril share most of the key genes on all the significant pathways in monocytes, NK cells, neutrophils, and T cells (Fig. 6e). In monocytes, rescinnamine and enalapril share 47 key target genes, including Janus Kinase 1 (JAK1), Janus Kinase 2 (JAK2), C-C Motif Chemokine Ligand 2 (CCL2), C-C Motif Chemokine Ligand 4 (CCL4), and C-C Motif Chemokine Ligand 8 (CCL8), and all the 7 significant pathways. In NK cells, rescinnamine and enalapril share 35 key target genes from 6 significant pathways, such as JAK1, Janus Kinase 3 (JAK3), CCL4, tumor necrosis factor (TNF), and Signal Transducer and Activator of Transcription 2 (STAT2). In neutrophils, rescinnamine and enalapril share 16 key target genes, such as CCL2, CCL8, C-X-C Motif Chemokine Ligand 8 (CXCL8) and C-X-C Motif Chemokine Ligand 10 (CXCL10), and all the 5 significant pathways. In T cells, rescinnamine and enalapril share 30 key target genes, such as CCL2, CCL8, C-X-C Motif Chemokine Ligand 9 (CXCL9), JAK3, TNF, and Lymphocyte Cytosolic Protein 2 (LCP2), and all the 6 significant pathways of T cells. The shared target genes and pathways in the corresponding cells were previously shown to be related to death from COVID-19^57,58,59,60.

Discussion

This study presents A Single-cell Guided Pipeline to Aid Repurposing of Drugs (ASGARD) as a new generation of personalized drug recommendation system. To evaluate the accuracy of ASGARD in single drug repurposing, we compared ASGARD to other repurposing methods that utilize bulk cell RNA-Seq (CLUE and DrInsight) or single-cell RNA-Seq data (Alakwaa’s and Guo’s) on a variety of diseases, including breast cancer, leukemia, and COVID-19. ASGARD performs much better than all these methods in predicting drugs/compounds (Figs. 2, 3, Supplementary Fig. 2). The performance of ASGARD is also robust across different sizes and proportions of cell populations, as well as differential expression levels (Supplementary Fig. 3). Moreover, we highlight that ASGARD defines a drug score to summarize drug efficacy across multiple selected cell clusters. These important functions are missing in the simpler single-cell RNA-Seq drug repositioning pipelines by Alakwaa and Guo. Both Alakwaa’s and Guo’s pipelines use the CLUE platform, a cloud-based platform developed by the LINCS Center for signature-gene based drug ranking⁶¹. Additionally, Guo’s method uses log fold change as an additional threshold to filter the gene query. ASGARD on each cluster is related to DrInsight, a concordantly expressed genes (CEG) based drug repurposing method that is enhanced compared to other signature-based searching methods³⁰. It uses order statistics to directly measure the concordance (e.g. inverse association) between the disease data and drug-perturbed data and identifies concordantly expressed genes (CEGs). CEGs are used as features to further formulate an outlier sum statistic for drug selection, rather than the connectivity score (usually −90) based cut-off for drug selection. The CEG and outlier sum statistic contribute to higher performance in ASGARD.

ASGARD achieves drug ranking for the disease/patient by a drug score that evaluates the treatment efficacy across the user-selected cell clusters (Formula 1 in “Methods” section). The prediction using the multi-cluster drug score shows a significantly (P-value < 0.05, Student’s t test) better AUC than the prediction based on individual clusters (Figs. 2 and 3). This suggests that targeting an individual cell cluster is not sufficient for successful drug prediction. Instead, targeting multiple essential diseased cell clusters is a more appropriate strategy for drug prediction. On the other hand, it is not ideal to propose drug repurposing using bulk RNA-seq, a mixture of all cells, as done by traditional methods (e.g. CLUE and DrInsight). Significant heterogeneity exists among different cell populations; not all these cells play equal roles in the diseases^62,63, as reflected by different gene expression responses to drug treatment⁶⁴. ASGARD can distinguish more important cell types from others and repurpose drugs accordingly, explaining why ASGARD has significantly (P-value < 0.05, Student’s t test) better AUC performance than traditional bulk methods (Fig. 2b). Moreover, ASGARD also demonstrates variations in drug scores across different patients (Figs. 4b, 5d, and 6d). This result stresses that personalized therapy is necessary for the best therapeutic effect, and utilizing single-cell sequencing information may help to achieve that.

Sparsity and heterogeneity are two major challenges in analyzing single-cell data, and usually cause false discoveries of differentially expressed genes⁶⁵. A previous benchmark study showed that Seurat¹⁸, DESeq2²⁶, edgeR²⁷, and Limma²⁵ are among the top methods in discovering differentially expressed genes using single-cell data²⁸. Here we compared the effect of these methods and different parameterizations on the downstream drug repurposing, using the AUC metric. AUC performance still varies with the method used for single-cell differential expression (Fig. 2a). The Limma (Bayes) method showed the best average AUC performance compared with the other three methods. Within the Limma method, approaches that model the mean-variance relationship with the empirical Bayes approach (Limma Bayes and Limma trend) showed better AUCs than the precision weights approach (Limma voom). Similar results were observed in some of the comparisons in a benchmark study of single-cell differential expression²⁸. The empirical Bayes approach is usually more powerful than the precision weights approach when the library sizes are not highly variable between samples⁶⁶. Seurat, a method widely used in single-cell studies, has the 2nd best AUCs in general. In particular, the default mode of Wilcoxon rank-sum test in Seurat has a slightly better average AUC than the t-test and logistic regression (LR) modes. Our comparison revealed that DE methods should be carefully selected according to the characteristics of the dataset to achieve the best performance. Accordingly, ASGARD was designed as a flexible framework supporting various methods for single-cell differential gene expression analysis.

We chose breast cancer and leukemia datasets to illustrate the utility of ASGARD, given the relative abundance of prior drug knowledge. The FDA has approved many drugs predicted by ASGARD, such as neratinib and vinblastine for treating breast cancer^36,37 (Fig. 4b and Supplementary Fig. 4c), and tretinoin for treating leukemia⁵¹ (Fig. 5d). Vinblastine and neratinib were predicted for breast cancer in both the TNBC patient and PDX datasets. Vinblastine is a vinca alkaloid that has been used in the treatment of metastatic breast cancer since the early 1980s. The regimen of vinblastine/mitomycin is an effective salvage regimen and an excellent first-line chemotherapeutic treatment for women with metastatic breast cancer⁴¹. Neratinib is a protein kinase inhibitor that was approved in July 2017 as an extended adjuvant therapy in breast cancer³⁷. Recently, a randomized phase III clinical trial of 621 patients from 28 countries showed that neratinib significantly improved the progression-free survival of patients with advanced breast cancer⁶⁷. Tretinoin, also known as all-trans-retinoic acid (ATRA), is the first candidate predicted by ASGARD for leukemia. Tretinoin targets all significant pathways, such as the p53 signaling, cell cycle, and apoptosis pathways, for each diseased cell cluster in leukemia patients (Fig. 5e). These pathways play important roles in the survival of leukemia patients⁵⁴. Consistent with our prediction, tretinoin was approved by the FDA to induce remission in patients with acute leukemia⁵¹. Tretinoin significantly improves the survival of patients with acute leukemia⁶⁸. Tretinoin with chemotherapy has become the standard treatment for acute leukemia, resulting in cure rates exceeding 80%⁶⁹. The successful prediction of FDA-approved drugs supports the reliability of ASGARD.

Beyond the above-described FDA-approved cases, ASGARD also predicts candidate drugs for breast cancer and leukemia. Crizotinib is a candidate from both the TNBC patient and PDX model data (Fig. 4b and Supplementary Fig. 4c). Crizotinib is a receptor tyrosine kinase inhibitor that inhibits the growth, migration, and invasion of breast cancer cells in preclinical studies^43,44. A case report showed that a TNBC patient harboring an ALK fusion mutation had a dramatic response to crizotinib treatment⁶⁹. It is also in a phase 2 clinical trial for treating patients with TNBC (ClinicalTrials.gov Identifier: NCT03620643). For leukemia, ASGARD predicts vorinostat, a histone deacetylase (HDAC) inhibitor, as one of the candidate drugs. Vorinostat was approved by the FDA for treating patients with progressive, persistent, or recurrent cutaneous T-cell lymphoma⁷⁰. It induces cell apoptosis in one T cell leukemia cell line in vitro⁷¹, and improves the outcome of acute T cell lymphoblastic leukemia in animal models⁷². The results of a recent clinical trial (ClinicalTrials.gov Identifier: NCT04467931) show that vorinostat is a promising candidate drug for T cell acute lymphoblastic leukemia⁷³.

Since ASGARD repurposes candidate drugs to reverse “diseased” cells to “normal” cells, it is important to set proper controls according to the aim of the study. The best controls are arguably from the normal tissues of the same patients, and the next best ones are from patients without the disease but matched on other major confounders. Although consortiums such as the Human Cell Atlas aim to obtain normal tissues from clinically healthy samples, it may not be easy to obtain normal samples for some diseases. Under such scenarios, samples from the very early stage of the disease or controls from tissues with the most relevant origin could be substitutes until data from the normal tissues are available. Additionally, a recent report has proposed using a deep-learning-based approach to identify the best reference control tissue, an attractive strategy that relies much less on prior assumptions⁷⁴. ASGARD builds its drug reference from drug response data from the LINCS L1000 project²¹, which were collected from 98 cell lines. We divided the drug reference into several tissue-specific drug references according to the tissue origin of the cell lines. It is highly recommended to select the tissue-specific drug reference when using ASGARD. For example, for drug repurposing in breast cancer, it is best to use drug responses collected from breast cell lines. If there is no proper tissue-specific reference for the target disease, it might be worth identifying the cell line whose baseline gene expression profiles are most similar to the “control” samples, after adjusting for systematic differences between cell lines and primary tissues (e.g., using transfer learning).

Altogether, this study shows clear evidence that ASGARD defines a reliable single-cell-based drug score for repurposing drugs with confidence; its top-ranked predictions include drugs that were approved or in clinical trials for breast cancer, leukemia, and COVID-19, respectively. It also suggests new applications for drugs that warrant further clinical studies. In all, ASGARD is a single-cell guided pipeline with significant potential to recommend drugs for repurposing.

Methods

Single-cell RNA sequencing (scRNA-seq) data

We obtained multiple scRNA-seq datasets from the Gene Expression Omnibus (GEO) database. ScRNA-seq data of cells from 4 Triple-Negative-Breast-Cancer (TNBC) patients and 4 healthy controls are from “GSE161529”. Epithelial cells from Patient-Derived Xenograft (PDX) models of 2 patients with advanced metastatic TNBC and adult human breast epithelial cells from 3 healthy women are from GEO with accession numbers “GSE123926”¹¹ and “GSE113197”²⁴, respectively. Another scRNA-seq dataset, of pediatric bone marrow mononuclear cells (PBMMCs) from 2 Pre-T acute lymphoblastic leukemia patients and 3 healthy controls, is from GEO with accession number “GSE132509”²². The last set of scRNA-seq data are single cells from the bronchoalveolar lavage fluid (BALF) of 15 severe COVID-19 patients (4 deceased and 11 cured) from GEO with accession numbers “GSE145926”¹³ and “GSE158055”²³.

Processing of scRNA-seq data

ASGARD accepts processed scRNA-seq data from the Seurat package¹⁸. In this study, genes identified in fewer than three cells are removed from the dataset. We used the same criteria as in the original studies to filter cells^11,13,24. Preprocessing steps remove the following cells from the dataset: (1) epithelial cells from breast cancer PDXs and healthy breast tissues with fewer than 200 unique genes, (2) PBMMCs from leukemia patients and healthy controls with fewer than 200 unique genes, and (3) BALF cells from COVID-19 patients with fewer than 200 unique genes, more than 6000 unique genes, or a proportion of mitochondrial genes larger than 10%¹³. For consistency, cells from TNBC patients with fewer than 200 unique genes are also removed from the dataset³⁹. We used cell cycle marker genes and linear transformation to scale the expression of each gene and remove the effects of the cell cycle on gene expression.
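The filtering thresholds described above can be illustrated with a minimal sketch in plain Python. It is not ASGARD or Seurat code: the data layout (a dictionary mapping cell IDs to per-gene count dictionaries), the function names `filter_genes` and `passes_qc`, and the use of an "MT-" prefix to recognize mitochondrial genes are all assumptions made for illustration.

```python
def filter_genes(cells, min_cells=3):
    """Drop genes detected (count > 0) in fewer than `min_cells` cells.

    `cells` maps a cell ID to a {gene: count} dictionary (hypothetical layout).
    """
    detected = {}
    for counts in cells.values():
        for gene, n in counts.items():
            if n > 0:
                detected[gene] = detected.get(gene, 0) + 1
    keep = {g for g, c in detected.items() if c >= min_cells}
    return {cid: {g: n for g, n in counts.items() if g in keep}
            for cid, counts in cells.items()}


def passes_qc(counts, min_genes=200, max_genes=None, max_mito_frac=None):
    """Return True if one cell passes the unique-gene and mitochondrial filters."""
    n_genes = sum(1 for n in counts.values() if n > 0)
    if n_genes < min_genes:
        return False
    if max_genes is not None and n_genes > max_genes:
        return False
    if max_mito_frac is not None:
        total = sum(counts.values())
        # Assumption: mitochondrial genes are named with an "MT-" prefix.
        mito = sum(n for g, n in counts.items() if g.startswith("MT-"))
        if total > 0 and mito / total > max_mito_frac:
            return False
    return True
```

Under these assumptions, the BALF criteria would correspond to `passes_qc(counts, min_genes=200, max_genes=6000, max_mito_frac=0.10)`, while the other datasets would use only the default lower bound of 200 unique genes.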

Cell pairwise correspondences

ASGARD suggests using functions from Seurat for cell pairwise correspondences. In this study, gene counts for each cell were divided by the total counts for that cell and multiplied by a scaling factor (default is set to 10,000). The count matrix was then transformed by log2(count + 1) in R. To identify gene variance across cells, we first fitted a line to the relationship of log(variance) and log(mean) using local polynomial regression (loess). Then we standardized the feature values using the observed mean and expected variance (given by the fitted line). Gene variance was then calculated on the standardized values. We used the 2000 genes with the highest standardized variance for downstream analysis. Then we identified the K-nearest neighbors (KNNs) between disease and normal cells based on the L2-normalized canonical correlation vectors (CCV). Finally, we built up the pairwise cell correspondences by identifying mutual nearest neighbors¹⁸.
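Two of the steps above, the per-cell normalization and the mutual-nearest-neighbor pairing, can be sketched in a few lines of plain Python. This is an illustration only, not Seurat's implementation: the function names are hypothetical, and the sketch assumes that a precomputed matrix of distances between disease and normal cells (for example, distances in CCV space) is already available.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Scale one cell's counts to `scale_factor` in total, then apply log2(x + 1)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log2(c / total * scale_factor + 1) for c in counts]


def mutual_nearest_neighbors(dist, k=1):
    """Pair disease cell i with normal cell j when each is among the other's k nearest.

    dist[i][j] is the distance between disease cell i and normal cell j
    (assumed to be precomputed by the caller).
    """
    n_disease = len(dist)
    n_normal = len(dist[0]) if dist else 0
    nn_of_disease = [
        set(sorted(range(n_normal), key=lambda j: dist[i][j])[:k])
        for i in range(n_disease)
    ]
    nn_of_normal = [
        set(sorted(range(n_disease), key=lambda i: dist[i][j])[:k])
        for j in range(n_normal)
    ]
    return sorted(
        (i, j)
        for i in range(n_disease)
        for j in nn_of_disease[i]
        if i in nn_of_normal[j]
    )
```

A pair is kept only when the neighbor relation holds in both directions, which is what distinguishes mutual nearest neighbors from an ordinary KNN lookup.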

Cell clustering and annotation

We applied principal component analysis (PCA) from Seurat on the scaled data to perform the linear dimensional reduction. Then we used a graph-based clustering approach¹⁸. In this approach, we first constructed a KNN graph based on the Euclidean distance in PCA space and refined the edge weights between any two cell pairs using Jaccard similarity. Then we applied the Louvain algorithm of modularity optimization to iteratively group cell pairs together. We further ran non-linear dimensional reduction (UMAP) to place similar cells within the graph-based clusters determined above together in low-dimensional space. To annotate clusters of cells, we ran an automatic annotation of single cells based on similarity to the referenced single-cell panel using the SingleR package⁷⁵. We used the dominant cell type (>50% of cells) as the cell type of the cluster.
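The final annotation rule, assigning a cluster the cell type carried by more than 50% of its cells, can be sketched as follows. The function name is hypothetical, and returning `None` when no cell type reaches a majority is an assumption of this sketch; the text does not say how such clusters are handled.

```python
from collections import Counter


def annotate_cluster(cell_labels, min_fraction=0.5):
    """Return the label carried by more than `min_fraction` of the cells, else None."""
    if not cell_labels:
        return None
    label, count = Counter(cell_labels).most_common(1)[0]
    return label if count / len(cell_labels) > min_fraction else None
```

Here `cell_labels` stands for the per-cell labels produced by an automatic annotator for one cluster.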

Drug repurposing

ASGARD supports importing differentially expressed genes calculated from multiple external methods, including Limma²⁵, Seurat (Wilcoxon Rank-Sum test)¹⁸, DESeq2²⁶, and edgeR²⁷. The differentially expressed gene list in disease is transformed into a gene rank list. ASGARD uses 21,304 drugs/compounds with response gene expression profiles in 98 cell lines from the LINCS L1000 project²¹. A differential gene expression list in response to drug treatment is also transformed into a gene rank list. ASGARD further identifies potential candidate drugs that yield reversed gene expression patterns from those of diseased vs. normal cells, using the DrInsight package³⁰ (version: 0.1.1). Specifically, it identifies consistently differentially expressed genes, which are up-regulated in cells from diseased tissue but down-regulated in cells with drug treatment, or down-regulated in cells from diseased tissue but up-regulated in cells with drug treatment. It then calculates the outlier-sum (OS) statistic⁷⁶, representing the effect of the reversed differential gene pattern under drug treatment. The Kolmogorov–Smirnov test (K-S test) is then applied to the OS statistic to show the significance level of one drug treatment relative to the background of all other drugs in the dataset. The reference drug dataset contains gene rank lists of 591,697 drug/compound treatments from the LINCS L1000 data, as mentioned above. The Benjamini-Hochberg procedure is used to adjust P-values from the K-S test to control the false discovery rate (FDR) under multiple hypothesis testing⁷⁷.
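The multiple-testing adjustment at the end of this step is the standard Benjamini-Hochberg procedure. A minimal sketch in plain Python is given below for readers who want to see the arithmetic; the function name is hypothetical and this is not the code used by ASGARD or DrInsight.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotone and capped at 1.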

Drug evaluation

ASGARD defines a drug score at the individual patient level (Formula 1), which calculates the drug efficacy across all single-cell clusters in a given patient’s scRNA-Seq data. The drug score estimates drug efficacy using the cell type proportion, the significance of reversed differential gene expression pattern (FDR), and the ratio of reversed significantly deregulated genes over disease-related (or selected) single-cell clusters. The drug score is estimated by the following formula:

$${Drug}\,{score}=\sum_{k=1}^{n}\left(\frac{{Num}({Cell})_{k}}{{Num}({Total}.{Cell})}\times\left(-\log_{10}({FDR}_{k})\right)\times\frac{{Num}({ReversedGene})_{k}}{{Num}({DiseasedGene})_{k}}\right)$$

(1)

In this formula, $k$ indexes a particular single-cell cluster and $n$ is the number of disease-related (or selected) single-cell clusters. $\frac{{Num}({Cell})_{k}}{{Num}({Total}.{Cell})}$ is the cellular proportion of cluster $k$ among all diseased cells, $-\log_{10}({FDR}_{k})$ is the significance of the reversed differential gene pattern in cluster $k$ under drug treatment, and $\frac{{Num}({ReversedGene})_{k}}{{Num}({DiseasedGene})_{k}}$ is the ratio of disease-related genes reversed by drug treatment. Specifically, ${Num}({Total}.{Cell})$ is the total number of cells in the sample and ${Num}({Cell})_{k}$ is the number of cells in cluster $k$. ${FDR}_{k}$ is the drug’s FDR-adjusted P-value (significance of reversed differential gene pattern) for cluster $k$. ${Num}({DiseasedGene})_{k}$ is the number of significantly deregulated genes in cluster $k$, while ${Num}({ReversedGene})_{k}$ is the number of those genes that can be reversed by the drug. To allow a comparison of drug efficacy across patients, ASGARD also provides a standardized drug score, which has a scale of 0 to 1 (Formula 2).

$${Standardized}\,{Drug}\,{Score}\,=\,1-\frac{{Rank}({Drug})}{{Total}\,{Num}({Drug})}$$

(2)
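Formulas 1 and 2 can be worked through with a short sketch in plain Python. It is meant only to make the arithmetic concrete and is not the ASGARD R implementation. The dictionary keys, the function names, the `min_fdr` floor that guards against taking the logarithm of zero, the skipping of clusters without deregulated genes, and the convention that rank 1 goes to the drug with the highest score are all assumptions of this sketch.

```python
import math


def drug_score(clusters, total_cells, min_fdr=1e-300):
    """Formula 1: sum over clusters of proportion * -log10(FDR) * reversed ratio.

    `clusters` is a list of dicts with the hypothetical keys
    n_cells, fdr, n_reversed and n_diseased.
    """
    score = 0.0
    for c in clusters:
        if c["n_diseased"] == 0:
            continue  # assumption: no deregulated genes means no contribution
        proportion = c["n_cells"] / total_cells
        significance = -math.log10(max(c["fdr"], min_fdr))
        reversed_ratio = c["n_reversed"] / c["n_diseased"]
        score += proportion * significance * reversed_ratio
    return score


def standardized_scores(scores):
    """Formula 2: 1 - rank / N, assuming rank 1 is the highest drug score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    return {drug: 1 - rank / n for rank, drug in enumerate(ranked, start=1)}
```

For example, a cluster holding half of the cells, with an FDR of 0.01 and 5 of 10 deregulated genes reversed, contributes 0.5 × 2 × 0.5 = 0.5 to the drug score.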

Besides the drug score, ASGARD further provides Fisher’s combined P-value⁷⁸ over the original P-value of every cluster. The combined P-value is calculated as the right-tail probability $P_{\chi^{2}(2n)}(T > t)$, where $t=-2\sum_{i=1}^{n}\log_{10}(P_{i})$. The BH FDR is used to adjust Fisher’s combined P-value. The adjusted Fisher’s combined P-value (FDR) is independent of the drug score. The FDR and drug score can be used together or independently for drug selection. By default, ASGARD uses the drug score for drug selection. Drugs with a higher drug score are expected to have better therapeutic effects than those with a lower one.
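The combination step can be sketched in plain Python. The sketch uses the textbook statement of Fisher's method, in which the statistic is built from natural logarithms so that it follows a chi-square distribution with 2n degrees of freedom under the null hypothesis; whether ASGARD's implementation takes exactly this form is not confirmed here. The function name is hypothetical. Because the degrees of freedom are always even, the right-tail probability has a closed form and no statistics library is needed.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method (natural-log form)."""
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    n = len(pvalues)
    t = -2.0 * sum(math.log(p) for p in pvalues)
    half = t / 2.0
    # Right tail of chi-square with 2n degrees of freedom:
    # exp(-t/2) * sum_{i=0}^{n-1} (t/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, n):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

With a single p-value the method returns that p-value unchanged, which is a convenient sanity check.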

Benchmarking ASGARD

We used receiver operating characteristic (ROC) curves and the areas under the ROC curves (AUCs) to compare the performance of ASGARD with that of the other two pipelines, as well as bulk methods. Since these pipelines/methods report both drugs and compounds, we let ASGARD report both drugs and compounds in the comparisons. ROCs and AUCs are calculated for each pipeline using the pROC package⁷⁹. In ROC and AUC estimation, we regarded FDA-approved drugs, and compounds that are used in advanced clinical trials or have been proven effective in animal models, as positive cases (Supplementary Data 4), and all other drugs as negative cases. To identify drugs and compounds that are used in advanced clinical trials or have been proven effective in animal models, we searched three databases (ClinicalTrials.gov, PubMed, and PubChem); all the drugs and compounds we found are listed in Supplementary Data 4.
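For readers unfamiliar with the metric, the AUC can be computed directly from its pairwise definition: the probability that a randomly chosen positive case is scored above a randomly chosen negative case, with ties counted as one half. The sketch below is an illustration in plain Python and is not the pROC package used in the study; the function name and the input layout (parallel lists of drug scores and 0/1 labels) are assumptions.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In the setting described above, `labels` would mark the approved or experimentally supported drugs as positives and all other drugs as negatives.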

We assessed the robustness of the three pipelines to different (1) sizes, (2) similarities, and (3) imbalances of single-cell populations. For (1), the bootstrapping method in R⁸⁰ generated simulation data of different sizes by randomly drawing the same number of disease and normal cells from “GSE123926” and “GSE113197”. For (2), additional simulation data were generated by adjusting the differential gene expression levels from 20% to 90% of the original differential levels of the single-cell cluster, based on “GSE123926” and “GSE113197”. For (3), simulation data were generated by randomly drawing 5000 cells with diseased cell proportions ranging from 20% to 90%, thereby yielding unbalanced populations.
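The unbalanced-population simulation in (3) can be sketched as a simple resampling routine. This is a plain Python illustration rather than the R code used in the study; the function name, the fixed seed, and the choice to sample with replacement (as in a bootstrap) are assumptions.

```python
import random


def draw_unbalanced(disease_cells, normal_cells, n_total=5000,
                    disease_fraction=0.2, seed=0):
    """Draw `n_total` cells with a chosen share of diseased cells.

    Sampling is done with replacement (an assumption of this sketch).
    Returns the sampled disease cells and the sampled normal cells.
    """
    rng = random.Random(seed)
    n_disease = round(n_total * disease_fraction)
    n_normal = n_total - n_disease
    sampled_disease = rng.choices(disease_cells, k=n_disease)
    sampled_normal = rng.choices(normal_cells, k=n_normal)
    return sampled_disease, sampled_normal
```

Varying `disease_fraction` from 0.2 to 0.9 would reproduce the range of diseased cell proportions described above.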

Drug score analysis

To examine the impact of each cell type on a drug score, we conducted in silico drop-one-out experiments, excluding one cell type from the scRNA-Seq data at a time. The difference between the new drug score and the original drug score is then calculated to reflect the contribution of each cell type to the drug prediction.
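The drop-one-out experiment can be sketched as follows, in plain Python. The sketch makes a simplifying assumption that is not stated in the text: per-cluster statistics (FDR and gene counts) are held fixed when a cell type is removed, and only the cell proportions are recomputed. The dictionary keys and function names are hypothetical.

```python
import math


def drug_score(clusters):
    """Formula 1 over a {cell_type: stats} mapping (hypothetical layout)."""
    total = sum(c["n_cells"] for c in clusters.values())
    if total == 0:
        return 0.0
    score = 0.0
    for c in clusters.values():
        proportion = c["n_cells"] / total
        significance = -math.log10(c["fdr"])
        score += proportion * significance * c["n_reversed"] / c["n_diseased"]
    return score


def drop_one_out(clusters):
    """Return, per cell type, the new score without it minus the original score."""
    full = drug_score(clusters)
    return {
        name: drug_score({k: v for k, v in clusters.items() if k != name}) - full
        for name in clusters
    }
```

A negative difference means the score falls when that cell type is removed, so the cell type was supporting the drug's ranking; a positive difference means it was diluting the score.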

To validate ASGARD, we compared it with another drug response prediction method TRANSACT⁸¹, on the TNBC dataset (“GSE161529”). Since TRANSACT is a method working on bulk gene expression data, we took the mean expression value of each gene across all cells as the pseudo-bulk expression value of that gene. To fit TRANSACT with the dataset we used, we changed two parameters, number_pc[‘target’] and n_pv, to 3 and maintained all other parameters at the same value as the authors’ original report⁸¹.
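The pseudo-bulk construction described above is a per-gene average over cells. A minimal sketch in plain Python is shown below; the function name and the input layout (a dictionary from gene to the list of per-cell expression values) are assumptions, and genes with no values are simply skipped.

```python
from statistics import fmean


def pseudo_bulk(expression):
    """Average each gene's expression across all cells.

    `expression` maps a gene name to its list of per-cell values.
    """
    return {gene: fmean(values) for gene, values in expression.items() if values}
```

The resulting gene-to-mean mapping plays the role of one bulk sample that a bulk-based method can take as input.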

Statistics

All data are presented as mean ± standard deviation (SD), unless otherwise stated. P-values are adjusted with the Benjamini–Hochberg (BH) false discovery rate (FDR) procedure. Differences were considered significant when the adjusted P-value was < 0.05. The test used is stated in the figure legend.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

ScRNA-Seq data are available in Gene Expression Omnibus (Accession number: “GSE161529”, “GSE123926”, “GSE113197”, “GSE132509”, “GSE158055”, and “GSE145926”). Phase I LINCS L1000 data are available in Gene Expression Omnibus (Accession number: “GSE92742”). Phase II LINCS L1000 data are available in Gene Expression Omnibus (Accession number “GSE70138”). All other relevant data supporting the key findings of this study are available within the article and its Supplementary Information files or from the corresponding author upon reasonable request. Source data are provided with this paper.

Code availability

ASGARD is available as an R package on GitHub (https://github.com/lanagarmire/ASGARD)⁸² under the PolyForm Noncommercial License. The scripts used in this study are available on GitHub (https://github.com/lanagarmire/Single-cell-drug-repositioning)⁸³.

References

Dagogo-Jack, I. & Shaw, A. T. Tumour heterogeneity and resistance to cancer therapies. Nat. Rev. Clin. Oncol. 15, 81–94 (2018).
Devi, G. & Scheltens, P. Heterogeneity of Alzheimer’s disease: consequence for drug trials? Alzheimers Res. Ther. 10, 122 (2018).
Caprio, F. Z. & Sorond, F. A. Cerebrovascular disease: primary and secondary stroke prevention. Med. Clin. North Am. 103, 295–308 (2019).
Sominsky, L., Walker, D. W. & Spencer, S. J. One size does not fit all - patterns of vulnerability and resilience in the COVID-19 pandemic and why heterogeneity of disease matters. Brain Behav. Immun. 87, 1–3 (2020).
Pectasides, E. et al. Genomic heterogeneity as a barrier to precision medicine in gastroesophageal adenocarcinoma. Cancer Discov. 8, 37–48 (2018).
Molinari, C. et al. Heterogeneity in colorectal cancer: a challenge for personalized medicine? Int. J. Mol. Sci. 19, 3733 (2018).
Chen, B. et al. Harnessing big “omics” data and AI for drug discovery in hepatocellular carcinoma. Nat. Rev. Gastroenterol. Hepatol. 17, 238–251 (2020).
Heath, J. R., Ribas, A. & Mischel, P. S. Single-cell analysis tools for drug discovery and development. Nat. Rev. Drug Discov. 15, 204–216 (2016).
Huang, Q., Liu, Y., Du, Y. & Garmire, L. X. Evaluation of cell type annotation R packages on single-cell RNA-seq data. Genom. Proteom. Bioinform. https://doi.org/10.1016/j.gpb.2020.07.004 (2020).
Ortega, M. A. et al. Using single-cell multiple omics approaches to resolve tumor heterogeneity. Clin. Transl. Med. 6, 46 (2017).
Merino, D. et al. Barcoding reveals complex clonal behavior in patient-derived xenografts of metastatic triple negative breast cancer. Nat. Commun. 10, 766 (2019).
Losic, B. et al. Intratumoral heterogeneity and clonal evolution in liver cancer. Nat. Commun. 11, 291 (2020).
Liao, M. et al. Single-cell landscape of bronchoalveolar immune cells in patients with COVID-19. Nat. Med. 26, 842–844 (2020).
Lähnemann, D. et al. Eleven grand challenges in single-cell data science. Genome Biol. 21, 31 (2020).
Garmire, L. X., Yuan, G.-C., Fan, R., Yeo, G. W. & Quackenbush, J. Single cell analysis, what is in the future? in Pacific Symposium on Biocomputing 2019, 332–337 (World Scientific, 2018).
Pushpakom, S. et al. Drug repurposing: progress, challenges and recommendations. Nat. Rev. Drug Discov. 18, 41–58 (2019).
Alakwaa, F. M. Repurposing didanosine as a potential treatment for COVID-19 using single-cell RNA sequencing data. mSystems 5, e00297-20 (2020).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).
Guo, K. et al. Identification of Repurposal Drugs and Adverse Drug Reactions for Various Courses of Coronavirus Disease 2019 (COVID-19) Based on Single-cell RNA Sequencing Data. arXiv https://doi.org/10.48550/arXiv.2005.07856 (2020).
He, B. et al. Combination therapeutics in complex diseases. J. Cell. Mol. Med. 20, 2231–2240 (2016).
Subramanian, A. et al. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell 171, 1437–1452.e17 (2017).
Caron, M. et al. Single-cell analysis of childhood leukemia reveals a link between developmental states and ribosomal protein expression as a source of intra-individual heterogeneity. Sci. Rep. 10, 8079 (2020).
Ren, X. et al. COVID-19 immune features revealed by a large-scale single-cell transcriptome atlas. Cell 184, 1895–1913.e19 (2021).
Nguyen, Q. H. et al. Profiling human breast epithelial cells using single cell RNA sequencing identifies cell diversity. Nat. Commun. 9, 2028 (2018).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
McCarthy, D. J., Chen, Y. & Smyth, G. K. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40, 4288–4297 (2012).
Squair, J. W. et al. Confronting false discoveries in single-cell differential expression. Nat. Commun. 12, 5692 (2021).
Corsello, S. M. et al. The Drug Repurposing Hub: a next-generation drug library and information resource. Nat. Med. 23, 405–408 (2017).
Chan, J., Wang, X., Turner, J. A., Baldwin, N. E. & Gu, J. Breaking the paradigm: Dr Insight empowers signature-free, enhanced drug repurposing. Bioinformatics 35, 2818–2826 (2019).
Ching, T., Huang, S. & Garmire, L. X. Power analysis and sample size estimation for RNA-Seq differential expression. RNA 20, 1684–1696 (2014).
Connell, N. T. & Berliner, N. Fostamatinib for the treatment of chronic immune thrombocytopenia. Blood 133, 2027–2030 (2019).
Johnson, L. et al. Novel colchicine derivatives and their anti-cancer activity. Curr. Top. Med. Chem. 17, 2538–2558 (2017).
Lv, X. et al. G-1 inhibits breast cancer cell growth via targeting colchicine-binding site of tubulin to interfere with microtubule assembly. Mol. Cancer Ther. 16, 1080–1091 (2017).
Shinde, A. et al. Spleen tyrosine kinase-mediated autophagy is required for epithelial-mesenchymal plasticity and metastasis in breast cancer. Cancer Res. 79, 1831–1843 (2019).
Farooq, M. & Patel, S. P. Fulvestrant. In StatPearls (2020).
Neratinib for breast cancer. Aust Prescr 42, 209–210 (2019).
Bower, J. J. et al. Patterns of cell cycle checkpoint deregulation associated with intrinsic molecular subtypes of human breast cancer cells. NPJ Breast Cancer 3, 9 (2017).
Pal, B. et al. A single-cell RNA expression atlas of normal, preneoplastic and tumorigenic states in the human breast. EMBO J. 40, e107333 (2021).
Neratinib approved by FDA for breast cancer. National Cancer Institute https://www.cancer.gov/news-events/cancer-currents-blog/2017/neratinib-breast-cancer-fda (2017).
Sedlacek, S. M. First-line and salvage therapy of metastatic breast cancer with mitomycin/vinblastine. Oncology 50 (Suppl. 1), 16–21 (1993).
Chai, J.-Y., Jung, B.-K. & Hong, S.-J. Albendazole and mebendazole as anti-parasitic and anti-cancer agents: an update. Korean J. Parasitol. 59, 189–225 (2021).
Ayoub, N. M., Al-Shami, K. M., Alqudah, M. A. & Mhaidat, N. M. Crizotinib, a MET inhibitor, inhibits growth, migration, and invasion of breast cancer cells in vitro and synergizes with chemotherapeutic agents. OncoTargets Ther. 10, 4869–4883 (2017).
Smith, B. et al. Single oral dose acute and subacute toxicity of a c-MET tyrosine kinase inhibitor and CDK 4/6 inhibitor combination drug therapy. Am. J. Cancer Res. 8, 183–191 (2018).
Mourragui, S. M. C. et al. Predicting patient response with models trained on cell lines and patient-derived xenografts by nonlinear transfer learning. Proc. Natl Acad. Sci. USA 118, e2106682118 (2021).
Yang, W. et al. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 41, D955–D961 (2013).
Costa, R. L. B., Han, H. S. & Gradishar, W. J. Targeting the PI3K/AKT/mTOR pathway in triple-negative breast cancer: a review. Breast Cancer Res. Treat. 169, 397–406 (2018).
Sasidharan Nair, V., Toor, S. M., Ali, B. R. & Elkord, E. Dual inhibition of STAT1 and STAT3 activation downregulates expression of PD-L1 in human breast cancer cells. Expert Opin. Ther. Targets 22, 547–557 (2018).
Verhoeven, Y. et al. The potential and controversy of targeting STAT family members in cancer. Semin. Cancer Biol. 60, 41–56 (2020).
Ling, B. et al. A novel immunotherapy targeting MMP-14 limits hypoxia, immune suppression and metastasis in triple-negative breast cancer models. Oncotarget 8, 58372–58385 (2017).
Wong, W. M. Tretinoin in the treatment of acute promyelocytic leukemia. Cancer Pract. 4, 220–223 (1996).
Sawai, C. M. et al. Therapeutic targeting of the cyclin D3:CDK4/6 complex in T cell leukemia. Cancer Cell 22, 452–465 (2012).
Nagata, K., Ohtani, K., Nakamura, M. & Sugamura, K. Activation of endogenous c-fos proto-oncogene expression by human T-cell leukemia virus type I-encoded p40tax protein in the human T-cell line, Jurkat. J. Virol. 63, 3220–3226 (1989).
Raetz, E. A. & Teachey, D. T. T-cell acute lymphoblastic leukemia. Hematol. Am. Soc. Hematol. Educ. Program 2016, 580–588 (2016).
Elezkurtaj, S. et al. Causes of death and comorbidities in hospitalized patients with COVID-19. Sci. Rep. 11, 4263 (2021).
Hue, S. et al. Uncontrolled innate and impaired adaptive immune responses in patients with COVID-19 acute respiratory distress syndrome. Am. J. Respir. Crit. Care Med. 202, 1509–1519 (2020).
Bao, C. et al. Natural killer cells associated with SARS-CoV-2 viral RNA shedding, antibody response and mortality in COVID-19 patients. Exp. Hematol. Oncol. 10, 5 (2021).
Vanderbeke, L. et al. Monocyte-driven atypical cytokine storm and aberrant neutrophil activation as key mediators of COVID-19 disease severity. Nat. Commun. 12, 4117 (2021).
Swadling, L. & Maini, M. K. T cells in COVID-19 - united in diversity. Nat. Immunol. 21, 1307–1308 (2020).
Ondracek, A. S. & Lang, I. M. Neutrophil extracellular traps as prognostic markers in COVID-19: a welcome piece to the puzzle. Arterioscler. Thromb. Vasc. Biol. 41, 995–998 (2021).
Lamb, J. et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313, 1929–1935 (2006).
Zhu, X., Ching, T., Pan, X., Weissman, S. M. & Garmire, L. Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization. PeerJ 5, e2888 (2017).
Cherry, C. et al. Intercellular signaling dynamics from a single cell atlas of the biomaterials response. bioRxiv https://doi.org/10.1101/2020.07.24.218537 (2020).
He, B. et al. Drug discovery in traditional Chinese medicine: from herbal fufang to combinatory drugs. Science 350, S74–S76 (2015).
Kharchenko, P. V., Silberstein, L. & Scadden, D. T. Bayesian approach to single-cell differential expression analysis. Nat. Methods 11, 740–742 (2014).
Law, C. W., Chen, Y., Shi, W. & Smyth, G. K. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 15, R29 (2014).
Saura, C. et al. Neratinib plus capecitabine versus lapatinib plus capecitabine in HER2-positive metastatic breast cancer previously treated with ≥ 2 HER2-directed regimens: phase III NALA Trial. J. Clin. Oncol. 38, 3138–3149 (2020).
Thomas, X. et al. Improvement of prognosis in refractory and relapsed acute promyelocytic leukemia over recent years: the role of all-trans retinoic acid therapy. Ann. Hematol. 75, 195–200 (1997).
Lo-Coco, F. et al. Retinoic acid and arsenic trioxide for acute promyelocytic leukemia. N. Engl. J. Med. 369, 111–121 (2013).
Vorinostat. in LiverTox: Clinical and Research Information on Drug-Induced Liver Injury (National Institute of Diabetes and Digestive and Kidney Diseases, 2020).
Gao, M. et al. Therapeutic potential and functional interaction of carfilzomib and vorinostat in T-cell leukemia/lymphoma. Oncotarget 7, 29102–29115 (2016).
Jing, B. et al. Vorinostat and quinacrine have synergistic effects in T-cell acute lymphoblastic leukemia through reactive oxygen species increase and mitophagy inhibition. Cell Death Dis. 9, 589 (2018).
Siddiqi, T. et al. Phase 1 study of the Aurora kinase A inhibitor alisertib (MLN8237) combined with the histone deacetylase inhibitor vorinostat in lymphoid malignancies. Leuk. Lymphoma 61, 309–317 (2020).
Zeng, B. et al. OCTAD: an open workspace for virtually screening therapeutics targeting precise cancer patient groups using gene expression features. Nat. Protoc. 16, 728–753 (2021).
Aran, D. et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 20, 163–172 (2019).
Tibshirani, R. & Hastie, T. Outlier sums for differential gene expression analysis. Biostatistics 8, 2–8 (2007).
He, B. & Garmire, L. Prediction of repurposed drugs for treating lung injury in COVID-19. F1000Res. 9, 609 (2020).
Yoon, S., Baik, B., Park, T. & Nam, D. Powerful p-value combination methods to detect incomplete association. Sci. Rep. 11, 6980 (2021).
Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 12, 77 (2011).
R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing (2016).
Mourragui, S. et al. Predicting patient response with models trained on cell lines and patient-derived xenografts by nonlinear transfer learning. Proc. Natl. Acad. Sci. 118, e2106682118 (2021).
He, B. et al. ASGARD is A Single-cell Guided Pipeline to Aid Repurposing of Drugs. lanagarmire/Asgard. https://doi.org/10.5281/zenodo.7582790 (2023).
He, B. et al. ASGARD is A Single-cell Guided Pipeline to Aid Repurposing of Drugs. lanagarmire/Single-cell-drug-repositioning. https://doi.org/10.5281/zenodo.7613982 (2023).

Acknowledgements

This research was supported by the National Institute of Environmental Health Sciences through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative [K01ES025434]; the US National Library of Medicine [R01 LM012373, R01 LM12907]; and the National Institute of Child Health and Human Development [R01 HD084633; to L.X.G.].

Author information

Authors and Affiliations

Department of Computational Medicine and Bioinformatics, Medical School, University of Michigan, Ann Arbor, MI, USA
Bing He, Yao Xiao, Qianhui Huang, Yuheng Du, Yijun Li & Lana X. Garmire
Department of Statistics, College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI, USA
Haodong Liang
Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, Ann Arbor, MI, USA
David Garmire
Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, MI, USA
Duxin Sun

Contributions

L.X.G. and B.H. conceived this project. L.X.G. supervised the study. B.H. wrote the ASGARD package. B.H., Y.X., H.L., Q.H., Y.D., Y.L., and D.G. performed the analysis. B.H., Y.X., H.L., D.S., and L.X.G. wrote the manuscript.

Corresponding author

Correspondence to Lana X. Garmire.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Dataset 1

Dataset 2

Dataset 3

Dataset 4

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

He, B., Xiao, Y., Liang, H. et al. ASGARD is A Single-cell Guided Pipeline to Aid Repurposing of Drugs. Nat Commun 14, 993 (2023). https://doi.org/10.1038/s41467-023-36637-3

Received: 15 April 2021
Accepted: 10 February 2023
Published: 22 February 2023
DOI: https://doi.org/10.1038/s41467-023-36637-3
