Integrative network analyses of transcriptomics data reveal potential drug targets for acute radiation syndrome

Recent political unrest has highlighted the importance of understanding the short- and long-term effects of gamma-radiation exposure on human health and survivability. In this regard, effective treatment for acute radiation syndrome (ARS) is a necessity in cases of nuclear disasters. Here, we propose 20 therapeutic targets for ARS identified using a systematic approach that integrates gene coexpression networks obtained under radiation treatment in humans and mice, drug databases, disease-gene association, radiation-induced differential gene expression, and literature mining. By selecting gene targets with existing drugs, we identified potential candidates for drug repurposing. Eight of these genes (BRD4, NFKBIA, CDKN1A, TFPI, MMP9, CBR1, ZAP70, IDH3B) were confirmed through literature to have shown radioprotective effect upon perturbation. This study provided a new perspective for the treatment of ARS using systems-level gene associations integrated with multiple biological information. The identified genes might provide high confidence drug target candidates for potential drug repurposing for ARS.

www.nature.com/scientificreports/ correlated genes in the same module. These hub genes can be predictive of novel drug targets. For example, such analyses have been used to characterize genes and pathways associated with Alzheimer's disease 7 , Schizophrenia 8 , Amyotrophic Lateral Sclerosis 9 , diabetic kidney disease 10 , and to determine potential drug targets in pathogens 11 , and other complex human diseases 12 .
To further address challenges with the discovery and development of treatments for complex diseases, repurposing of existing drugs to new diseases has become a preferred alternative to elucidating novel drugs due to high attrition rate and time. For instance, the FDA-approved radiation countermeasures Neupogen (CSF3R), Neulasta (CSF3R), and Leukine (CSF2RA) are repurposed drugs 3 . Neupogen and Neulasta are used in chemotherapyinduced neutropenia and Leukine is used in bone marrow transplant recipients 13 .
In this study, we used a network-based systems analysis of five transcriptomics datasets obtained under radiation treatment and integrated them with public drug information and disease association. This led us to predict eight targets with repurposable drugs/compounds available.

Results
The workflow used to identify drug targets for ARS is shown in Fig. 1. First, we collected ARS relevant data from public repositories. Second, we constructed gene coexpression networks from transcriptomics data across different species. We used them to identify important genes (hub genes). In the absence of relevant and highquality human data, we aimed to identify consensus across the available ARS transcriptomics data for different species to maximize the quality of predictions. Using a consensus method increases the likelihood that efficacy in a mouse model would also be observed in human trials, which may minimize the drug development costs. Next, to predict potential drug targets for ARS, these hub genes were mapped with external databases of drugs and diseases, differentially expressed genes under radiation treatments from 6 h to 7 days following radiation of 0.5-10 Gy (different doses and time). The human samples were selected from pre and post-irradiation subjects. These subjects received total body irradiation (TBI) prior to various transplantations. Finally, using literature mining, we predicted the drug targets for ARS.
Weighted gene coexpression network analysis revealed salient expression networks. We collected five transcriptomics datasets relevant to ARS that were obtained from mice and humans from the GEO database. Using these five datasets, we constructed five weighted gene coexpression networks. A minimum of 28 Biological function analysis reveals conserved pathways across networks. In the next step, we found the biological significance of modules identified in the constructed networks. Pathway enrichment analysis showed enrichment with biological processes in at least one module across five datasets (Supplementary Data 2). Next, we identified biological processes that were identified by multiple datasets. Among these, 23 gene ontology (GO), KEGG, and Reactome terms were enriched in more than one dataset ( Fig. 2A). These terms included biological processes and pathways related to immune response, xenobiotic stimulus, chemical carcinogenesis, blood coagulation, and white blood cell activation (Fig. 2B) (P-value < 0.05 and False Discovery Rate < 5%). The highest fold enrichment across all the networks (average fold enrichment) was found in chemical carcinogenesis (~ 16 fold), blood coagulation and hemostasis (~ 8 fold), and response to xenobiotic stimuli (~ 7 fold).
Hub genes found across multiple networks. In the constructed networks, intramodular connectivity was used as a measure to identify hub genes (see Materials and methods). Because of the limited data availability of ARS-relevant human data, we identified hub genes that appeared in multiple datasets as high confidence. To select the top hub genes, first, we used criteria of maximum overlap of hub genes across the different datasets. We identified the top 5, 10, and 20 percent hub genes and compared them across all five datasets. In our analysis, the top 20% resulted in more hub genes that were overlapped in the maximum number of datasets (Fig. 3A). Among www.nature.com/scientificreports/ these, we found five hub genes (Myeloid Cell Nuclear Differentiation Antigen (MNDA), Syntaxin 11 (STX11), intracellular tyrosine kinase (IKT), Slingshot Protein Phosphatase 2 (SSH2), Keratin 9 (KRT9) ) shared by all networks, 58 hub genes shared by four networks, and 412 hub genes that were shared by three networks. On the contrary, in both the top 5% and 10% hub genes, no hub genes were shared by all networks. Four networks shared only seven genes in the top 10% of hub genes, while not a single hub gen was shared by four out of five networks in the top 5%. Thus, the top 20% cutoff allowed optimal comparisons amongst the networks. Next, among the top 20% hub genes (Supplementary Data 3), we chose genes that were identified within more than one gene coexpression network. This analysis resulted in 2,351 hub genes identified within at least two coexpression networks (Supplementary Data 3). Enrichment analysis of the selected hub genes showed GO biological processes related to leukocyte migration, apoptosis, DNA replication, T cell signaling, NF-κB, and IFN-γ signaling (Fig. 3B).
Furthermore, we have investigated the biological processes that are enriched in differentially expressed in response to high radiation dose (i.e., 10 Gy). Many of the biological processes and pathways related to commonly enriched in multiple networks ( Network analysis predicts high confidence drug targets for ARS. To identify drug targets for potential drug repurposing, we identified genes that are targets of existing preclinical and FDA-approved drugs (Materials and methods) and then selected genes that were differentially expressed in response to radiation treatments.
Among the identified 2,351 hub genes, 520 genes were mapped with existing preclinical and approved drugs/compounds (see Materials and Methods). Next, the selected hub genes were compared with differentially expressed genes in ARS across 32 conditions (see Materials and Methods). To validate our findings, first, we investigated already known drug targets for ARS that are part of identified hub genes. The gene target CSF3R of two ARS drugs Neulasta and Neupogen appears in our list and they were differentially expressed in 21 different samples (upregulated in 12 samples and downregulated in 9 samples). Next, we investigated all the hub genes for differential expression. Of the 520 genes with existing drugs, 293 genes were differentially expressed in at least eight (25%) of 32 conditions (see Supplementary Data 5). Among these, we identified 81 genes that were consistently up-regulated, and 20 genes that were consistently down-regulated (Fig. 4A, Supplementary Data 5). We then analyzed these 101 differentially expressed hub genes with existing drugs and compounds. These genes include 75 genes involved in (based on Gene Ontology) metabolic processes, 59 genes in response to a stimulus, 30 genes in immune system processes, 5 genes in cell population and proliferation, and 4 genes in cell signaling (Fig. 4B).
Next, literature mining revealed that 20 (of 101) genes were relevant to the ARS (Table 2). Of these 20 genes, we identified eight genes whose response to radiation was incongruent with prognostic favorability (indicated with bold font in Table 2

Discussion
The onset of Acute Radiation Syndrome (ARS) upon exposure to high doses of radiation in a short period is a burgeoning problem as access to highly radioactive substances becomes ubiquitous. Thus, the importance of characterizing and treating this disease is an emerging endeavor. As such, we created gene correlation networks to help identify hub genes across multiple transcriptomic data for ARS to identify potential therapeutic options. Our transcriptomics-based gene networks were enriched with biological processes, including immune response and immune system process that were supported by another study 40 . While this study provided an overview of potential treatments for ARS there were significant limitations. The use of multiple species, multiple time points and multiple radiation doses means our predictions are a holistic screening of potential treatments for ARS. Within the immune response, pathways related to Blood Coagulation and platelet activation, signaling and aggregation were enriched processes in the networks. It has been shown previously that coagulopathy and lower platelet counts are physiologically-relevant symptoms of acute radiation 32 . Identification of genes that have been independently shown as relevant to ARS provides partial validation to the constructed networks. Evidence was found for eight hub genes identified as potential targets (BRD4, NFKBIA, CDKN1A, TFPI, MMP9, CBR1, ZAP70, IDH3B), that altering their expression may induce radioprotective effects. These eight genes and associated drugs that have the potential for repurposing are discussed below.
Upregulated genes. Bromodomain-Containing Protein 4 (BRD4): BRD4 Inhibition was shown to condense chromatin structure and decrease the DNA damage signaling pathway resulting in less DNA repair. Further, inhibition of BRD4 demonstrated a relaxation of the chromatin structure and more repair signalling 17 . Thus, inhibition of BRD4 using CPI-0610, a phase two clinical trial molecule, could result in a better prognosis for ARS patients. NF-kB inhibitor alpha (NFKBIA): NF-kB inhibition has been shown to increase apoptosis through the loss of regulation of the JNK pathway 22 . Additionally, increased apoptotic activity is induced by irradiation 41 . Thus inhibition of NFKBIA by Acetylsalicylic acid could increase NF-kB activity and confer an antiapoptotic state for the irradiated cells. Cyclin-dependent kinase inhibitor alpha (CDKN1A): It has been shown that active and overexpressed CDKN1A increases cell cycle arrest in the G1 phase 29 . Arrest in this phase of the cell cycle was seen in irradiated cells, and cell cycle arrest is a known promoter of apoptosis 30 . A preclinical compound GGTI-298 is under investigation that can inhibit CDKN1A. Tissue Factor Pathway inhibitor (TFPI): It has been shown that inhibition of TFPI reduces hemophilia bleeding through activating coagulation pathways 31 . Since hemophilia is an indication of Radiation-Induced Disseminated Intravascular Coagulopathy (RDIC) that fresh frozen plasma, platelets or cryoprecipitated antihemophilic factor (AHF) 30 . It is possible that inhibition of TFPI or its combination with human derived products such as AHF may improve the patients' outcome by mitigating the effects of hemophilia in ARS. FDA-approved drug, dalteparin, is an inhibitor available for TFPI. Matrix Metallopeptidase 9 (MMP9): has been suggested to increase fibrosis following damage 42 ; fibrosis is increased in cells that receive ionizing radiation 43 . Additionally, captopril is an FDA-approved and known inhibitor of MMP9 13 that improves the survival of mice when exposed to high doses of radiation 33 . This protection was highly dependent on the timing of the treatment and suggests further time based inquiries of the data may elucidate time dependent physiological manifestations of ARS.
Downregulated genes. Carbonyl Reductase 1 (CBR1) encodes a protein that reduces the level of reactive oxygen species, especially lipid aldehydes. Further, it has been shown that CBR1 confers radioprotectivity to   44 . Loss-of-function in ZAP70 has been shown to lead to a systemic loss of functional T cells 36 in mice post radiation 37 . Thus, we hypothesize that upregulating ZAP70 may lead to regaining functional T cells following radiation. Staurosporine is an experimental drug known to modulate the ZAP70 function and could be further explored as a potential repurposing option for ARS. Finally, Isocitrate Dehydrogenase (NAD( +)) 3 Beta (IDH3B) facilitates the transfer of Hydrogen from Isocitrate to NAD(+); exogenous NADH has been shown to protect mice intestines after radiation injury 38 . Thus, we hypothesize that the increase of IDH3B can lead to higher levels of NADH in the system and thus improve the prognosis following radiation injury. The compound coenzyme I (in Phase 2 of clinical trials) is indeed known to interact with IDH3B and could be explored for repurposing in ARS.
In addition to the above-discussed genes, all other reported genes elucidate a physiological state that is induced under various radiation regiments. We predicted the possible activation of 12 other genes that have drugs approved by the FDA or in preclinical and clinical phases. Based on published literature, the activation of these genes may have radioprotective properties, and thus the development of corresponding activators may be useful. These include MYD88 (Myeloid Differentiation Primary Response Protein), which is found as upregulated in response to radiation and has been found to protect against radiation injuries 16 . The upregulation of Aurora Kinase A (AURKA) may also lead to protective properties against radiation as its inhibition has been found to increase radiosensitivity in hepatocellular carcinoma 18 . Similarly, activation of Protein Tyrosine Phosphatase Non-receptor Type 1 (PTPN1) has been found to increase survival in mice after irradiation. Finally, Nicotinamide N-Methyltransferase (NNMT) has been found to protect cancer cells from radiation.
Further delineation based on timing and dose could further specify which treatment and response best suit the target. Additional studies can be conducted to determine the efficacy of modulating these genes individually or in combination. Additionally, the data was garnered primarily from mice which will heavily influence the predicted treatments. Finally, while this modulation can confer protection against acute radiation syndrome, some of these genes can also provide radioprotective properties to tumor cells and hence may not represent a long-term treatment option for patients with ARS.

Conclusions
This study uses systems-level gene associations integrated with multiple biological information levels and provides a new perspective for the treatment of Acute Radiation Syndrome (ARS). The study identified eight genes with existing drugs and relevance to ARS, which may serve as high confidence drug target candidates for drug repurposing, potentially providing treatment of ARS.
Dataset collection. Gene expression datasets GSE104121 45 , GSE10640 46 , and GSE6874 47 related to ARS were collected from the GEO database 48 . These datasets were generated from studies using humans and mice. In GSE104121, peripheral whole blood samples were collected from 48 mice following 1-12 Gy total body irradiation (TBI), at 6 h, 16 h, 24 h, and 48 h after the treatment. GSE10640 was generated using human and mouse samples. Human data were collected prior to and following the total body irradiation (TBI) of patients with 1.5-2 Gy. From this dataset we used 70 human samples (36 Pre Irradiation 34 post irradiation). These samples came from 6 h after TBI of 1.5-2.0 Gy in preparation for transplantation. The mouse data were collected following 6 h, 24 h, and 7 days of TBI with 0.5, 2, and 10 Gy. GSE6874 consists of two datasets GSE6871 (human) and GSE6873 (mouse). Human data were collected from 60 samples (33 pre-irradiation and 27 post-irradiation) pre-and post-irradiated (with 1.5-2 Gy TBI) patients. Mouse data were collected following TBI varied from 0.5 Gy to 10 Gy. Differential gene expression analysis. Differential gene expression analysis was conducted for each dataset using limma 49 and affy 50 R/Bioconductor packages. The comparison was made between healthy samples and the various conditions, including radiation dose and time post-irradiation. Additionally, within GSE10640,  www.nature.com/scientificreports/ there were multiple species of mice used (Balbc and Blk); hence, differential expression was performed between healthy Balbc vs. Irradiated and similarly for the Blk mice. A total of 32 comparisons for differential expression analysis were performed across all five datasets (Supplementary Data 5). Two-fold change was considered as a significant differential expression.
Weighted gene coexpression network construction. Prior to the construction of the network, datasets were quantile normalized using the limma package in R. Next, genes were filtered based on the coefficient of variation (CV) to remove spurious correlations among genes. A total of five networks were constructed consisting of two human (one from GSE10640 one from GSE6874), and three mice networks (GSE10640, GSE6874, GSE104121).
Pearson's correlation coefficient between genes within each dataset. Assuming the biological networks show scale-free topology, the absolute correlation between genes was raised with a soft thresholding www.nature.com/scientificreports/ power (β) (Supplementary Data 1). All networks were fitted using an R^2 cutoff of 0.80 6 . The clustering of genes was performed based on the dissimilarities of genes using topological overlap measure (TOM). Average Linkage Hierarchical clustering was used for module detection in each network 6 .
Hub genes identification. Hub genes were identified in each correlation network characterized by the genes with the highest intramodular connectivity value 6 . This value determines the level of module membership a gene has; thus, a higher value of intramodular connectivity corresponds to more significant connectivity in the module. Genes among the top 20 percent were selected to ensure the highest confidence in the genes selected while simultaneously not introducing false positives by lowering the threshold.

Identification of known drugs and correlation with diseases.
To investigate if a hub gene is an existing drug target, we used the repurposing tool 51 in the ConnectivityMap 52 and DrugBank databases 13 . Additionally, we utilized UniProt 39 and OMIM databases 53 to map genes to associated diseases.
Functional enrichment analysis. Gene Ontology (GO) biological processes 54 , KEGG pathways 55 , and Reactome pathways 56 terms were utilized for enrichment analysis. Enrichment of genes within each module in the networks was performed using the anRichment 57 and anRich-mentMethods packages in R. Pathways with Benjamini-Hochberg (FDR) value < 0.05 were considered to be enriched.
A functional analysis of hub genes was performed using DAVID 58 , STRING 59 and UniProt databases. In the DAVID database and STRING database, biological processes and pathways with FDR < 0.05 were considered as enriched.