Network Analysis Reveals TNF as a Major Hub of Reactive Inflammation Following Spinal Cord Injury

Spinal cord injury (SCI) leads to reactive inflammation and other harmful events that limit spinal cord regeneration. We propose an approach for studying the mechanisms at the levels of network topology, gene ontology, signaling pathways, and disease inference. We treated inflammatory mediators as toxic chemicals and retrieved the genes and interacting proteins associated with them via a set of biological medical databases and software. We identified >10,000 genes associated with SCI. Tumor necrosis factor (TNF) had the highest scores, and the top 30 were adopted as core data. In the core interacting protein network, TNF and other top 10 nodes were the major hubs. The core members were involved in cellular responses and metabolic processes, as components of the extracellular space and regions, in protein-binding and receptor-binding functions, as well as in the TNF signaling pathway. In addition, both seizures and SCI were highly associated with TNF levels; therefore, for achieving a better curative effect on SCI, TNF and other major hubs should be targeted together according to the theory of network intervention, rather than a single target such as TNF alone. Furthermore, certain drugs used to treat epilepsy could be used to treat SCI as adjuvants.

Scientific RepoRts | (2019) 9:928 | https://doi.org/10.1038/s41598-018-37357-1 access the source data for specific details about corresponding experiments. The inflammatory cytokines associated with SCI are neurotoxins 2 , and the toxicogenomics approaches can be used to identify them. TNF was first discovered in 1968 as a cytotoxic factor induced by lymphocytes and was referred to as a lymphotoxin (LT) 28 ; therefore, it might be favorable for us to use the CTD database for collecting the SCI-gene data, and the CTD in-house scoring system for screening that data. There are thousands of curated genes associated with SCI in CTD that are available for the required bioinformatics analyses 26 , such as in this study about the effects of GAS on SCI.
In an effort to assess the effects of GAS on SCI, we designed a network-based integration and bioinformatics analysis approach, incorporating the disease-gene toxicogenomics 26 , PPI networks 22,23 , and gene ontology (GO) enrichment analysis [29][30][31][32] and disease inference 26 . First, GAS were retrieved from CTD. Subsequently, the protein interactions involved in GASs were integrated from STRING database 22,23 , and visualized via Cytoscape [33][34][35] , a popular, open source bioinformatics software platform for network analysis. Finally, by using the interacting proteins, the functions and pathways associated with SCI were inferred. As a result, the most important as well as the top 30 interacting proteins were singled out; affected functions and pathways were identified; and diseases, including neurological and psychological disorders, were predicted, which provided better insight into the influence of GAS on SCI and related diseases. This analysis approach is also expected to be useful for studying neurotrophic factors and nerve growth factors involved in SCI and its consequences. Using biological data in system-level to study disease-gene associations is able to improve our current knowledge of disease relationships, leading to further improvements in disease diagnosis, prognosis and treatment.

Material and Methods
Genes/proteins associated with sCI (GAs). GAS data were obtained from CTD 26 by searching for genes involved in SCI, resulting in a list of 12,824 GAS or their protein products, which were then sorted by the CTD in-house "inference score" in descending order. The top 30 GAS (GAS30) with high scores (47.84-33.7) were taken as the core data in this study (i.e., in this context, the GAS30 represented the genes that were most closely associated with SCI). Then, Cytoscape (version3.4.0, 2016) 34 and STRING (version 10.5) 23 were conducted to query the protein-protein interactions of the GAS30. STRING is as an application (App, plugin) installed in Cytoscape. The data in STRING are weighted and integrated and a confidence score is calculated for all protein interactions according to the nature and quality of the supporting evidence. As a result, each of these interactions is assigned a confidence score between zero (no interaction) and one (high-confidence interaction), which indicates the probability that the interaction is authentic, given the available evidence. The default cutoff for confidence interactions is 0.4 18,19 . This study utilized this default value to screen PPIs and only the interactions whose confidence scores were >0.4 were considered for network analysis. Of the established PPI network of the GAS30, all nodes were from CTD and with CTD in-house inference scores of >33, and all edges were from STRING and with STRING in-house confidence scores of >0.4. Furthermore, the plugin NetworkAnalyzer 34 in Cytoscape was used to visualize molecular interaction networks and integration with gene expression profiles and other state data. Enrichment analysis of gene ontology, pathway, and disease. Gene ontology (GO) 31,32 , a controlled vocabulary describing gene products and a useful resource for studying gene functions 36 , consists of three domains termed cellular components (CC), molecular functions (MFs), and biological processes (BP). Identifying enriched GO terms from a given gene list enables insight into the over-represented functions of genes 29 . Enrichment analysis of pathways and diseases is also an approach to the further understanding of the implicated pathways and diseases associated with SCI. Several web services such as the BinGO 37 plugin of Cytoscape, OmicsBean 38 and Set Analyzer 26 of CTD can be employed for studying enriched GO terms, pathways and diseases, respectively. Among these services, the pathway-gene relationships for enrichment analysis are from the KEGG 21 and REACTOME 39 pathway databases, whereas the MEDIC disease vocabulary 24 that combines the Medical Subject Headings (MeSH) 40 and OMIM 20 databases is used for analysis of enriched diseases. Briefly, we input the gene list of GAS30 respectively into BinGO or OmicsBean for GO term analysis; OmicsBean for pathways analysis; and the set analyzer of CTD for diseases analysis, while using a p-value of <0.05.

Results
ppI network analysis. A total of 12,824 genes were identified as associated with SCI using CTD as of December 12, 2017. Among these, the top 30 genes (GAS30) with CTD inference scores >33 are listed in Table 1. After inputting the GAS30 gene list into Cytoscape, assigning a link to STRING, and assuming that the interactions between the molecules were nondirectional and with interacting confidence scores of >0.4 in STRING, we obtained a PPI network for GAS30. Figure 1A shows a GAS30 PPI network that consists of 30 nodes and 232 edges. A node represents a molecule and an edge represents an interaction between two connected nodes. These two nodes are called neighbors. The fact that no edge is connected to a node such as myeloperoxidase (MPO) indicates that interactions between this node and others do not exist (at least their interacting confidence scores were ≤0.4) and it should be deleted from this network. The number of edges/neighbors related to a node is referred to as the degree of the node 35,41,42 . A node with a number of edges that greatly exceeds the average is referred to as hub and these play crucial roles in the network 43 . Therefore, molecules in the GAS30 network could be re-sorted by their node degrees. By use of NetworkAnalyzer 34 , a Cytoscape plugin for network topology analysis, the degrees of each node in the GAS30 network were calculated and the top 10 are listed in Table 2. Among these, the TNF node exhibited the greatest degree, and is termed a major hub. Furthermore, the top 10 interacting molecules become a sub-network of the GAS30 cohort, and are denoted as GAS10 and shown in Fig. 1B. The sub-network GAS10 comprised 10 nodes and 44 edges, leading to an extremely high clustering coefficient 41 of 0.978 and an extremely small diameter 44 of 2. That is, the 10 nodes were all major hubs and highly interconnected.   Table 2. Smaller nodes indicate the proteins whose three-dimensional structures were undetermined;larger nodes indicate the determined or predicted proteins. Colors of lines (edges) represent different interaction types.   The enriched BP analysis revealed that GAS30 could interfere with cellular responses and metabolic processes. Specifically, the process of responding to lipopolysaccharide involved 57% of GAS30 members (Fig. 2) and was promoted to the highest GO level in a significant p-value (Table 3). Lipopolysaccharide is a cell wall component of gram-negative bacteria, and is a type of endotoxin 45 that is released only when bacterial cells are destroyed or when using an artificial method to kill the microorganisms. Considerable evidence has revealed the influence of lipopolysaccharides on central nervous system (CNS) diseases. For example, lipopolysaccharides can cause  learning and memory disorders in rats subsequent to CNS inflammatory responses 46,47 , which positively supports the outcome of our GO term enrichment analysis for BPs. The enriched GO terms for CCs of the interacting proteins were mostly related to the extracellular space components, in which the first two (i.e., extracellular space and extracellular region components), exhibited the most significant p-values in CCs (Table 3) and accounted for 50% and 63% of GAS30 members (Fig. 2), respectively. A hierarchical GO tree for CCs enriched in GAS30 is presented in Fig. 3.
The MFs influenced by the interacting GAS30 proteins were mostly related to the protein-binding and receptor-binding functions, according to the enriched GO terms. Notably, protein binding accounted for the highest percentage (93%) in GAS30 in all enriched GO terms as shown in Fig. 2. Figure 4 shows a hierarchical tree of important GO terms for MFs affected by the interacting GAS30 proteins.

Pathway enrichment analysis.
To further reveal the pathways affected by interacting GAS30 proteins, analyses were performed using OmicsBean 38 , a web service for processing biological data and with links to KEGG 21 and other public databases. Following the instructions of OmicsBean, Table 4 and Fig. 5 were generated. The top 10 with the most significant p-values are listed in Table 4 and shown in Fig. 5. Specifically, the TNF signaling pathway was ranked at the top of the list, which accounted for 47% of GAS30 members.   Nervous system diseases involving TNF inferred from CTD. The diseases associated with TNF were inferred using the toxicogenomics analyses of CTD, which yielded 570 nervous system diseases associated with TNF. The top 30 diseases sorted by CTD in-house inference scores are listed in Table 5. Notably, among these, seizures had the highest score (No. 1 in Table 5).

Discussion
We considered SCI-induced inflammatory mediators as a type of toxin that inhibits the regeneration of injured tissue/cells, and we identified the associated genes and interacting proteins from known biological medical databases, CTD 26 , STRING 23 , and others, and chose the top 30 genes/proteins, GAS30, that were useful for studying their effects on SCI at the levels of network topology, GO, signaling pathways, and disease inference to provide a new visual angle for finding potential methods by which SCI intervenes. According to this study, more than 10,000 genes associated with SCI and TNF achieved the highest score. TNF, FOS, IL6, and seven other of the top 10 nodes (Table 2) were the major hubs and highly interconnected in the GAS30 PPI network that were identified using CTD, STRING, and related databases. From the perspective of network topology 48 , such a network allows for a fault-tolerant behavior for which, if a hub-failure occurs, the network will generally not lose its connectedness because of the remaining hubs that will rapidly replace the failing hub. This suggests that although TNF negatively affected SCI repair, all other major hubs, such as FOS IL6, should be targeted simultaneously in the future for the development of new therapeutic approaches, rather than aiming at individual specific genes, one at a time, which might achieve better curative effects.
Furthermore, GAS30 members interfered mainly with cellular responses and metabolic processes, extracellular space and extracellular region components, protein-binding and receptor-binding functions, and TNF signaling pathways as identified by GO and pathway enrichment analyses. Notably, the TNF signaling pathways were promoted to the highest enriched level of GAS30 members and had the most significant p-value (Table 4). Although considerable evidence has revealed the influence of TNF as an inducer of inflammatory cytokines after SCI 7-9 (e.g., neurotoxic reactive astrocytes induced by activation of microglia through secreting Il-1α, TNF, and C1q 2 ), greater attention should be paid to TNF in the future and consider it to be a major signaling pathway and its use as a crucial and potential therapeutic target for SCI repair.
In addition, seizures were highly associated with TNF by CTD disease inference (Table 5). Clinically, seizures might occur after traumatic brain injury, and interestingly, repeated seizures might develop into post-traumatic epilepsy [49][50][51] . Seizures were also observed following SCIs 52,53 . More interestingly, the antiepileptic drug valproate was used as a supplement in stem cell transplantation for a mouse model of SCI, which dramatically enhanced the restoration of hindlimb function 54 . These suggest that certain drugs used to treat epilepsy could be employed as adjuvants in SCI treatment; however, these observations and suggestions were not directly linked to TNF by the original researchers, and the mechanisms proposed are unclear. Therefore, TNF, which is the most important hub identified in this study, could be further connected to the aforementioned findings and would be a direction for future SCI studies.
There are other aspects of this study that must be mentioned. In addition to CTD, the genes/proteins associated with SCI or other diseases that were searched to construct the PPI networks and subsequent bioinformatics analyses could be from OMIM 20,55 or other publicly available databases 56 , which would generate similar results because nearly all public databases are interconnected to the Internet; therefore, the primary data mostly overlap with each other. As such, CTD includes OMIM and 10 other databases 26 . We used CTD in this study for its full name (Comparative Toxicogenomics Database) as well as its functions that matched our requirements for treating inflammatory cytokines in SCI as a type of neurotoxin. In recent years, disease genome sequencing and other high-throughput studies of disease genomes have generated many notable discoveries 17 . Direct data on disease-genes are commonly derived from RNA-seq because it is superior to other high-throughput technologies, such as microarray in accuracy, dynamic range, and differential expression detection, and has nearly completely replaced microarray for conducting genetic tests. The entries curated in OMIM have referenced 57-60 the results from RNA-seq, and NCBI 27 online accepts RNA-seq data and shares it with other databases and researchers. In addition, the data in references 2,5,8,9,[13][14][15][16][17] in this study were primarily from RNA-seq. Furthermore, a combination 61 of using RNA-seq approach with PPI network analysis generated that TNF had the largest number of connected edges in the PPI network for contusive SCI in a mouse model, and the top ranked genes in the SCI gene list overlapped considerable with ours, supporting our current study; however, our present method hardly describes the dynamic effects of GAS on SCI. On the other hand, notably, because of its pleiotropic role, TNF shows, for example, a positive effect on regulatory T cells 62 and prevents neurons from death/apoptosis by activating NF-κB 63,64 ; therefore, suppressing TNF overexpression might not be a desirable intervention for SCI therapy, and this needs to be observed further. Furthermore, to more effectively predict the SCI drug targets, the patient-specific signaling networks for reactive inflammation from SCI could be constructed using the concept of "SCI hallmarks" based on individual genomic data and on regulatory functions, just as the signaling networks of "cancer hallmarks" [13][14][15][16][17] have been developed and substantially used for revealing molecular mechanisms of cancers and drug targets. This proposed approach to analyses is also expected to be useful for studying neurotrophic factors and nerve growth factors after SCIs.
With a constantly expanding repertoire of techniques, including RNA-seq, together with new information on genes and proteins, the current results will have more possibilities for examination and modification and will advance the current approaches to SCI analysis.

Data Availability
All data generated or analyzed during this study are included in this published article.

Disease Name
Disease ID Inference Score