Network analysis reveals a stress-affected common gene module among seven stress-related diseases/systems which provides potential targets for mechanism research

Chronic stress (CS) has been reported to be associated with many complex diseases, and stress-related diseases show strong comorbidity; however, no molecular analyses have yet evaluated common stress-induced biological processes across these diseases. We used networks constructed from the genes in seven genetic databases of stress-related diseases or systems to explore the common mechanisms. Genes were connected based on the interaction information of the proteins they encode. A common sub-network comprising 561 overlapping genes and 8863 overlapping edges was identified among the seven networks, providing a common gene module among the seven stress-related diseases/systems. This module overlaps significantly with the network constructed from genes in the CS gene database. Thirty-six genes with high connectivity (hub genes) were identified from the seven networks as potential key genes in those diseases/systems; 33 of these hub genes were included in the common module. Genes in the common module were enriched in 190 interactive gene ontology (GO) functional clusters, which suggest potential disease mechanisms. In conclusion, by analyzing gene networks, we revealed a stress-affected common gene module among seven stress-related diseases/systems, which provides insight into the process by which stress induces disease and suggests potential gene and pathway candidates for further research.

Chronic stress (CS) influences multiple systems and affects the generation and development of numerous complex disorders 1,2 , such as infectious and autoimmune disorders [3][4][5] , cardiovascular events 6,7 , cancers 8,9 , mental disorders [10][11][12] , and obesity 13 . The epidemiological literature strongly indicates comorbidity among stress-related diseases 14,15 , and studies of molecular mechanisms also imply a close relationship among these diseases 16 . Additionally, recent clinical tests suggest that psychological interventions can affect patients with other stress-related diseases 17,18 . Although increasing evidence hints at a strong association among different stress-related diseases, it remains unclear whether there is a common stress-induced biological process across these diseases.
In recent years, genetic and expression studies have identified a significant number of disease-related genes, and the information has been organized into specific data resources [19][20][21] . Moreover, many gene-based bioinformatic approaches, such as gene network analysis that is based on the interaction of proteins encoded [...] genes were identified according to the following thresholds: BC > 0.05 and degree > 50 22 . The statistical significance of the differences between the properties of nodes in the disease/system networks and those in the entire interactome was examined by t-test.
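The hub-gene thresholds stated above can be illustrated with a minimal sketch. It assumes that BC denotes normalized betweenness centrality on an unweighted, undirected network; the function names (`build_adjacency`, `betweenness_centrality`, `hub_genes`) are hypothetical, and Brandes' algorithm is used here for illustration only, not as the software used in the study.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (gene_a, gene_b) pairs."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness_centrality(adj):
    """Normalized betweenness (Brandes' algorithm), unweighted and undirected."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s: count shortest paths and predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, starting from the nodes farthest from s.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Each unordered pair is visited from both ends, so dividing by
    # (n - 1)(n - 2) yields values in the range [0, 1].
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 0.0
    return {v: b * scale for v, b in bc.items()}


def hub_genes(edges, min_degree=50, min_bc=0.05):
    """Return genes with degree > min_degree and BC > min_bc."""
    adj = build_adjacency(edges)
    bc = betweenness_centrality(adj)
    return sorted(g for g in adj if len(adj[g]) > min_degree and bc[g] > min_bc)
```

The default arguments mirror the thresholds in the text; on a toy network, smaller values have to be passed for any node to qualify.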
Common gene module of stress-related diseases/systems. To examine whether a common gene module exists among different stress-related diseases/systems, nodes and edges were compared among the seven stress-related disease and system networks, and a common sub-network was constructed using the overlapping nodes and edges. The interacting nodes in this sub-network constitute the common gene module. Three properties of the module were analyzed: 1) network topological parameters; 2) overlapping genes shared between the module and hub genes; and 3) overlapping genes shared between the module and the CS gene set and network. The statistical significance of the overlap between the common module and the CS gene set and network was determined by Fisher's exact test.
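The comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: edges are treated as undirected, module nodes are taken to be the endpoints of edges shared by every network, and the overlap test is a one-sided Fisher's exact test computed from the hypergeometric distribution. The function names and the choice of gene universe are hypothetical and not taken from the study.

```python
from math import comb


def edge_set(edges):
    """Normalize undirected edges so that (a, b) and (b, a) compare equal."""
    return {frozenset(e) for e in edges if e[0] != e[1]}


def common_subnetwork(networks):
    """Return (nodes, edges) of the sub-network shared by all edge lists."""
    shared_edges = set.intersection(*(edge_set(n) for n in networks))
    shared_nodes = set().union(*shared_edges)  # endpoints of shared edges
    return shared_nodes, shared_edges


def fisher_enrichment_p(overlap, module_size, set_size, universe):
    """One-sided Fisher's exact test: P(X >= overlap) under the null.

    universe    -- number of genes in the background (an assumption here)
    module_size -- number of genes in the common module
    set_size    -- number of genes in the comparison set (e.g. the CS set)
    overlap     -- number of genes present in both
    """
    upper = min(module_size, set_size)
    total = comb(universe, module_size)
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

Intersecting edges first and deriving nodes from them keeps only nodes that still interact inside the shared sub-network, which matches the notion of an interactive module.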
Gene ontology (GO) pathway cluster enrichment analysis. To identify common biological processes underlying stress-related diseases, a GO pathway cluster enrichment analysis was performed on nodes in the disease/system common module using the online analysis tool DAVID 32 . As recommended in DAVID, the cutoff for pathway cluster enrichment was set at a score > 1.3. The representative biological terms associated with significant clusters were manually selected. Because these clusters reflect interactive functional systems, the GO term network of genes in the common module was also examined using the Cytoscape plug-in ClueGO 33 to provide a system-wide view.
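To make the 1.3 cutoff concrete: DAVID describes a cluster's enrichment score as the geometric mean of its member terms' p-values on a negative log10 scale, so a score of 1.3 corresponds to a geometric-mean p-value of about 0.05. The sketch below only illustrates that arithmetic; the function names and the input layout (cluster name mapped to a list of p-values) are hypothetical and do not reproduce DAVID's own output format.

```python
from math import log10


def cluster_enrichment_score(p_values):
    """Return -log10 of the geometric mean of the member terms' p-values."""
    return -sum(log10(p) for p in p_values) / len(p_values)


def significant_clusters(clusters, cutoff=1.3):
    """Keep clusters whose enrichment score exceeds the cutoff."""
    scores = {name: cluster_enrichment_score(ps) for name, ps in clusters.items()}
    return {name: score for name, score in scores.items() if score > cutoff}
```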

Results
Summary of genes and networks. The seven stress-related disease/system gene sets included 4637 genes (summarized in Supplementary Table S1). The seven gene sets overlapped; however, no gene occurred in all seven sets. The genes that occurred in more than four gene sets are shown in Supplementary Table S2. The CS gene set included 2606 genes (see Supplementary Table S1). A total of 3941 disease/system or CS genes were found in STRING v9.1, and 8429 nodes were included in the disease/system or CS networks (see Supplementary Table S1).
Hub genes. Node properties in each network were analyzed. As shown in Supplementary Table S3, the average degrees of the disease/system nodes were all significantly higher than the average degree of the entire STRING network. With thresholds of degree > 50 and BC > 0.05, 36 genes were identified as hub genes for the seven diseases/systems (Table 1), and the genes ESR1, TP53, FOS, AKT1, and FRN were hub genes for more than one disease/system.
Common gene module among stress-related diseases/systems. To explore the common biological modules underlying stress-related diseases/systems, the nodes and edges of the seven disease/system networks were compared. A common sub-network including 561 genes and 8863 edges was observed in the networks of all seven diseases/systems (Fig. 1b). The 561 interacting common genes (shown in Table 2) constitute the common gene module among stress-related diseases/systems. They include 180 members of the CS gene set, and all genes in this module can be found in the CS network. Nodes of the CS network significantly overlapped with the common gene module (Fisher's exact test, p < 2.2E-16). The average degrees of genes in the common module were significantly higher than those of other genes in the disease/system networks (as shown in Supplementary Table S4). Thirty-three hub genes were included in the common module; the hub genes in Table 1 that were not included in the common module were ACTN2, CDC42, and OR6A2.

Discussion
In this study, we constructed seven stress-related disease/system gene networks based on the interaction information of gene-encoded proteins. The average degrees of the disease/system genes are significantly higher than the average degree of the entire human interactome, suggesting that disease/system genes and their first neighbors are more highly connected in the human interactome than random genes, so they may act in a tighter and more complex manner. The result also supports the hypothesis that disease genes tend to have higher degrees 34,35 . A total of 36 disease/system genes were identified as hub genes that occupy central positions in the disease/system networks and may possess important biological functions. Although no gene was shared by all of the stress-related disease/system gene sets compared in this study, a common sub-network was identified among the seven diseases/systems. Genes in this sub-network were most enriched in GO pathways related to chemical homeostasis (as shown in Table 3). This result may imply that a common interactive gene module that maintains homeostasis is related to all stress-related diseases/systems, so this common module provides a potential molecular basis for the comorbidity. Because most hub genes are included in the common module, dysfunction of this module may play an important role in disease generation and development. The imbalance of homeostasis induced by aberrant expression of genes in the common module may trigger a pre-disease state 36 with the potential to develop into different pathological processes because of additional disease/system genes that were not found in the common module. In each disease/system network, the average degrees of nodes in the common module were significantly higher than those of other nodes. Considering the reports that disease genes tend to have higher degrees 34,35 , genes in the common module may be more strongly associated with pathological processes than other genes in the disease/system networks.
The CS gene set includes human genes whose rodent homologs were differentially expressed in CS rodent models. The significant overlap between the common gene module of the seven human diseases/systems and the CS network suggests that a stress environment may induce disease by influencing a common homeostasis system. Consequently, a pre-disease state progresses to different disease states as genetic factors and/or other environmental factors are stimulated. This potential mechanism may explain both the strong association between stress and disease and the high heterogeneity of the pathological processes associated with stress-related diseases. Genes in the common module (as shown in Table 2) could be useful candidates for subsequent experimental study.
The GO pathway clusters enriched by genes in the common module indicate the biological systems that are influenced by stress environments and are abnormal in disease, so they may point to the biological mechanisms by which stress environments induce disease. Beyond that, these pathway clusters also provide specific potential targets for relevant research. As shown in Supplementary Table S5, most of the common nodes are located in the extracellular space and plasma membrane-related cellular components, which suggests candidate targets for disease intervention. The biological processes associated with the common module (see Supplementary Table S6) provide a series of candidates for mechanism research, such as processes related to response, metabolism, cell differentiation and migration, transport, and signal transduction. The enriched pathway clusters of molecular function, such as peptide receptor activity and phospholipase activity, suggest potential drug targets (Supplementary Table S7). These functional pathways are interactive systems and could be grouped into several enriched groups (as shown in Supplementary Figure S1). Figure 1c shows the largest enriched interactive group, which is composed of pathways of response, regulation, cell migration, transport, and signal transduction. Beyond these system-level views, certain enriched biological process clusters provide detailed biological hypotheses for specific diseases. For example, the dysfunction of 37 genes in the function cluster "response to bacterium" (Supplementary Table S6 and Table S8) may directly mediate the process by which stress stimulates infectious disease. This function cluster also provides a potential explanation for the comorbidity among infectious diseases and other stress-related diseases.
In conclusion, we used stress-related disease/system genes to construct interactive networks. By analyzing these networks, we identified hub genes that may play roles in the pathological processes of stress-related diseases. We also identified a common sub-network among the diseases/systems, and this sub-network overlaps significantly with the CS network. The common sub-network implies that different stress-related diseases/systems share a common gene module that may be influenced by stress environments. Analysis of this common gene module could partially reveal the potential mechanism by which stress induces disease.
Despite the above results, this study has some limitations. First, we constructed the networks from existing annotation databases, which are limited by current knowledge of biology. Second, because of the lack of data resources, only seven stress-related diseases or systems were selected for analysis. Third, the CS genes were obtained via homology analysis of differentially expressed genes from CS rodent models, so results based on these genes need to be further validated in human studies.
Table 3. Top 10 pathway clusters enriched by genes of the disease/system common module.