Integrative analysis of human protein, function and disease networks

Liu, Wei; Wu, Aiping; Pellegrini, Matteo; Wang, Xiaofan

doi:10.1038/srep14344

Article
Open access
Published: 24 September 2015

Wei Liu¹,
Aiping Wu^2,3,4,
Matteo Pellegrini⁵ &
…
Xiaofan Wang¹

Scientific Reports volume 5, Article number: 14344 (2015)

Abstract

Protein-protein interaction (PPI) networks serve as a powerful tool for unraveling protein functions, disease-gene and disease-disease associations. However, a direct strategy for integrating protein interaction, protein function and diseases is still absent. Moreover, the interrelated relationships among these three levels are poorly understood. Here we present a novel systematic method to integrate protein interaction, function and disease networks. We first identified topological modules in human protein interaction data using the network topological algorithm (NeTA) we previously developed. The resulting modules were then associated with functional terms using Gene Ontology to obtain functional modules. Finally, disease modules were constructed by associating the modules with OMIM and GWAS. We found that most topological modules have cohesive structure, significant pathway annotations and good modularity. Most functional modules (70.6%) fully cover corresponding topological modules and most disease modules (88.5%) are fully covered by the corresponding functional modules. Furthermore, we identified several protein modules of interest that we describe in detail, which demonstrate the power of our integrative approach. This approach allows us to link genes and pathways with their corresponding disorders, which may ultimately help us to improve the prevention, diagnosis and treatment of disease.

Introduction

Network methods are powerful tools for unraveling protein functions, protein-pathway associations, disease-gene and disease-disease associations. However, these disparate types of networks are most often studied independently of each other. To date, there has been great progress in the study of protein interaction networks. Previous research on protein networks^{1,2,3,4,5,6,7,8,9} mainly focused on analyzing the associations between genes, functional modules and pathways. Using these approaches, usually only a fraction of the detected protein modules map well to biological functions or pathway annotations. Similarly, previous studies of disease networks^{10,11,12,13,14,15,16,17,18,19,20,21,22,23,24} mainly focused on disease classification and the prediction of disease genes. Recently, several groups have studied human disease networks^25,26 to shed light on the relationship between disease genes and disease networks, as well as disease gene modules and their functional analysis. These methods start from the diseasome²⁷, a bipartite gene-disease network from which two different disease networks can be derived: disease-disease networks and disease gene networks. Disease networks may help us to understand phenotype associations between proteins and diseases. Nevertheless, a direct strategy for integrating protein interactions, protein function and disease patterns is still absent and the interrelated relationships among these three levels have been poorly investigated.

To better understand the relationships between these three network types, we present a multi-network systematic analysis method. Using our approach, protein modules are determined directly from topological modules using the network topological algorithm we previously developed (NeTA²⁸). Traditionally, a protein module is defined as a group of proteins that carry out similar functions. These functions are associated with the same pathway and could be associated with a particular disease. Here we focus on three distinct protein modules: topological, functional and disease modules^25,26. Topological modules represent locally dense structures in protein-protein interaction (PPI) networks; functional modules represent the aggregation of proteins of related function in a function network; disease modules represent groups of proteins that share a common disease phenotype within a disease network. Though the three types of modules are derived from three different types of networks, they can be closely interrelated and highly overlapping²⁵.

Starting with the protein interaction dataset from HIPPIE²⁹, we identified 136 topological modules with NeTA, 136 corresponding functional modules (annotated using Gene Ontology³⁰) and 139 disease modules annotated using OMIM³¹ and GWAS³². To our surprise, most functional modules (70.6%) are highly consistent with the corresponding topological modules and most disease modules (88.5%) are fully covered by the corresponding functional modules and have significant pathway annotations. By systematically integrating the three levels of networks and protein modules, we found that our multi-level method for biological interpretation has distinct advantages over approaches that only consider subsets of data and annotations. Many interesting modules are found that could not be easily discovered using only one data type. For example, we identified several protein interaction modules that allowed us to connect inflammatory responses to Alzheimer’s disease, suggesting that this pathology may have a strong inflammatory component. Moreover, in many modules, we found that a subset of genes is associated with specific functions or diseases, allowing us to link genes and pathways with their corresponding disorders. The approach we present here not only provides an avenue for network integration, but also promises to shed light on the prevention, diagnosis and treatment of complex diseases.

Results

The integrated multi-networks mapping method

Figure 1 shows a schematic of our overall approach, the framework of the integrated multi-networks mapping method, which consisted of three steps. First we determined the topological modules from a human PPI network. Next, we annotated all topological modules using Gene Ontology (GO), to obtain functional modules. Finally, we included OMIM and GWAS data to obtain disease modules. Thus, three levels of networks were constructed and modules were identified at each level, including a protein network and its topological modules, a function network and its functional modules and a disease network and its disease modules. Finally, we integrate the three types of networks and modules, to discover modules that have coherent function and disease interpretation, leading to new associations that are not evident when analyzing only a single type of network.

Protein Networks and Topological Modules

The human PPI network was constructed based on the HIPPIE²⁹ and IRefWeb³³ databases, which results in a network of 2484 direct physical interactions among 1830 proteins. We detected 136 large modules and 185 small modules (most of which only contain two proteins) by applying the network topology algorithm NeTA²⁸. Here we analyze the 136 larger topological modules (as shown in Supplementary Table 1). Together, these modules contain 1390 proteins and 2228 interactions (Fig. 2b), that is, 76% of the proteins and 89.7% of the interactions of the PPI network. As Fig. 2a shows, the size of the larger modules ranges from 3 to 88 proteins. In Fig. 2b, different colors represent different modules and we can clearly see that this network has a modular structure. The modularity Q³⁴ is 0.91385, which means these modules have significantly more community structure than a random null model.
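The modularity value quoted above can be illustrated with a small sketch. It assumes the standard Newman-Girvan definition, Q = Σ_c [e_c/m − (d_c/2m)²], where m is the number of edges, e_c the number of edges inside module c and d_c the summed degree of its nodes; the text cites Q without spelling out the formula, so this choice is an assumption. The function name and the toy network are hypothetical and are not taken from the study data.

```python
from collections import defaultdict

def modularity(edges, module_of):
    """Newman-Girvan modularity Q for an undirected, unweighted graph.

    edges     : iterable of (u, v) pairs, each interaction listed once
    module_of : dict mapping every node to its module label
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("network has no edges")
    intra = defaultdict(int)   # edges with both ends in the same module
    degree = defaultdict(int)  # summed node degree per module
    for u, v in edges:
        cu, cv = module_of[u], module_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Toy network (hypothetical): two triangles joined by one bridging edge.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"),
             ("D", "E"), ("E", "F"), ("D", "F"),
             ("C", "D")]
toy_modules = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
q = modularity(toy_edges, toy_modules)
```

Placing all nodes in a single module gives Q = 0 under this definition, which is why values close to 1 are read as strong community structure.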

Function Networks and Functional Modules

To build a function network, we mapped each topological module into Bingo³⁵ and analyzed the GO enrichments of each module at three levels of Gene Ontology (GO) slim, using annotation from the Biological Process (BP), Cellular Component (CC) and Molecular Function (MF) ontologies³⁰. A functional module was defined as a group of genes in a topological module that is associated with a specific GO term. In total, we found 136 functional modules (as shown in Supplementary Table 2) with at least three proteins. If we disregard unannotated proteins (only one protein in our network has no annotation in Bingo), 96 (70.6%) of our topological modules are fully covered by functional modules (i.e. all the proteins map to the same function term). For example, topological module 3 consists of 8 proteins: COPA, COPB, COPD, COPE, COPB2, COPZ1, COPG2 and TMEDA (Fig. 3a). All eight proteins share the same BP function “Golgi vesicle transport” (p-value 2.4E-16), as well as the same CC function “cytoplasmic vesicle membrane” (p-value 9.71E-15). The remaining modules are each covered by at most two functional modules. An example is topological module 9, which has four genes, BL1S1, BL1S2, BL1S3 and SNAPN (Fig. 3b), that are associated with two functional modules: BL1S1, BL1S2 and BL1S3 (“cellular pigmentation”, p-value 4.01E-7) and BL1S1, BL1S3 and SNAPN (“vesicle-mediated transport”, p-value 3.64E-3).
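The coverage criterion described above (a topological module is fully covered when all of its proteins map to the same function term) can be sketched as a set comparison. The helper name and the dictionary layout are hypothetical; the protein lists and term names for modules 3 and 9 are copied from the paragraph above.

```python
def is_fully_covered(topological_module, functional_modules):
    """True if a single GO term annotates every protein of the module.

    functional_modules : dict mapping a GO term to the set of proteins it covers
    """
    proteins = set(topological_module)
    return any(proteins <= set(members) for members in functional_modules.values())

module_3 = {"COPA", "COPB", "COPD", "COPE", "COPB2", "COPZ1", "COPG2", "TMEDA"}
functional_3 = {
    "Golgi vesicle transport": set(module_3),
    "cytoplasmic vesicle membrane": set(module_3),
}

module_9 = {"BL1S1", "BL1S2", "BL1S3", "SNAPN"}
functional_9 = {
    "cellular pigmentation": {"BL1S1", "BL1S2", "BL1S3"},
    "vesicle-mediated transport": {"BL1S1", "BL1S3", "SNAPN"},
}

covered_3 = is_fully_covered(module_3, functional_3)
covered_9 = is_fully_covered(module_9, functional_9)

# Share of fully covered topological modules reported in the text.
coverage_percent = round(100 * 96 / 136, 1)
```

Module 9 fails the single-term test even though its two functional modules together contain all four proteins, which matches its description as a module covered by two functional modules.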

Furthermore, each topological module was annotated using the DAVID^36,37 online analysis tool to identify pathway enrichment (see Methods) and construct protein-pathway networks. We found 88 topological modules (as shown in Supplementary Table 4) significantly associated with a pathway, and the pathway genes are closely related to the corresponding functional module genes. For example, as Fig. 3c shows, topological module 23 has 17 proteins, all of which are annotated as “DNA Replication pathway” (p-value 1.09E-25), as well as “nucleoplasm” (p-value 3.06E-20).

Disease Networks and Disease Modules

To build the relationship between proteins and diseases, we mapped each topological module to the OMIM³¹ and GWAS³² databases. In total, 109 topological modules have disease genes and 139 significant disease modules (as shown in Supplementary Table 3) were identified. One topological module may correspond to one or more disease modules. For example, topological module 6 has six genes, of which EGLN, TGFB1, TGFR1 and TGFR2 are disease genes associated with Bone and Cardiovascular diseases, which we therefore label as a disease module (Fig. 4a). Another example, topological module 11, with 4 genes, contains MEIS1, MEIS2 and PBX1, which are disease genes associated with Cardiovascular, Neurological, Psychiatric, Endocrine and Respiratory diseases and these were also defined as a disease module (Fig. 4b).

The study of associations between diseases is also potentially interesting, as it could help us understand relationships between complex syndromes. We constructed a disease-disease network for each topological module. Nodes are diseases that are associated with one or more genes in the topological module and an edge between two diseases denotes that they share at least one disease gene. Closely related diseases may be associated with complex syndromes. For example, Fig. 4c shows the disease-disease network of topological module 33: Thyroid carcinoma (papillary), Carney complex (type 1), Adrenocortical tumor (somatic), Pigmented adrenocortical disease (primary) and Myxoma (intracardiac), which are all cancers and, apart from Myxoma, are also endocrine pathologies.

Integrative Analysis

Considering protein interaction, function and disease networks independently significantly limits our ability to carry out a systematic study of the data. As a result, we integrated protein, function and disease networks, in order to annotate protein modules according to their function and disease associations, to gain a systematic view of these relationships. In addition, to view the relationship between different types of modules, we also integrated topological modules with functional and disease modules. If a disease module is highly overlapping (over half of proteins) with a functional module, then we defined its corresponding topological module as a non-trivial protein module; if a disease module is highly overlapping (over half of proteins) with a pathway module, then we defined its corresponding topological module as a significant protein module. Using this integrative analysis, we identified 69 non-trivial protein modules and 47 significant protein modules in our PPI network. We discuss a few examples below.
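The two labels defined above can be expressed as a simple overlap rule. The sketch below assumes that "over half of proteins" refers to the proteins of the disease module; the text does not say which module the fraction is measured against, so this reading is an assumption, and the function names are hypothetical. The example uses topological module 24, whose disease and pathway genes (SYN1, SYN2, SYN3) are described later in this section; its functional modules are not listed in the text, so an empty list is passed.

```python
def overlaps(disease_module, other_module):
    """True if more than half of the disease module's proteins are in other_module."""
    disease_module, other_module = set(disease_module), set(other_module)
    if not disease_module:
        return False
    return len(disease_module & other_module) > len(disease_module) / 2

def classify(disease_modules, functional_modules, pathway_modules):
    """Return the labels earned by one topological module."""
    labels = set()
    for d in disease_modules:
        if any(overlaps(d, f) for f in functional_modules):
            labels.add("non-trivial")
        if any(overlaps(d, p) for p in pathway_modules):
            labels.add("significant")
    return labels

# Topological module 24: SYN1, SYN2, SYN3 and CAPON.
disease_24 = [{"SYN1", "SYN2", "SYN3"}]   # psychiatric disease genes
pathway_24 = [{"SYN1", "SYN2", "SYN3"}]   # synaptic pathway genes
labels_24 = classify(disease_24, functional_modules=[], pathway_modules=pathway_24)
```

Under this reading a module can carry both labels at once, since the functional and pathway checks are independent.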

Figure 5a shows an intriguing non-trivial protein module (Topological module 55) that connects leptin and the leptin receptor to the inflammatory cytokine receptor IL6RB. There is extensive literature implicating leptin in obesity and diabetes³⁸. However, this module shows us that these disorders are also associated with inflammation (through IL6). There is increasing recognition that many metabolic disorders, such as diabetes, are also associated with higher levels of inflammation³⁹. Thus this module suggests that anti-inflammatory treatments could be coupled with weight loss regimens to address metabolic disorders.

Figure 5b shows another non-trivial protein module (Topological module 82) with a number of factors that likely play a significant role in hematopoietic development. Specifically, Tal1 is a master regulator of T cell development and inhibits the production of cardiac cells⁴⁰. It is therefore interesting to see that several of the genes in this module are associated not only with T cell development, but also with heart disease and heart rate.

Figure 5c describes a complex of proteins associated with NfkB (Topological module 113), a master regulator of inflammatory responses. One interesting observation is that several genes in this module are associated with Alzheimer’s disease. This is of interest, as there is growing recognition that Alzheimer’s disease may be associated with inflammation and its risk is elevated by metabolic disorders, such as diabetes⁴¹. Thus this non-trivial protein module allows us to make the critical connection between these two important disorders and the basal inflammatory responses of cells.

Figure 6 shows a significant protein module (Topological module 24) with four genes: SYN1, SYN2, SYN3 and CAPON. Of these, SYN1, SYN2 and SYN3 are all associated with psychiatric disease and with the synaptic transmission and synaptic vesicle trafficking pathway.

In addition, topological modules 3 and 51 are also interesting non-trivial protein modules. All the proteins of module 3 are involved in Golgi vesicle transport and most are also involved in the membrane trafficking pathway and associated with Alzheimer’s disease. All the proteins of module 51 are associated with translation initiation factor activity and belong to the Metabolism of proteins pathway, and most are also associated with liver disease.

Comparison with existing methods

In recent years, a number of methods have been developed to identify functional modules^{1,2,3,4,5,6,7,8,9} and disease modules^{10,11,12,13,14,15,16,17,18,19,20,21,22,23,24} in PPI networks. Most methods to identify disease modules are disease protein prioritization methods. To evaluate the relative performance of our method, we compare our results with two representative methods that can identify functional and disease modules. One is the Markov Cluster Algorithm (MCL)⁴², which is based on random flow (we use the default setting, inflation parameter r = 2). The other is the random walker (RW)⁴³, which wanders from node to node along the links of the network; after every move the walker is reset to a randomly chosen seed gene with a given probability r (we use r = 0.4).
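The random walker described above can be illustrated with a random walk with restart computed by power iteration. This is a generic sketch of the technique and not the implementation used in the comparison: the function name, the convergence tolerance and the toy path network are all hypothetical, and the step that turns the resulting scores into modules is not described in the text and is not attempted here. Only the restart probability r = 0.4 comes from the text.

```python
def random_walk_with_restart(edges, seeds, r=0.4, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at a
    randomly chosen seed with probability r after every move."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    nodes = sorted(neighbors)
    seeds = [s for s in seeds if s in neighbors]
    if not seeds:
        raise ValueError("no seed is present in the network")
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: r * p0[n] for n in nodes}            # restart term
        for u in nodes:
            share = (1 - r) * p[u] / len(neighbors[u])  # move to a random neighbor
            for v in neighbors[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Toy path network A - B - C with A as the only seed (hypothetical).
scores = random_walk_with_restart([("A", "B"), ("B", "C")], seeds=["A"])
```

Nodes closer to the seed receive higher scores, which is the property that seed-based prioritization methods exploit.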

Figure 7(a) shows the modularity results from NeTA, MCL and RW, which quantify the clustering quality of the topological modules. NeTA performs better than the other two methods. Figure 7(b) shows the mapping frequency of the three methods based on the topological modules that were identified. As expected, among the three kinds of modules, no matter what method was used, we identify more functional than disease modules. NeTA identified more functional modules than MCL and RW and also identified more pathway and disease modules. Figure 7(c) shows the average mapping frequency of the three methods based on the topological modules that were identified. For each method, we count the mapping frequency of each functional/disease module and take the mean value, which measures the mapping accuracy of the method. We can see that the mapping accuracy of NeTA is higher than that of the other two methods. Figure 7(d) describes the mapping frequency of non-trivial protein modules and significant protein modules. Again, NeTA finds more non-trivial and significant protein modules. Overall, we find that our method performs better than the other algorithms for all kinds of protein modules.

Systematic evaluation analysis

To systematically evaluate the power of our method to infer function-disease associations, we constructed a benchmark network based on the OMIM³¹ and MIPS human complex databases⁴⁴. We filtered the human complex PPIs using disease genes from OMIM and constructed a network with 1460 proteins and 4107 protein–protein interactions. In this setting, there is at least one disease gene in each interaction. We use this as a benchmark network to identify disease, protein complex and disease-complex modules and compare our results with the MCL and RW algorithms.

Figure 8(a) shows the resulting modularity of NeTA, MCL and RW on this network. Among them, NeTA has the highest modularity, which shows that it can obtain better module structure than the other two methods. Figure 8(b) shows the number of different modules identified by the three methods. RW identified the most topological modules, MCL identified the most complex modules and NeTA identified the most disease modules and disease-complex modules. Figure 8(c) shows the mapping frequency of different modules identified by the three methods. Disease and protein complex modules map to only approximately 20% of topological modules, and disease-complex modules map to even fewer. Overall, we find that our method performs competitively with the other algorithms.

Discussion

Protein interaction, function and disease networks can be clustered into cohesive groups. Accordingly, these cohesive groups can be defined as topological, functional and disease modules. Most previously published approaches that analyze these datasets only focus on a subset of the three levels. For example, most work on PPI networks focuses only on topological modules and their corresponding functional modules^{1,2,3,4,5,6,7,8,9}. Other approaches analyze pathway enrichment of modules. Similarly, most of the work on disease networks focuses on disease genes and their classification^{10,11,12,13,14,15,16,17,18,19,20,21,22,23,24}. As a result, an integrative analysis of all three levels of modules and networks has yet to be performed.

Here we present a systematic method for combining protein interactions, functions and disease networks, resulting in an integrative analysis that yields topological, functional and disease modules. Other integrative approaches start from the diseasome²⁷ to detect disease modules and then identify functional and topological modules based on these. In contrast, we start from a human PPI network and detect 136 topological modules (as shown in Supplementary Table 1) using NeTA. We then annotate these topological modules using GO, OMIM and GWAS and find corresponding functional and disease modules, leading to the construction of networks for each of the three levels. To visualize the associations among the three levels of modules and networks, we integrated the three levels together and found that they generate new insights into protein network analysis. This approach allowed us to identify many interesting modules that cannot be fully annotated using only a single type of data. For example, we identified several protein interaction modules that allowed us to connect inflammatory responses to Alzheimer’s disease, suggesting that this pathology may have a strong inflammatory component.

In topological module 3, which includes eight proteins, we found that all the proteins are involved in Golgi vesicle transport and that COPA, COPB1, COPB2, COPD, COPE, COPG2 and COPZ1 belong to an octamer protein complex^45,46,47. In addition, COPA, COPB1, COPB2, COPD, COPE and COPZ1 are involved in the membrane trafficking pathway and COPA, COPB1, COPB2, TMED10 and COPG2 are associated with Alzheimer’s disease^45,46,47. Moreover, COPD is associated with increased risk for Mild Cognitive impairment, the earliest phase of Alzheimer’s disease and COPZ1 is involved in intracellular trafficking^45,46,47. Impairment of intracellular trafficking has been implicated in the pathogenesis of Alzheimer’s disease, so COPZ1 may be associated with Alzheimer’s disease. COPE is associated with depressive disorder, which is similar to the later phase of Alzheimer’s disease, suggesting that COPE may also be associated with Alzheimer’s disease.

Topological module 5 includes 11 genes, which are all located in the membrane. DMD, DTNA, NOS1, SNTA1, MAST2 and VAC14 are Type 2 diabetes disease genes^45,46,47, SCN5A is a diabetes mellitus disease gene, SNTB2 and UTRN are type 1 diabetes disease genes^46,47; SNTB1 controls glucose levels and could be a potential diabetes disease gene. MAST1 is an important paralog of MAST2⁴⁵ and phosphorylation of DMD or UTRN may modulate their affinities for associated proteins and thus may also be associated with diabetes mellitus.

Topological module 51 includes 11 genes, which are components of the eukaryotic translation initiation factor 3 (eIF-3) complex, which is required for several steps in the initiation of protein synthesis⁴⁵. All these genes are related to translation initiation factor activity and belong to the Metabolism of proteins pathway. In fact, the eIF-3 complex is composed of 13 subunits, and EIF3J and EIF3M are not included in this module⁴⁵. The most interesting observation is that all these genes are associated with liver diseases: EIF3A, EIF3B, EIF3C, EIF3D and EIF3G are all associated with Liver Failure⁴⁷ and Acute Hepatitis; EIF3E, EIF3F and EIF3K are associated with Liver Neoplasms⁴⁷; EIF3H is associated with Carcinoma Hepatocellular⁴⁷; EIF3I is associated with clonorchiasis⁴⁷. Furthermore, EIF3L has a lower level of expression in liver cancer^45,46,47. Therefore, the eIF-3 complex as a whole may be associated with liver disease.

As these examples illustrate, our work has the potential to inform the prevention, diagnosis and treatment of disease. It is often difficult to accurately identify potential gene targets based on GWAS, even though many GWAS variants are strongly associated with diseases. Although uncertainties in GWAS-to-protein associations affect the number of disease modules we can identify, we do not expect them to significantly change the analysis results we obtained. In conclusion, our integrative analyses are still far from providing important therapeutic breakthroughs, which require substantial follow-up investigation. Nonetheless, they provide a wealth of hypotheses that could lead to clinical improvements in the future. To make these hypotheses more robust, in subsequent work we intend to improve the method of noise reduction and data integration, split bigger modules into smaller ones and integrate more levels of data together to improve our system-level understanding of these complex diseases.

Methods

Data source

HIPPIE²⁹ is a human PPI database and currently contains more than 156,000 interactions of ~14,500 human proteins. It integrates multiple major expert-curated experimental PPI databases and all interactions have an associated normalized confidence score. Here we selected six public human PPI databases, BioGrid⁴⁸, DIP⁴⁹, HPRD⁵⁰, IntAct⁵¹, MINT⁵² and BIND⁵³, as our data sources based on the HIPPIE database and identified high-confidence interactions based on the HIPPIE scoring system. To obtain more reliable interactions, we only keep those that are found in at least two public databases and are classified as high-confidence interactions.
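The selection rule above (at least two source databases and a high-confidence score) amounts to a two-condition filter. The sketch below is illustrative only: the record layout, the function name and the protein identifiers are hypothetical, and the score cutoff of 0.7 is a placeholder, since the text does not state which HIPPIE score counts as high confidence.

```python
def select_reliable(records, min_score, min_sources=2):
    """Keep interactions reported by at least min_sources databases
    whose confidence score is at least min_score.

    records : iterable of (protein_a, protein_b, score, sources)
    Returns a sorted list of unordered protein pairs.
    """
    kept = set()
    for a, b, score, sources in records:
        if score >= min_score and len(set(sources)) >= min_sources:
            kept.add(tuple(sorted((a, b))))  # treat (a, b) and (b, a) as the same pair
    return sorted(kept)

toy_records = [
    ("P1", "P2", 0.90, {"BioGrid", "IntAct"}),       # kept
    ("P3", "P2", 0.95, {"HPRD"}),                    # dropped: one database only
    ("P4", "P1", 0.40, {"DIP", "MINT", "BIND"}),     # dropped: low score
]
reliable = select_reliable(toy_records, min_score=0.7)  # 0.7 is a placeholder cutoff
```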

Network Construction

Protein Network

We extracted high-confidence interactions from the HIPPIE database and took direct physical interactions that cross multiple species based on the IRefWeb³³ database to construct the final human PPI network. IRefWeb is a web interface to protein interaction data consolidated from 10 public databases. It can automatically crop the PPI dataset to produce a subset of higher-quality interactions, which aids the generation of more meaningful organism-specific interaction networks. In this network a node denotes a protein and a link represents a protein-protein interaction.

Function Network

There are two kinds of function networks: protein-function networks and protein-pathway networks. Protein-function networks are obtained by connecting the proteins of each topological module (defined below) with the corresponding GO biological processes, cellular localizations and molecular functions. In what follows we only used the third level of GO slim terms³⁰. In these networks nodes are proteins or GO terms. Edges are drawn between a protein and a term when a significant association between them exists (based on a hypergeometric test P value between the functional modules and protein module). Protein-pathway networks are constructed by connecting the proteins of each topological module with the corresponding pathway annotations (for pathway sources, see below).

Disease Network

By mapping each topological module into the OMIM and GWAS databases, we constructed two types of disease networks: disease-gene networks and disease-disease networks. Disease-gene networks connect the genes in each topological module with their associated diseases. Disease-disease networks connect pairs of diseases if they share at least one disease gene.
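Both network types can be derived from a single gene-to-disease mapping. The sketch below is a minimal illustration with hypothetical gene and disease identifiers (G1, D1 and so on) and a hypothetical function name: it inverts the disease-gene associations and links two diseases whenever they share at least one gene, as stated above.

```python
from itertools import combinations

def disease_disease_edges(gene_to_diseases):
    """Link two diseases when they share at least one gene.

    gene_to_diseases : dict mapping a gene to the set of diseases it is associated with
    Returns a dict mapping each (disease_1, disease_2) pair to the shared genes.
    """
    disease_to_genes = {}
    for gene, diseases in gene_to_diseases.items():
        for disease in diseases:
            disease_to_genes.setdefault(disease, set()).add(gene)
    edges = {}
    for d1, d2 in combinations(sorted(disease_to_genes), 2):
        shared = disease_to_genes[d1] & disease_to_genes[d2]
        if shared:
            edges[(d1, d2)] = shared
    return edges

# Disease-gene network of one toy module (hypothetical identifiers).
toy_gene_to_diseases = {
    "G1": {"D1", "D2"},
    "G2": {"D2", "D3"},
    "G3": {"D4"},
}
toy_edges = disease_disease_edges(toy_gene_to_diseases)
```

Here D4 stays isolated because its only gene, G3, is associated with no other disease.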

Protein Module Detection

Topological Modules

High aggregation is an essential characteristic of biological networks and it reflects high modularization of gene networks. The network we use was first clustered into different sizes of topological modules before further analysis. Accurately identifying topological modules of a biological network is still challenging. Here we detected topological modules based on a network topological algorithm NeTA²⁸ (NeTA can detect sparse and small modules and is competitive with other methods²⁸) and we only consider those topological modules that contain at least three proteins.

Functional Modules

To evaluate the biological significance of these topological modules, we analyzed the Gene Ontology³⁰ enrichment of each topological module with the Bingo³⁵ plugin in Cytoscape⁵⁴, with a threshold P-value < 0.05 based on the hypergeometric test and corrected by the Benjamini & Hochberg False Discovery Rate (FDR). Bingo generates hierarchical functional annotations based on GO slim. To obtain coherent functional modules, we only chose functions in the third level of GO slim³⁰. We consider a group of proteins in a single topological module as a functional module if and only if at least one function can cover all these proteins.
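The enrichment statistics named above can be sketched in a few lines. This is a generic illustration of a one-sided hypergeometric test followed by the Benjamini & Hochberg adjustment, not the Bingo implementation; the function names are hypothetical and the counts in the example are made up.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n proteins are drawn from a background of N,
    of which K carry the annotation, and X counts annotated draws."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def benjamini_hochberg(pvalues):
    """Benjamini & Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up example: a module of 8 proteins, all annotated with a term that
# covers 50 of 2000 background proteins.
p_term = hypergeom_pvalue(k=8, n=8, K=50, N=2000)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

A term would pass the stated threshold when its adjusted p-value is below 0.05.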

Disease Modules

Online Mendelian Inheritance in Man (OMIM)³¹ is a comprehensive, authoritative compendium of human genes and genetic phenotypes. Genome-wide association studies (GWAS³²) examine common genetic variants in populations to see if they are associated with a trait. GWASdb⁵⁵ is a database that combines collections of genetic variants (GVs) from GWAS together with their functional annotations and disease classifications. MalaCards⁵⁶ is an integrated searchable database of human maladies and their annotations, modeled on the architecture and richness of the popular GeneCards database of human genes. We detected disease modules based on known disease-gene associations extracted from OMIM and GWASdb and the disease classification of MalaCards online annotations. If more than two proteins within a given topological module are associated with the same disease type, then we take these proteins as a disease module. Here we classify diseases into 15 kinds of phenotypes by integrating the MalaCards database and the method of Barabási et al.²⁷: Neurological; Ophthalmological; Cardiovascular; Bone; Dermatological; Endocrine; Metabolic; Cancer; Immunological; Psychiatric; Hematological; Renal; Respiratory; Ear, Nose, Throat; and Gastrointestinal.
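The disease module rule above (more than two proteins of one topological module sharing a disease type) can be sketched as a grouping step. The function name and data layout are hypothetical. The example uses topological module 24 from the Results, where SYN1, SYN2 and SYN3 are associated with psychiatric disease; CAPON is left unannotated here purely for illustration.

```python
def find_disease_modules(module_proteins, protein_to_classes, min_proteins=3):
    """Group the proteins of one topological module by disease class and keep
    classes shared by more than two proteins (that is, at least three)."""
    by_class = {}
    for protein in module_proteins:
        for disease_class in protein_to_classes.get(protein, ()):
            by_class.setdefault(disease_class, set()).add(protein)
    return {c: members for c, members in by_class.items() if len(members) >= min_proteins}

module_24 = ["SYN1", "SYN2", "SYN3", "CAPON"]
annotations_24 = {
    "SYN1": {"Psychiatric"},
    "SYN2": {"Psychiatric"},
    "SYN3": {"Psychiatric"},
}
disease_modules_24 = find_disease_modules(module_24, annotations_24)
```

Because a protein may belong to several disease classes, one topological module can yield more than one disease module.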

Pathway enrichment analysis

For each topological module, information on the biological pathways in which the module genes are involved was retrieved from the DAVID^36,37 online analytical tools. We set a corrected P-value < 0.05 as the threshold for pathway enrichment analysis. The pathway databases we used are KEGG⁵⁷, REACTOME⁵⁸, PANTHER⁵⁹ and BIOCARTA⁶⁰.

Systematic Analysis Method

Here we use a systematic analysis method to discover significant modules. The specific steps are shown in Fig. 1. First, we integrate six different human PPI databases; second, we integrate the HIPPIE database with the IRefWeb database to obtain a human protein “interactome” network; third, we divide the network into PPI sub-networks (topological modules) using the NeTA algorithm; fourth, we construct the corresponding function networks (based on the detected functional modules) and disease networks (based on the detected disease modules); lastly, to view the relationship between the different types of modules more clearly, we integrate topological modules, functional modules and disease modules together to generate an integrative analysis of the different network levels, including protein, function and disease networks. We annotated the proteins within each module with the third level of GO slim.

Additional Information

How to cite this article: Liu, W. et al. Integrative analysis of human protein, function and disease networks. Sci. Rep. 5, 14344; doi: 10.1038/srep14344 (2015).

References

Pinkert, S., Schultz, J. & Reichardt, J. Protein Interaction Networks—More Than Mere Modules. PLoS Comput. Biol. 6, e1000659 (2010).
Schaefer, M. H. et al. Adding Protein Context to the Human Protein-Protein Interaction Network to Reveal Meaningful Interactions. PLoS Comput. Biol. 9, e1002860 (2013).
Lee, J., Gross, S. P. & Lee, J. Improved network community structure improves function prediction. Sci. Rep. 3, srep02197 (2013).
Sharan, R., Ulitsky, I. & Shamir, R. Network-based prediction of protein function. Mol. Syst. Biol. 3, 88 (2007).
Cho, Y.-R., Shi, L. & Zhang, A. Functional module detection by functional flow pattern mining in protein interaction networks. BMC Bioinformatics 9, S10/O1 (2008).
Yook, S. H., Oltvai, Z. N. & Barabási, A. L. Functional and topological characterization of protein interaction networks. Proteomics 4, 928–942 (2004).
Chen, J. & Yuan, B. Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics 22, 2283–2290 (2006).
Pu, S., Vlasblom, J., Emili, A., Greenblatt, J. & Wodak, S. J. Identifying functional modules in the physical interactome of Saccharomyces cerevisiae. Proteomics 7, 944–960 (2007).
Lewis, A. C. F. et al. The function of communities in protein interaction networks at multiple scales. BMC Syst. Biol. 4, 100 (2010).
Zhang, Y. et al. Network Analysis Reveals Functional Cross-links between Disease and Inflammation Genes. Sci. Rep. 3, srep03426 (2013).
Bauer-Mehren, A. et al. Gene-Disease Network Analysis Reveals Functional Modules in Mendelian, Complex and Environmental Diseases. PLoS ONE 6, e20284 (2011).
Article CAS ADS Google Scholar
Gustafsson et al. Modules, networks and systems medicine for understanding disease and aiding diagnosis. Genome Med. 6, 82 (2014).
Article Google Scholar
Marinka, Zitnik et al. Discovering disease-disease associations by fusing systems-level molecular data. Sci. Rep. 3, srep03202 (2013).
Davis, D. A. & Chawla, N. V. Exploring and Exploiting Disease Interactions from Multi-Relational Gene and Phenotype Networks. PLoS ONE 6, e22670 (2011).
Article CAS ADS Google Scholar
Bauer-Mehren, A. et al. Gene-Disease Network Analysis Reveals Functional Modules in Mendelian, Complex and Environmental Diseases. PLoS ONE 6, e20284 (2011).
Article CAS ADS Google Scholar
Reyes-Palomares, A., Rodríguez-López, R., Ranea, J. A. G., Jiménez, F. S., Medina, M. A. Global Analysis of the Human Pathophenotypic Similarity Gene Network Merges Disease Module Components. PLoS ONE 8, e56653 (2013).
Article CAS ADS Google Scholar
Yang, P., Li, X., Wu, M., Kwoh, C.-K. & Ng, S.-K. Inferring Gene-Phenotype Associations via Global Protein Complex Network Propagation. PLoS ONE 6, e21502 (2011).
Article CAS ADS Google Scholar
Marc, Vidal, Michael, E. Cusick & Albert-László, Barabási Interactome Networks and Human Disease. Cell 144, 986–998 (2011).
Article Google Scholar
Suthram, S. et al. Network-Based Elucidation of Human Disease Similarities Reveals Common Functional Modules Enriched for Pluripotent Drug Targets. PLoS Comput. Biol. 6, e1000662 (2010).
Article Google Scholar
Chan, S. Y. & Loscalzo, J. The emerging paradigm of network medicine in the study of human disease. Circ. Res. 111, 359–374 (2012).
Article CAS Google Scholar
Kwang-Il, Goh & In-Geol, Choi Exploring the human diseasome: the human disease network. Brief Funct. Genomics 11, 533–542 (2012).
Article Google Scholar
Zhou, X. Z. et al. Human symptoms–disease network. Nat. Commun. 5, ncomms5212 (2014).
Frank, E.-S., Shailesh, T., Ricardo de, M. S., Ahmed, F. H. & Matthias, D. The human disease network. Syst. Biomed. 1, 20–28 (2013).
Article Google Scholar
Xiujuan, W., Natali, G. & Haiyuan, Y. Network-based methods for human disease gene prediction. Brief Funct. Genomics 10, 280–293 (2011).
Article Google Scholar
Barabási, A. L., Natali, G. & Joseph, L. Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 12, 56–68 (2011).
Article Google Scholar
Laura, I. Furlong. Human diseases through the lens of network biology. Trends Genet. 29, 150–159 (2013).
Article Google Scholar
Kwang-Il, Goh et al. Human disease network. Proc. Natl. Acad. Sci. USA 104, 8685–8690 (2007).
Article ADS Google Scholar
Wei, Liu, Matteo, Pellegrini & Xiaofan, Wang . Detecting Communities Based on Network Topology. Sci. Rep. 4, srep05739 (2014).
Schaefer, M. H. et al. HIPPIE: Integrating Protein Interaction Networks with Experiment Based Quality Scores. PLoS ONE 7, e31826 (2012).
Article CAS ADS Google Scholar
Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29 (2000).
Article CAS Google Scholar
Hamosh Ada, A. F. S., Amerger, Joanna, Bocchini, Carol, Valle, David & McKusick, Victor A Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucl. Acids Res. 30, 52–55 (2002).
Article CAS Google Scholar
Beck, T., Hastings, R. K., Gollapudi, S., Free, R. C. & Brookes, A. J. GWAS Central: a comprehensive resource for the comparison and interrogation of genome-wide association studies. Eur. J. Hum. Genet. 7, 949–952 (2014).
Article Google Scholar
Turinsky, A. L., Razick, S., Turner, B., Donaldson, I. M. & Wodak, S. J. Navigating the global protein-protein interaction landscape using iRefWeb. Methods Mol. Biol. 1091, 315–331 (2014).
Article CAS Google Scholar
Newman, M. E. J., Girvan, M. Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004).
Article CAS ADS Google Scholar
Maere, S., Heymans, K. & Kuiper, M. BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics 21, 3448–3449 (2005).
Article CAS Google Scholar
Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources. Nature Protoc. 4, 44–57 (2009).
Article CAS Google Scholar
Huang, D. W., Sherman, B. T. & Lempicki, R. A. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucl. Acids Res. 37, 1–13 (2009).
Article Google Scholar
Jeffrey, S. Flier. Hormone resistance in diabetes and obesity: insulin, Leptin and FGF21. Yale J Biol Med. 85, 405–414 (2012).
Google Scholar
Donath, M. Y. & Shoelson, S. E. Type 2 diabetes as an inflammatory disease. Nat. Rev. Immunol. 11, 98–107 (2011).
Article CAS Google Scholar
Porcher, C. et al. The T cell leukemia oncoprotein SCL/tal-1 is essential for development of all hematopoietic lineages. Cell 86, 47–57 (1996).
Article CAS Google Scholar
Akiyama, H. et al. Inflammation and Alzheimer’s disease. Neurobiol. Aging. 21, 383–421 (2000).
Article CAS Google Scholar
Enright, A. J., Van Dongen, S., Ouzounis, C. A. An efficient algorithm for large-scale detection of protein families. Nucl. Acids Res. 30, 1575–1584 (2002).
Article CAS Google Scholar
Kohler, S., Bauer, S., Horn, D. & Robinson, P. N. Walking the interactome for prioritization of candidate disease genes. Am. J. Hum. Genet. 82, 949–958 (2008).
Article Google Scholar
Ruepp, A. et al. CORUM: the comprehensive resource of mammalian protein complexes—2009. Nucl. Acids Res. 38, D497–D501 (2010).
Article CAS Google Scholar
Barrett, T. et al. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res. 41, D991–D995 (2013).
Article CAS Google Scholar
Safran, M. et al. GeneCards Version 3: the human gene integrator. Database 2010, baq020 (2010).
Hruz, T. et al. RefGenes: identification of reliable and condition specific reference genes for RT-qPCR data normalization. BMC Genomics 12, 156 (2011).
Article CAS Google Scholar
Stark, C. et al. The BioGRID Interaction Database: 2011 update. Nucl. Acids Res. 39, D698–D704 (2011).
Article CAS ADS Google Scholar
Salwinski, L. et al. The Database of Interacting Proteins: 2004 update. Nucl. Acids Res. 32, D449–D451 (2004).
Article CAS Google Scholar
Keshava, Prasad T. S. et al. Human Protein Reference Database–2009 update. Nucl. Acids Res. 37, D767–D772 (2009).
Article Google Scholar
Aranda, B. et al. The IntAct molecular interaction database in 2010. Nucl. Acids Res. 38, D525–D531 (2010 ).
Article CAS Google Scholar
Ceol, A. et al. MINT, the molecular interaction database: 2009 update. Nucl. Acids Res. 38, D532–D539 (2010 ).
Article CAS Google Scholar
Bader, G. D., Betel, D. & Hogue, C. W. BIND: the Biomolecular Interaction Network Database. Nucl. Acids Res. 31, 248–250 (2003).
Article CAS Google Scholar
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 3, 2498–2504 (2003).
Article Google Scholar
Mulin, Jun Li et al. GWASdb: a database for human genetic variants identified by genome-wide association studies. Nucl. Acids Res. 40, D1047–D1054 (2012).
Article Google Scholar
Noa, Rappaport et al. MalaCards: an integrated compendium for diseases and their annotation. Database (Oxford) 2013, bat018 (2013).
Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucl. Acids Res. 28, 27–30 (2000).
Article CAS Google Scholar
David, Croft et al. Reactome: a database of reactions, pathways and biological processes. Nucl. Acids Res. 39, D691–D697 (2011).
Article ADS Google Scholar
Mi, H. & Thomas, P. PANTHER pathway: an ontology-based pathway database coupled with data analysis tools. Methods Mol. Biol. 563, 123–140 (2009).
Article CAS Google Scholar
Darryl, N. BioCarta. Biotech Software & Internet Report. 2, 117–120 (2001).
Article Google Scholar

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 61374176 and by the Science Fund for Creative Research Groups of the National Natural Science Foundation of China (No. 61221003).

Author information

Authors and Affiliations

Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
Wei Liu & Xiaofan Wang
Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100080
Aiping Wu
Center for Systems Medicine, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100005
Aiping Wu
Suzhou Institute of Systems Medicine, Suzhou, 215123, Jiangsu, China
Aiping Wu
Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, 90055, CA
Matteo Pellegrini

Authors

Wei Liu
Aiping Wu
Matteo Pellegrini
Xiaofan Wang

Contributions

W.L. analyzed data, designed and performed research. M.P., W.L., A.P.W. and X.F.W. discussed the results and wrote the manuscript. All authors reviewed the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Liu, W., Wu, A., Pellegrini, M. et al. Integrative analysis of human protein, function and disease networks. Sci Rep 5, 14344 (2015). https://doi.org/10.1038/srep14344

Received: 10 May 2015
Accepted: 26 August 2015
Published: 24 September 2015
DOI: https://doi.org/10.1038/srep14344
