Main

Regulatory and signaling pathways connecting constituent parts of the cell (DNA, RNA, proteins and metabolites) coordinate multiple cellular functions, which can be studied using a network representation of physical molecular interactions. Cell decision processes are dependent on the structure and dynamics of, for example, gene-regulatory networks. Different steady-stable states (also called attractor states) appear to be related to distinct functional phenotypic states of the cell.1

Proteins exist as populations of states of different structures and energies, including native structural states, malfunctioning and highly unfolded species. Biological function might be altered by protein misfolding, impairment of protein-binding events and malfunctioning of the allosteric control mechanism. As transition between these states often requires crossing high-energy barriers, penetration of the full energy landscape under unperturbed conditions is low. However, specific perturbations, such as mutations, changes in pH and temperature, post-translational modifications and inter-molecular binding, modify the protein energy landscape and shift the protein population towards other conformations.2, 3

Malfunctioning proteins, for instance, have a role as the infectious agent in prion disease.4 Similar events are hypothesized to occur in other protein misfolding diseases such as Alzheimer's, Huntington's and Parkinson's in which protein agents may similarly induce pathological protein states.5, 6 Protein population shift as a result of external stimuli can be observed in the post-translational modifications of crystallin protein leading to cataract formation7 or the misfolding of S100 proteins caused by changing metal ion concentrations, which are implicated in cancer, neurodegenerative, inflammatory and autoimmune diseases.8

The metabolic syndrome describes a cluster of metabolic abnormalities encompassing elevated fasting glucose concentrations, increased waist circumference, increased triglycerides, low HDL cholesterol levels and high blood pressure.9 It occurs with a prevalence of 20–25% worldwide according to the IDF (International Diabetes Federation),10, 11 and is on the rise, especially in the elderly population and, more alarmingly, even in children and adolescents.12

Total and cardiovascular mortality is increased in the metabolic syndrome, and the risk of developing overt type II diabetes is increased fivefold.13, 14 From a pathophysiological point of view, the metabolic syndrome is widely held to be caused by central adiposity that can lead to insulin resistance under given genetic and environmental circumstances.15

Peroxisome proliferator-activated receptor gamma (PPARγ), a versatile member of the nuclear receptor superfamily, is implicated in the metabolic syndrome, colon cancer and other diseases. The disease outcome is associated with PPARγ gene polymorphism or mutation (Table 1). PPARγ is most abundantly expressed in the adipose tissue and is of key importance in adipocyte differentiation and triglyceride storage.16 It also participates in glucose homeostasis and insulin sensitivity and thus, synthetic ligands for PPARγ, the so-called thiazolidinediones, have been developed and approved to treat type II diabetes. However, the exact mechanism of action of these drugs is not yet fully elucidated, primarily because these compounds bind to PPARγ not only in the adipose tissue. Moreover, it is still unclear how PPARγ controls the balance between obesity as a result of increased adipogenesis and insulin sensitivity. Thiazolidinediones increase insulin sensitivity but, seemingly paradoxically, also cause weight gain.17 Studies on naturally occurring mutations in humans have not been conclusive; some have shown increased PPARγ transcriptional activity with increased adipogenesis and obesity, whereas others have demonstrated decreased transcriptional activation, lower body mass and increased insulin sensitivity.18, 19

Table 1 PPARγ polymorphisms and mutations related to disease

Recent work has shown that high-fat feeding of mice activates the protein kinase cyclin-dependent kinase 5 (CDK5) in adipose tissues, leading to the phosphorylation of PPARγ at serine 273 (ref. 20; Figure 1). In in vitro experiments, CDK5-mediated phosphorylation of serine 273 led to the differential expression of a subset of genes, including a number of key metabolic regulators, which were consistent with genes differentially expressed in mice fed a high-fat diet.20 This finding suggests that phosphorylation alters the regulatory function of PPARγ and may be involved in the pathogenesis of insulin resistance in humans. Comparison of wild-type phosphorylated (putative disease state) with non-phosphorylated mutant (putative healthy state) cell lines reveals significant differences in the expression levels of genes that may have a role in disease development. Hundreds of genes not expressed in the healthy state are upregulated when PPARγ is phosphorylated, leading to changes at the cellular level.

Figure 1

PPARγ transcriptional regulation. In the normal state (left) and in the metabolic syndrome leading to dyslipidemia (right)

In this study, we investigated how the phosphorylation of serine 273 in PPARγ shifts the protein population towards a malfunctioning state and consequently perturbs cellular molecular networks, eventually leading to the development of insulin resistance. In a structural analysis of the PPARγ–retinoid X receptor alpha (RXRα) dimer complex, we examined changes between the two states that may constitute an important mechanism leading to the differential expression observed in the network. We studied the effects with a particular focus on the DNA-binding domain (DBD) of the receptor, using molecular dynamics (MD) simulations. PPARγ and RXRα form a highly cooperative functional dimer. Each monomer consists of a DBD and a ligand-binding domain (LBD) linked by a long hinge fragment. Dimerization requires contacts between the LBD and the DBD of both receptors, as well as interaction between PPARγ's LBD and the first segment of RXRα's hinge. The binding of DNA and ligands has a role in a complex stabilization process: by binding to both receptors’ DBD, PPARγ LBD stabilizes DNA binding. The PPARγ–RXRα heterodimer binds to a PPAR response element (PPRE), which is composed of a direct repeat separated by one or two base pairs (PPREs that have one intervening base pair are often referred to as a direct repeat 1 or DR1). The minor groove of the DR1 protects a small polar DBD–DBD interface (PPARγ Asn188 and RXRα Arg209 and Gln206) from the solvent, which fortifies inter-domain interactions and hence stabilizes the protein–DNA complex.21 Several disease-related polymorphisms and mutations located in the LBD might affect the stability of the LBD directly (Phe388, Arg425, Pro495) or impair ligand binding (Gln314, Arg316, Ser317, Val318 and His477; see Table 1).
The importance of LBD–DBD domain cooperation to assemble the functional complex was demonstrated by mutating the highly conserved Phe375 residue that led to weakened DNA binding.21 Serine 273 is possibly another example in which LBD residue modification may lead to altered DNA binding: phosphorylation occurs on the interface between the LBD of PPARγ and the DBD of RXRα and therefore may change RXRα-DNA binding.

Subsequently, we built a cell-specific network of molecular interactions, including direct physical interactions and indirect regulatory interactions, based on the comparison of the gene expression patterns produced by wild-type phosphorylated and mutated non-phosphorylated PPARγ.20 We analyzed the dynamics of bi-stable motifs in healthy and diseased states, theorizing that circuits exhibiting bi- and multi-stability may drive network state transitions associated with disease progression and then maintain networks in diseased states. Such bi-stable switches are found, for example, in protein–protein interaction or regulatory networks, allowing cells to make irreversible decisions and assume different fates depending on which genes are expressed or silent.22 In the network under study, these motifs should not exist in isolation, but assemble an interconnected core cluster of genes that regulate one another, ensuring the stability of the network. In addition, differentially expressed genes (DEGs) belonging to bi-stable switches should be central to the regulatory network, so that they can efficiently propagate perturbations to more distant regions of the network.

Results

The effects of the phosphorylation of serine 273 on the protein structure of PPARγ were studied by comparing the structures of phosphorylated serine 273 and the wild type, with a particular focus on the DBD of the receptor complex, using MD simulations. Although a detailed study on DBD dynamical properties under perturbing stimuli and its relation to DR1-binding specificity is not the subject of this paper, a simple comparison of time-averaged molecular motions can indicate regions of altered stability. Indeed, serine 273 phosphorylation affects the stability of RXRα, mostly in the zinc-finger helix, which is in contact with the major groove of DNA (Figure 2a). The general change in the structure of DBD-RXRα is relatively small: 0.5 Å of averaged atomic displacement (root-mean-square deviation, RMSD) (Figure 2b). Nevertheless, the whole domain became less stable, as shown by differences in the root-mean-square fluctuation (RMSF) profiles (Figure 2c), with the most remarkable fluctuation occurring in the helical region of one of the zinc fingers. This indicates that phosphorylation introduces changes in the DNA-binding domain, shifting the protein population towards a state that could contribute to an increase in disease-related gene expression. The LBD of PPARγ is not affected on a large scale; despite some differences, both profiles follow a similar trend (Figure 2d).
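The two quantities compared in this paragraph can be illustrated with a minimal sketch. It assumes that trajectory frames have already been superimposed on a common reference and are available as lists of (x, y, z) coordinates; the function names and the toy coordinates are hypothetical and are not taken from the published analysis, which was based on Gromacs trajectories.

```python
import math

def rmsd(frame, reference):
    """Root-mean-square deviation of one frame from a reference structure."""
    sq = sum((a - b) ** 2
             for p, q in zip(frame, reference)
             for a, b in zip(p, q))
    return math.sqrt(sq / len(reference))

def rmsf(frames):
    """Per-atom root-mean-square fluctuation about the time-averaged position."""
    n_frames, n_atoms = len(frames), len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(f[i][d] for f in frames) / n_frames for d in range(3)]
        # Mean squared displacement of atom i from that average.
        msf = sum((f[i][d] - mean[d]) ** 2
                  for f in frames for d in range(3)) / n_frames
        result.append(math.sqrt(msf))
    return result

# Toy two-frame trajectory (assumed data): atom 0 is rigid,
# atom 1 moves by 2 Å along x between the frames.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
toy_rmsf = rmsf(toy_frames)
toy_rmsd = rmsd(toy_frames[1], toy_frames[0])
```

In this toy case the rigid atom has zero fluctuation and the mobile atom a fluctuation of 1 Å, which is the kind of per-residue contrast that an RMSF profile makes visible even when the overall RMSD stays small.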

Figure 2

Structural analysis of the disease-related form of PPARγ (phosphorylated) in comparison with the healthy state (non-phosphorylated). (a) PPARγ–RXRα dimer crystal structure; the LBD of PPARγ (green) interacts with the DBD of RXRα (light blue). Phosphorylated serine 273 perturbs the LBD–DBD interface, changing the interaction of the zinc-finger helix (red) with DNA (orange); zinc ions are shown as purple spheres. (b) The DBD of RXRα. RMSD was calculated for every tenth frame of the molecular dynamics trajectory. Phosphorylation leads to a moderate but stable elevation in the mean deviation of DBD structure atoms. (c) The DBD of RXRα. Time-averaged RMS fluctuations for phosphorylated wild-type and non-phosphorylated S273A mutant PPARγ. The black bar highlights the largest difference in stability, which is located in the zinc-finger helix. (d) The LBD of PPARγ. The RMSF profiles of the wild-type and mutant structures are similar, indicating that phosphorylation does not reduce the stability of the LBD

To study the effects of modified PPARγ gene regulation, we reconstructed an interaction network consisting of 235 DEGs, 152 of which were upregulated in the phosphorylated state (wild type) and 48 of which were upregulated in the mutated state (S273A mutant). We call this network the ‘global’ network.

Subsequently, we searched for a sub-network that makes an important contribution to the stability of the global network. First, we searched for motifs with multi-stable modes. We then selected the top 10 statistically significant motif types consisting of 2, 3, 4 or 5 nodes using a z-value-based ranking, all having a P-value <0.01 (that is, fewer than 10 of 1000 randomly generated networks contained the motif at a higher frequency). We examined the expression profile of the genes in each motif to ensure that the expression profile was consistent in all members of the identified motif: among the top ranked 2-, 3-, 4- or 5-node motifs that exhibited multi-stable behavior, we identified a number of 3- and 4-node motifs that were consistent with expression values (Figure 3). We thus identified 39 genes involved in 55 switches forming a single cluster, which we call the ‘core’ network (Figure 4).

Figure 3

Bi-stable motifs with matching gene expression values: n is the number of motifs of a given type found in the core network. A1 is attractor state 1, and A2 is attractor state 2

Figure 4

The core network. (a) This network consists of 39 nodes and 55 bi-stable motifs. (b) Example of bi-stable switch motif, existing in ON/ON/ON or OFF/OFF/OFF states

Computation of the stability of the core network cluster reveals two attractors, in which all genes are either in an ON or an OFF state, which exactly match the expression pattern. The two most commonly found motifs are positive forward loop motifs with OFF/OFF/OFF changed to the ON/ON/ON states during the course of the disease.

Consistent with this, the in silico perturbation of some nodes of the core cluster triggers a transition from the OFF to the ON attractor, but the opposite transition (from ON to OFF) is not triggered by the perturbation of any node in the core. In all, 34 nodes in the cluster are capable of triggering the transition from OFF to ON when they are perturbed. In contrast, five nodes (COL1A1 (collagen type I-α1), COL1A2 (collagen type I-α2), kruppel-like factor 5 (KLF5), perforin-1 and runt-related transcription factor 1) constitute a set of genes that could potentially be in the ON state, taking part in different processes without activating the cluster as a consequence. Figure 5 shows that perturbation of KLF5, involved in a motif with early growth response 1 (EGR1), does not trigger the transition of the cluster from OFF to ON (see Supplementary Figure 1 for all perturbation experiments).

Figure 5

Two examples of core gene perturbations, with and without an effect on the core network. (a) Perturbation of EGR1: all the genes in the cluster change from the OFF to the ON state and remain there. (b) Perturbation of KLF5: only the gene in question changes its state temporarily from OFF to ON and quickly returns to the original state. Both EGR1 and KLF5 occur within the same motif

Furthermore, we compared properties of genes in the core network with remaining genes in the global network (Supplementary Table 1). Specificity of inter-gene interactions may be reflected in the network modularity. Using the Newman–Girvan algorithm, we identified 11 clusters of which clusters 1 and 2 are mainly occupied by core genes. To identify inter-cluster connectors, we calculated the participation coefficient that on average was significantly different when comparing nodes in the core and global networks. The median participation in the core and global networks was 0.45 and 0, respectively; the distributions in the two groups differed significantly (Mann–Whitney–Wilcoxon W=2272, n1=44, n2=191, P-value=1.66e-07).

The betweenness centrality of a gene is a centrality measure that is proportional to the number of shortest paths between genes in the network that go through the gene in question. High betweenness centrality corresponds to a high level of inter-node communication, and is therefore an appropriate measure for highlighting which genes link different molecular processes and pathways. The median betweenness centrality in the core and global networks is 243 and 0, respectively; the distributions in the two groups differ significantly (Mann–Whitney–Wilcoxon W=1430, n1=44, n2=191, P-value=3.65e-14). We identified a group of genes that act as potential hubs and exist in the core network (having a betweenness centrality score ≥0.10 in the core network, as well as ≥2900 in the global network, being the top four ranking genes in both cases).
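The definition above can be made concrete with a short sketch of Brandes' algorithm for unweighted, directed graphs. This is an illustration only: the published values were obtained with the igraph library in R, and the four-node toy graph and the function name used here are assumptions introduced for the example.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted).

    adj maps every node to a list of its successors (directed edges);
    every node, including pure targets, must appear as a key.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, starting from the farthest nodes.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Hypothetical toy network: a -> b -> c and d -> b.
toy = {"a": ["b"], "b": ["c"], "c": [], "d": ["b"]}
toy_scores = betweenness(toy)
```

In the toy graph, node b lies on both shortest paths that have an intermediate node (a to c and d to c), so it is the only node with a non-zero score, mirroring how hub genes stand out against a median of 0 in the global network.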

The genes with the highest betweenness centrality were HIF1A (hypoxia-inducible factor 1α-subunit), EGR1, STAT1 (signal transducers and activators of transcription 1) and CXCL12 (chemokine (C–X–C motif) ligand 12) and they are all overexpressed in the case of phosphorylated PPARγ (the putative disease state). We examined these genes in more detail with regard to their association with adipogenesis, for which PPARγ is the master regulator under any condition, to confirm that our core network is consistent with experimental studies. We found that this was indeed the case: the role of the transcription factor HIF1A in adipocyte differentiation has been described previously,23 EGR1 functions as a transcriptional regulator the expression of which is rapidly induced during the differentiation of murine 3T3-L1 adipocytes,24 STAT1 also acts as a transcription activator, which is rapidly activated in the 3T3-L1 adipocyte cell culture model,25 and CXCL12 is a chemokine, which demonstrates a significant increase in expression in differentiating 3T3-L1 preadipocytes.26

Discussion

The metabolic syndrome is a common disease and is associated with increased incidence of type II diabetes mellitus and cardiovascular mortality. Its treatment is designed to target the different components of the syndrome (such as high triglyceride, cholesterol and glucose levels and high blood pressure). Causally, the metabolic syndrome is linked to obesity-related insulin resistance.27 Thus, the discovery of PPARγ as a master regulator of adipogenesis, the decreased activity of which was also shown to be linked to insulin resistance,28, 29 was very important. It was assumed that increasing PPARγ activity would potentially represent a unifying strategy to target the metabolic syndrome as an entity. Unfortunately, unwanted side effects such as weight gain, heart failure and even osteoporosis, as well as discrepant results between genetic and pharmacological modulations of PPARγ, outweigh the observed clinical benefits, which mostly rely on improved blood sugar control and lipid metabolism.30

PPARγ still represents a desirable drug target, but research over the last decade showed that its biological function with regard to adipogenesis is a balanced and thrifty response.31 Aspects such as co-factor recruitment and function and/or post-translational modifications have to be taken into consideration, before more specific/distinct new approaches to modify PPARγ action can lead to new compounds.

PPARγ expression is also influenced by GATA-binding protein 2 (GATA2), a transcription factor specifically expressed in the stromal vascular fraction of the adipose tissue, which is also where preadipocytes are located. The transition from preadipocyte to adipocyte is blocked, at least in part, by GATA2's suppression of PPARγ; insulin has downstream effects on the kinase v-akt murine thymoma viral oncogene homolog 1 (Akt), which phosphorylates GATA2, impairing GATA2's translocation to the nucleus and thereby reducing the suppression of PPARγ in the nucleus, thus promoting adipogenesis. In the insulin-resistant state, the phosphorylation of GATA2 by Akt is disturbed and consequently adipogenesis is dysregulated. PPARγ-driven adipogenesis is reduced, whereas the remaining preadipocytes start expressing pro-inflammatory cytokines, an observation that helps to explain the puzzling phenomenon of improved metabolic outcome with adipogenesis-promoting PPARγ agonists. GATA2 was not among the DEGs in our study, but this is not unexpected as GATA2's activity is regulated through phosphorylation.32

Metalloproteinase inhibitor 3 (TIMP3) is a member of our core network and was originally linked to Sorsby's fundus dystrophy, which is an autosomal dominant macular degeneration disorder. This gene has been linked to a number of other diseases including, more recently, the metabolic syndrome, in particular with insulin resistance, hepatic inflammation, dyslipidemia and atherosclerosis in diabetic subjects.33, 34 A possible mechanism of how PPARγ and TIMP3 are connected may be that TIMP3 expression influences C/EBPβ (CCAAT/enhancer-binding protein beta) expression, which in turn induces C/EBPα (CCAAT/enhancer-binding protein alpha) and PPARγ during adipocyte differentiation.35

Post-translational modifications can be seen as perturbations of the environment in which a protein exists. The effect of such a modification on a protein is a good test case for the hypothesis of the population shift of molecules, which proposes that external stimuli stabilize, and therefore populate, pre-existing but previously sparsely populated molecular states. In biological terms, a population of protein conformers may be shifted towards malfunctioning, disease-related conformation(s); that is, the sustained perturbation of protein structure is a potential trigger for changes in molecular networks associated with specific disease phenotypes.

Choi et al.20 studied the role of PPARγ in the context of post-translational modification, that is, the phosphorylation of serine 273 in the LBD of PPARγ; a number of genes were identified the differential expression of which is caused by a change in the conformation of the PPARγ–RXRα dimer when serine 273 is phosphorylated. This modification was shown to influence gene expression of disease (obesity, insulin resistance)-relevant genes independently from the general receptor transcriptional activity.11

To study the effects of perturbation on a structural level, we performed an MD simulation to examine the impact of phosphorylation on the PPARγ–RXRα dimer, demonstrating that phosphorylated PPARγ destabilizes the PPARγ–RXRα dimer, which may be a mechanism explaining the altered transcriptional regulation of PPARγ target genes. Interestingly, phosphorylation at another residue (Ser112) modulates PPARγ activity with positive effects on glucose and lipid metabolism in vivo, possibly by decreasing its affinity for endogenous ligands.36 This observation corresponds well to our finding of a destabilized PPARγ–RXRα dimer.

Furthermore, we analyzed perturbations of biological networks associated with changes in PPARγ, by examining the DEGs comparing wild-type phosphorylated PPARγ with mutant non-phosphorylated PPARγ. A network was constructed connecting these DEGs using the biomedical literature-based ResNet mammalian database. This database only includes interactions previously reported in humans, mouse or rat. Therefore, it can be expected that there will be little bias introduced by orthologs from exotic species, that is, species distant to mouse in the phylogenetic tree.

We found an enrichment of bi-stable switches in the wild-type phosphorylated state containing genes that significantly change their expression levels in an adipogenesis-dependent manner. Of particular interest were the genes with the highest betweenness centrality: HIF1A, EGR1, STAT1 and CXCL12. These genes are all overexpressed in the putative disease state and all have been previously described to have a role in adipogenesis.23, 24, 25, 26 Betweenness centrality is a measure of the influence that a node has over the spread of information through the network,37 and our finding that the hub genes with the highest levels of betweenness centrality consist of three transcriptional regulators and one chemokine is consistent with them functioning as information propagators in the network.

The bi-stable motifs we identified assemble an interconnected core network of genes, which have two steady-stable states corresponding to gene functionality states. This stable core network gives robustness to the cellular network by keeping the system in the non-phosphorylated putative healthy state; however, once the system is pulled out of this state, it stabilizes the network in a phosphorylated putative disease-related state.

Materials and Methods

Molecular modeling

The therapeutic ligand present in the intact structure of the PPARγ–RXRα–DNA complex (PDB ID: 3DZY)21 was replaced with docosahexaenoic acid, which is a natural ligand of PPARγ. MD simulations were performed with Gromacs38 and served as a tool to generate perturbed states of proteins. Implicit-solvent generalized Born simulations with the Amber99sb-ildn force field were carried out for wild-type and mutant proteins: energy minimizations were performed and the production runs were 50 ns in length.

Network analysis

For our analysis, we extracted DEGs from the results of gene expression analysis experiments by Choi et al.,20 in which PPARγ-null mouse embryonic fibroblasts were transfected with wild-type PPARγ (phosphorylated) or the S273A PPARγ mutant (not phosphorylated). The cutoffs for selecting the DEGs were uncorrected P-values ≤0.008 and corrected false discovery rate P-values ≤0.15. This resulted in 577 DEGs, which we used in our subsequent analysis.

Subsequently, the ResNet mammalian database from Ariadne Genomics (http://www.ariadnegenomics.com/) was used to construct an interaction network of 235 DEGs with directed interactions (the ‘global’ network). This database includes biological relationships and associations, which have been extracted from the biomedical literature using Ariadne's MedScan technology. MedScan processes sentences from PubMed abstracts and produces a set of regularized logical structures representing the meaning of each sentence. The ResNet mammalian database stores information harvested from the entire PubMed, including more than 715 000 relations for 106 139 proteins, 1220 small molecules, 2175 cellular processes and 3930 diseases. The focus of this database is solely humans, mouse and rat.

Motif detection was performed using the FANMOD algorithm39 for the global network and limited to motifs consisting of 2, 3, 4 or 5 nodes each. Each resultant topology was analyzed in comparison with 1000 separately randomized versions of the initial network. Using both the original and the randomized versions, z-scores and P-values could be calculated for all motifs discovered in the original network. The z-score of a motif is its frequency in the original network minus its mean frequency in the randomized networks, divided by the standard deviation of its frequency in the randomized networks. The P-value of a motif is the number of random networks in which it occurred more often than in the original network, divided by the total number of random networks. For the bi-stable switch motifs we identified, we also examined the expression profile of the genes in each motif to ensure that the expression profile was consistent in all motif members. We constructed a network (the ‘core’ network) consisting of 39 DEGs that occur in bi-stable motifs, which are significantly overrepresented in the global network and the genes of which are in the stable state, matching the experimental expression values.
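The z-score and P-value definitions given above can be written out as a minimal sketch. The function name, the choice of the population standard deviation and the toy counts are assumptions made for illustration; they are not taken from FANMOD, and the analysis described here used 1000 randomized networks rather than eight.

```python
import statistics

def motif_significance(observed, random_counts):
    """Return (z-score, empirical P-value) for one motif type.

    observed      : motif frequency in the original network
    random_counts : motif frequency in each randomized network
    """
    mean = statistics.mean(random_counts)
    # Assumption: population standard deviation over the random ensemble.
    sd = statistics.pstdev(random_counts)
    if sd == 0:
        raise ValueError("z-score undefined: no variation in random networks")
    z = (observed - mean) / sd
    # Fraction of random networks in which the motif occurs more often.
    p = sum(1 for c in random_counts if c > observed) / len(random_counts)
    return z, p

# Toy ensemble of eight randomized networks (assumed counts).
toy_random = [2, 4, 4, 4, 5, 5, 7, 9]
toy_z, toy_p = motif_significance(8, toy_random)
```

With 1000 randomized networks, the smallest non-zero P-value that this estimate can resolve is 0.001.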

To compute the attractors of the core network, we used the program SQUAD (www.enfin.org).40 The program converts the network into a continuous dynamical system based on ordinary differential equations. In the absence of detailed kinetic parameters, the program interpolates a sigmoid curve between the states completely ON and completely OFF for each node. SQUAD first calculates the steady states found in a discrete dynamical system (Boolean model) and then uses these states as a guide to localize the steady states in the continuous model. Perturbations were also simulated using SQUAD. Each perturbation is a single pulse that changes the state of the node from 0 to 1 in the OFF attractor or from 1 to 0 in the ON attractor as initial states of the system.
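The logic of the attractor search and the pulse perturbations can be illustrated with a purely Boolean toy model. This sketch is not SQUAD and does not reproduce its continuous, sigmoid-based equations; the four-node wiring, the OR update rule and all names below are hypothetical and were chosen only to show how a single pulse can switch a bi-stable cluster from OFF to ON but not back.

```python
def step(state, activators):
    """Synchronous update: a node is ON if any of its activators is ON."""
    return {n: int(any(state[a] for a in activators[n])) for n in state}

def settle(state, activators, max_steps=50):
    """Iterate the update rule until a fixed point is reached."""
    for _ in range(max_steps):
        nxt = step(state, activators)
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("no fixed point reached (possible cycle)")

def pulse(attractor, node, activators):
    """Flip one node of an attractor once, then let the system relax."""
    perturbed = dict(attractor)
    perturbed[node] = 1 - perturbed[node]
    return settle(perturbed, activators)

# Hypothetical wiring: A, B and C activate one another;
# D only receives input from A and regulates nothing.
activators = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"], "D": ["A"]}
OFF = {n: 0 for n in activators}
ON = {n: 1 for n in activators}

after_a = pulse(OFF, "A", activators)   # pulse on a node inside the loop
after_d = pulse(OFF, "D", activators)   # pulse on a node with no outputs
back_a = pulse(ON, "A", activators)     # attempt to switch the cluster off
```

In this toy system a pulse on A drives every node to ON, a pulse on D decays back to OFF, and switching A off in the ON attractor is reversed by the remaining active nodes, which parallels the qualitative behaviour reported for the core network.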

The DEGs of the core and global networks were divided into groups by topological modularization using the Newman–Girvan algorithm. The participation (P) of a node i in inter-modular communication is calculated as follows:

$$P_i = 1 - \sum_{s=1}^{N_M} \left( \frac{K_s}{k_i} \right)^2$$

where N_M=number of modules, K_s=number of connections of node i with module s, k_i=total degree of node i. Betweenness centrality was calculated with the igraph library in R for the core and global networks. The participation coefficient and betweenness centrality results for the core and global network groups were compared with the Mann–Whitney–Wilcoxon test to check the statistical significance of the differences.
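As an illustration of the participation coefficient, the sketch below assumes the standard form P_i = 1 − Σ_s (K_s/k_i)², where K_s is the number of links from node i to module s and k_i is the total degree of node i. The function name, the data layout and the toy network are hypothetical and do not come from the original analysis.

```python
def participation(node, neighbors, module_of):
    """Participation coefficient of one node.

    neighbors : dict mapping each node to a list of its neighbors
    module_of : dict mapping each node to its module label
    """
    k_i = len(neighbors[node])
    if k_i == 0:
        return 0.0  # convention assumed here for isolated nodes
    # Count the links from this node into each module.
    links_per_module = {}
    for nb in neighbors[node]:
        m = module_of[nb]
        links_per_module[m] = links_per_module.get(m, 0) + 1
    return 1.0 - sum((k_s / k_i) ** 2 for k_s in links_per_module.values())

# Toy network (assumed): 'hub' bridges two modules, 'leaf' stays in one.
toy_neighbors = {
    "hub": ["a1", "a2", "b1", "b2"],
    "leaf": ["a1", "a2"],
}
toy_modules = {"a1": 1, "a2": 1, "b1": 2, "b2": 2, "hub": 1, "leaf": 1}
p_hub = participation("hub", toy_neighbors, toy_modules)
p_leaf = participation("leaf", toy_neighbors, toy_modules)
```

A node whose links all fall within a single module scores 0, whereas a node whose links are spread evenly over two modules scores 0.5; values close to 1 would require links spread over many modules.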

We also examined these genes in more detail with regard to their association with adipogenesis, for which PPARγ is the master regulator under any condition, to confirm that our core network is consistent with experimental studies.