Introduction

Insulin resistance (IR) and obesity are major risk factors for a range of diseases including type 2 diabetes (T2D) and cardiovascular disease. However, some obese people are as insulin sensitive as lean individuals,1–3 whereas a subgroup of IR individuals are lean.4,5 Hence, there is a need to improve diagnosis of IR because lifestyle changes can prevent or delay such disease onset. However, this is a challenge as IR persists for many years without obvious perturbation in glucose homeostasis and so IR individuals may not present to their physician before T2D or cardiovascular disease onset. This is compounded by the fact that current clinical measures such as body mass index, waist/hip ratio, and fasting glucose have limited diagnostic value.

The most accurate methods for assessing IR, such as the hyperinsulinemic-euglycaemic clamp6 or surrogate measures such as liver lipid analysis, are not feasible in routine clinical use. Metabolite biomarkers that correlate with insulin sensitivity and T2D have been reported,7–11 but their diagnostic value is unclear. Gene expression (GE) represents a potential alternative for measuring disease risk as it reflects an integrated response of multiple system parameters. For example, where the functional consequence of a single-nucleotide polymorphism (SNP) is altered transcription, GE can accurately predict an individual’s SNP genotype,12 which may be the true determinant of disease risk.

Although gene expression has been explored as a biomarker for IR, this has been limited by the inability to detect large numbers of differentially expressed genes.13 This could be explained by a range of factors: IR could represent a multitude of diseases; GE could be too dynamic, with an underlying rhythmicity that masks IR-specific changes; or technical variance may exceed biological variance. Additional approaches are required to overcome these potential barriers. One approach to circumvent variation in human data is the use of cross-species data integration.14–16 The rationale here is that model systems can be used to recapitulate human diseases with much less variance than is found in human data sets. Robust changes in GE in such model systems can then be tested as a positive training set in human data. This will reveal shared features between the model system and human disease and aid identification of human disease-associated signatures that would not have been isolated from analysis of human data alone.

High-fat diet (HFD)-fed rodents are widely used as models for human IR.17–20 As in humans, exposure of rodents to HFD results in increased adiposity, glucose intolerance, hyperinsulinemia and peripheral IR in muscle, adipose tissue, and liver.17–24 This is comparable to phenotypes seen in prediabetic humans.

We have conducted a proof of principle study to evaluate the feasibility of developing a molecular profile using cross-species muscle GE analysis that can accurately diagnose IR in humans. We describe a gene expression motif (GEM) comprising 93 genes that substantially improves IR diagnosis when tested in 115 individuals across 3 separate studies. In addition, we identified pathways including β-catenin (Wnt Signaling) and Jak1 (Jak-Stat Signaling) to be up-regulated in muscle biopsies of insulin resistant individuals. Inhibition of these proteins in muscle cells showed significant defects in glucose uptake indicating an important role in muscle insulin action.

Materials and Methods

The following sections are contained within the Supplementary Information.

Study design

Supplementary Methods 1.1.

Human subject and mouse phenotype studies

Supplementary Methods 1.2.1.

Microarray studies

Supplementary Methods 1.2.2-1.2.5.

Statistical analyses

Supplementary Methods 1.3-1.9.

Experimental validation

Supplementary Methods 1.10.

R-code for data analysis

(SI.zip: Main_Analysis_Chaudhuri_et_al.zip).

Results

Our goal was to identify a GE signature that could accurately diagnose whole body insulin sensitivity. This is important because IR is one of the earliest determinants of T2D. We designed our study with the following innovative attributes: (i) use of state of the art methods to define metabolic status, including a hyperinsulinemic glucose clamp (ClampIS), the gold standard for measuring whole body insulin sensitivity, and liver fat; (ii) because IR is associated with obesity but many obese people are not IR, we wanted to identify a signature that predicts IR independently of obesity, and therefore studied individuals who were either obese insulin sensitive or matched for adiposity with IR; and (iii) we sought a signature that predicted IR and not T2D per se because of the complications associated with T2D, such as hyperglycemia. Because of inherent variance in human GE data we first identified IR-associated genes and pathways in a mouse IR model and used these data in a weighted analysis of the human GE data (see Figure 1 for workflow summary and Supplementary Methods (SM) 1.1 for details).

Figure 1

Study workflow. The diagram shows the gradual inclusion of different types of data sets (gene expression, human clinical data such as office-based clinical measures (OCM) and extensive metabolic phenotypic parameters (EMP), external gene expression data sets from publicly available databases, biological pathways and protein–protein interaction networks) into the workflow at different stages of the novel analysis approach presented here and their respective outcomes.

Five days of HFD was sufficient to cause glucose intolerance in mice24 (Supplementary Figure S1A and B). Muscle GE was measured after 0, 5 and 42 d of HFD feeding. IR develops rapidly in liver24 and more slowly in muscle, reaching a plateau by 1 or 3 weeks on HFD, respectively. Hence, this time course of HFD feeding was selected to enable identification of genes that do not just correlate with IR but that may also be causal. A cohort of 81 age-matched humans was selected to comprise four distinct groups: obese insulin sensitive (OIS), obese IR (OIR), T2D and lean.25 Human GE data were derived from muscle biopsies of a subset of these individuals. The OIS, OIR and T2D groups had higher adiposity, body mass index, waist/hip ratio and visceral adiposity than lean subjects.25 Lean and OIS groups were more insulin sensitive than OIR and T2D groups, as measured by the clamp or non-oxidative glucose disposal (NOGD), a semi-quantitative estimate of glycogen synthesis, and fasting insulin (Supplementary Figure S1C–K and SM 1.2; in silico regression analysis reproduces this, as shown in Supplementary Figure S2).

Cross-species integrated analysis identifies 129 genes implicated in IR

It has been previously reported that there is limited signal in GE data from human muscle from which to generate a signature of muscle IR.13 Therefore, we set out to use a cross-species integration approach, using GE data from a mouse model of insulin resistance to guide analysis of the human GE data. A weighted analytic approach was performed to integrate human-mouse GE data, whereby differentially expressed mouse genes were assigned prior importance and tested for differential expression in human data. Two stages of analysis were implemented to identify differentially expressed genes under conditions of IR/T2D and obesity: (a) individual level analysis, through a standalone human GE analysis, a weighted linear model where significantly altered mouse genes are tested in human data and MA-plots in human GE. Supplementary Results (SR) 1.1A and SR 1.2 provides details for all 43 genes altered at the single gene level in humans (Figure 2a, b); (b) group level assessment, such as Gene Set Test on human GE profiles using co-regulated gene clusters from mouse data (Figure 2c, d). Further details can be found in SR 1.1B. These analyses resulted in identification of 129 differentially expressed genes (43 from (a) and 86 unique genes from (b)) (Supplementary Data S1). Pathway analysis of this gene set revealed an enrichment of the insulin signaling pathway and other pathways linked to insulin action and IR, including PPAR signaling,26,27 sphingolipid and ceramide metabolism,28 amino acid metabolism,11 and TCA cycle29 (Supplementary Data S3).

Figure 2

Single gene and gene-set cross-species analysis reveals 129 genes implicated in IR and/or obesity. Red indicates upregulation and blue indicates down-regulation of genes. (a) Transformation of statistical significance (P value) of genes from mouse DE analysis to standardized weights. (b) The fold change (log2 scale) of genes across the different comparisons from the single gene-by-gene analysis in humans. A fold-change threshold of 1.5, a significance threshold of 0.05 and average gene expression>7 were used for identifying DE genes from the human analysis. (c) Tight clustering of co-regulated genes in the mouse time course GE data identified 10 tight clusters. (d) The proportion, magnitude, and direction of regulation of these 10 gene clusters from mouse GE data in humans across the six group comparisons. All heatmaps were generated using R package gplots (http://CRAN.R-project.org/package=gplots). DE, differentially expressed; GE, gene expression; IR, insulin resistance.
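The three human DE thresholds quoted in the Figure 2 caption can be illustrated with a minimal Python sketch. This is not the authors' code (their analysis was written in R and is provided in the supplementary archive). It assumes that the three filters are applied jointly, that the fold-change threshold of 1.5 is on a linear scale, and that the threshold is inclusive; the gene names and values are hypothetical.

```python
import math

# Thresholds quoted in the Figure 2 caption. How they are combined, and the
# scale of the fold-change threshold, are assumptions made for this sketch.
FOLD_CHANGE = 1.5     # assumed linear-scale fold change
P_CUTOFF = 0.05       # significance threshold
MIN_AVG_EXPR = 7.0    # average expression must exceed this value

def is_differentially_expressed(log2_fc, p_value, avg_expr):
    """Return True if a gene passes all three (assumed conjunctive) filters."""
    return (abs(log2_fc) >= math.log2(FOLD_CHANGE)
            and p_value < P_CUTOFF
            and avg_expr > MIN_AVG_EXPR)

# Hypothetical genes: (log2 fold change, P value, average expression)
genes = {
    "geneA": (0.90, 0.010, 8.2),   # passes all filters
    "geneB": (0.20, 0.001, 9.0),   # fold change too small
    "geneC": (-1.10, 0.200, 8.5),  # not significant
    "geneD": (-0.75, 0.030, 6.1),  # expressed too weakly
}

selected = [name for name, values in genes.items()
            if is_differentially_expressed(*values)]
print(selected)
```

Only the hypothetical `geneA` survives, because each of the other three fails exactly one filter.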

Optimization

We performed weighted gene co-expression network analysis30,31 to optimize these 129 genes by defining modules/clusters with similar expression patterns and relating them to benchmark phenotypic traits for OIS and OIR individuals only, leading to identification of 12 distinct clusters (Figure 3a). We next considered how well these 12 modules correlated with metabolic traits: ClampIS/NOGD, which are quantitative measures of whole body insulin sensitivity; visceral adiposity and liver fat,32 which correlate with insulin resistance; and Homeostasis Model Assessment - Insulin Resistance (HOMA-IR),33 which is a crude measure of insulin sensitivity. Six gene modules comprising 90 genes were positively correlated with ClampIS/NOGD (* in Figure 3a). No module correlated with liver fat. We also calculated the individual gene-phenotype correlations for each gene; 64 were significantly associated with at least one clinical trait (Figure 3b, Supplementary Data File S1) and 48 correlated with whole body insulin sensitivity measured by clamp (details in SR 1.3).

Figure 3

Gene-network and single gene level correlation analysis between clinical EMP measures and the gene sets obtained from our integrative analysis. Red indicates positive correlation and blue indicates negative correlation. (a) The 12 gene modules obtained from clustering the 129 genes using weighted gene co-expression network analysis are represented through the 12 colors that index the rows to the matrix (right). Correlation coefficient (value on top) and the P value significance (bottom value) of each of the 12 clusters with four clinical measures (ClampIS, NOGD, HOMA-IR, visceral fat (L2/3 and L4/5 cm2) and liver fat) are shown. Six clusters (turquoise: 25 genes, brown: 19 genes, black: 10 genes, blue: 20 genes, yellow: 13 genes and gray: 12 genes) that significantly correlate (i.e., ρ>0.55 and P value<0.05) with the EMP measures such as ClampIS and NOGD are highlighted (*). The brown module is expanded to list the genes contained within this module and their GE pattern across the obese insulin sensitive and obese insulin resistant (OIS and OIR) subjects in the heatmap (left). (b) The magnitude of correlation is shown by the size of the circles of the 64 single genes within 129 DE gene set that reached significance (ρ>0.55 and P value<0.05) with the clinical measures of interest, as shown in the correlation plot generated using package corrplot in R. EMP, extensive metabolic phenotypic parameters; GE, gene expression.

Overall, we identified six candidate gene modules from the weighted gene co-expression network analysis and five individual genes (DMD, MAOA, PFKFB3, XIST and SLC1A4) from a linear regression analysis using ClampIS as response variable34 (SR 1.3). This subset of genes was closely related to whole body insulin sensitivity for OIS and OIR subjects. This optimized set of 93 genes is referred to as GEM. The functional enrichment of GEM is shown in Supplementary Data S3.
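The gene-phenotype correlation screen used in this section (ρ>0.55 and P value<0.05, Figure 3) can be sketched in Python as follows. This is an illustration only and not the authors' R implementation. It assumes a rank-based (Spearman) correlation, a permutation-based P value, and a filter on the absolute value of ρ (the figure shows both positive and negative correlations); none of these choices is confirmed by the text. The expression and clamp values are hypothetical.

```python
import random

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation P value for the Spearman correlation."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: one gene's expression and ClampIS in eight subjects.
expression = [5.1, 5.9, 6.4, 7.0, 7.8, 8.3, 9.1, 9.6]
clamp_is = [20, 24, 23, 30, 33, 31, 40, 42]

rho = spearman(expression, clamp_is)
p_value = permutation_p(expression, clamp_is)
passes = abs(rho) > 0.55 and p_value < 0.05
print(round(rho, 3), passes)
```

The same two functions could be applied to a module summary profile instead of a single gene; how the modules were summarized in the original analysis is described in the Supplementary Information rather than here.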

Validation

To quantify the utility of the information contained within the GEM as a diagnostic for IR, we undertook a systematic validation approach to measure its performance as a signature to segregate individuals based on their whole body insulin sensitivity (see Supplementary Methods (SM) 1.7–1.8). The prediction accuracy of whole body insulin sensitivity measured by clamp is 100%, as the OIS and the OIR individuals were segregated based on these criteria, and the random outcome is ~50%. Thus, the expected performance ratio (prediction accuracy/random) of the clamp in classifying OIR from OIS people is ~2 (i.e., 100/50) (Figure 4), whereas a ratio of 1 corresponds to random chance. In our simulations, whole body insulin sensitivity (ClampIS/NOGD) reached a median performance ratio of 1.94. The routine clinical measures (waist/hip ratio, blood sugar level and body mass index) had a median ratio of ~1.2, while the more sophisticated clinical measures (liver density, visceral fat and HOMA-IR) had a value of ~1.33. As the GEM was derived from the TonksS1 GE data, we tested its prediction accuracy in TonksS2, an additional cohort of GE data from the same original study25 comprising a total of 16 individuals. The GEM prediction ratio was ~1.6, which was a considerable improvement over office-based clinical measures. To extend the validation of GEM we examined its ability to diagnose IR in 2 external GE data sets. The median performance ratio of GEM in both external GE data sets was ~1.65. Notably, GEM remained diagnostic outside of the training data set and outperformed alternate measures of insulin sensitivity (i.e., visceral fat and liver fat).
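The performance ratio defined above (prediction accuracy divided by the accuracy expected at random) reduces to simple arithmetic, shown here as a short Python sketch. The function name and the default chance level of 0.5 for a two-class problem are assumptions made for illustration; the cross-validation procedure that produces the accuracies is described in SM 1.7–1.8 and is not reproduced here.

```python
def performance_ratio(accuracy, chance=0.5):
    """Prediction accuracy divided by the accuracy expected by chance."""
    if not 0.0 < chance <= 1.0:
        raise ValueError("chance must be in (0, 1]")
    return accuracy / chance

def accuracy_from_ratio(ratio, chance=0.5):
    """Invert the ratio to recover the implied classification accuracy."""
    return ratio * chance

clamp_ratio = performance_ratio(1.00)    # clamp-defined groups: 100/50
random_ratio = performance_ratio(0.50)   # random guessing: 50/50
implied = accuracy_from_ratio(1.6)       # accuracy implied by a ratio of 1.6
print(clamp_ratio, random_ratio, implied)
```

Under the assumed 50% chance level, a ratio of ~1.6 corresponds to roughly 80% of individuals being classified correctly.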

Figure 4

Comparative evaluation of the diagnostic power of metabolic status by the gene sets obtained from our integration approach with routine clinical measures in internal and external GE data. The performance ratio of the three gene signatures (GEM, T2DKEGG, and genes from Väremo et al.) in three external GE data sets (n=115) when compared with office-based clinical measures (fasting blood sugar level, waist/hip ratio, and BMI), more sophisticated measures such as visceral fat, liver fat, and HOMA-IR and extensive metabolic phenotypic (EMP) measures of insulin sensitivity such as ClampIS and NOGD are shown. The median performance ratios reflect the performance of each gene set or clinical variable in classifying IR from IS individuals along with their confidence limits. The performance ratio is directly proportional to the classification accuracy of the features; a value of 1 indicates random chance. BMI, body mass index; GE, gene expression; GEM, gene expression motif; IR, insulin resistance; NOGD, non-oxidative glucose disposal.

We next compared the performance of the GEM to two external gene signatures: (i) a publicly available T2D gene set that was assembled by literature curation; and (ii) an external set of 12 T2D biomarkers from muscle recently reported by Väremo et al.35 GEM outperformed both of these signatures in distinguishing between OIR and OIS subjects (Figure 4). The median values for the diagnostic accuracy for all three gene signatures across the cross-validation rounds in the 3 GE data sets (Supplementary Table S1; external datasets described in Supplementary Data File S2) highlight the substantial difference between the gene-signature based median accuracy and random diagnosis in all data sets. There was good individual prediction concordance between GEM and Väremo et al., whereas T2DKEGG predicted poorly in two out of three GE data sets. There was only one common gene (HK2) between the 45-gene T2DKEGG set and the 93 GEM genes, and pathway analysis revealed one overlapping pathway (the insulin signaling pathway). There were no genes in common between GEM and genes from Väremo et al.35 Therefore, this validates GEM as an independent identifier of human IR.

To characterize the subject-specific classification response rate of GEM, we calculated the misclassification rate of all subjects in our internal data sets (TonksS1 and TonksS2) and refer to this as a GEM score (Supplementary Figure S3A). Subjects frequently misclassified by GEM (high GEM score) were of interest. These included B13 and D16 (TonksS1) and B6 and D17 (TonksS2). B13 and B6 were classified as OIR by clamp but OIS by GEM; D16 and D17 were reciprocally classified. B13 and B6 had the lowest visceral/subcutaneous fat ratios and highest NOGD in the OIR group, whereas the visceral/subcutaneous fat ratios for D16 and D17 were in the top 25% (Supplementary Figure S3D, E, G). Various studies have suggested that visceral fat is a more important contributor to IR than body mass index36,37 and that accumulation of subcutaneous fat provides protective effects against IR.38–40 Recently, several new genetic loci were reported that link insulin biology and IR to body fat distribution.41 The respiratory quotients during the clamp, a measure of metabolic flexibility, were the lowest in the OIS group for D17 and D16. Hence, despite being insulin sensitive, they displayed indications of metabolic impairment (Supplementary Figure S3F) (see SR 1.4 for other details). These findings suggest that GEM classification may take into account global metabolic dysfunction associated with IR, rather than just impaired whole body glucose clearance.
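The GEM score described above is a per-subject misclassification rate. A minimal Python sketch of that bookkeeping follows. It assumes that each cross-validation round yields a predicted label for the subjects held out in that round and that the clamp-based group is the reference label; the function name, subject identifiers and predictions are hypothetical and do not correspond to the study subjects.

```python
def gem_scores(true_labels, predictions_per_round):
    """Fraction of rounds in which each subject was misclassified.

    true_labels: dict mapping subject -> reference label (e.g., from clamp).
    predictions_per_round: list of dicts mapping subject -> predicted label;
        a subject may be absent from rounds in which it was not tested.
    """
    scores = {}
    for subject, truth in true_labels.items():
        calls = [rnd[subject] for rnd in predictions_per_round if subject in rnd]
        wrong = sum(1 for call in calls if call != truth)
        scores[subject] = wrong / len(calls) if calls else float("nan")
    return scores

# Hypothetical subjects and cross-validation predictions.
truth = {"S1": "OIR", "S2": "OIS", "S3": "OIS"}
rounds = [
    {"S1": "OIS", "S2": "OIS", "S3": "OIS"},
    {"S1": "OIS", "S2": "OIS"},
    {"S1": "OIR", "S2": "OIS", "S3": "OIR"},
    {"S1": "OIS", "S2": "OIS"},
]

scores = gem_scores(truth, rounds)
print(scores)
```

In this toy example the hypothetical subject `S1` has the highest score and would be flagged as frequently misclassified.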

Network level analysis

Our original gene set of 129 genes and GEM were enriched for several metabolic transcription factors, mitochondrial proteins and genes with SNPs associated with T2D/obesity or altered lipid and metabolite levels (SR 1.5). To further interrogate our gene sets we examined their connectivity with the insulin signaling pathway (ISP) proteins, as this is the dominant driver of insulin action in muscle. A network-based enrichment analysis42 identified genes such as JAK1, β-catenin (CTNNB1), 14-3-3-gamma (YWHAG), SCD, PPP3R1, NR3C1, PFKFB3, GRB14, ACLY and RETN (resistin) within the original 129 that potentially cross-talk with those comprising the ISP through a protein–protein interaction map (Figure 5a) (see SM 1.9 and SR 1.6).

Figure 5

The protein–protein interaction wheel between the insulin signaling pathway (ISP) and 129 DE genes highlights β-catenin and Jak1 to be top connectors and network bottlenecks. (a) The right hand side of the wheel lists all members of the 129 DE gene set and the left hand side of the wheel shows members of the ISP. At the bottom, a small section marked ‘OVERLAPPING WITH ISP’ shows the nine genes within our DE gene set that overlap with the 137 ISP genes. Thick blue lines on the right hand side of the wheel are indicative of high connectivity (degree) with the proteins in the ISP. The diagram highlights the interactions of β-catenin (CTNNB1) on the right hand side with the ISP genes in red; both β-catenin and Jak1 are labeled with arrows to indicate their extent of connectivity to the ISP genes. (b) The genes that comprise each of the six gene modules significantly correlated with EMP measures (represented by their module colors from weighted gene co-expression network analysis) are listed here. The bar plot quantitatively represents the degree of connectivity of each gene within these six GEM gene modules with the ISP. The hubs of each module are highlighted (*); if the hub is a member of the ISP then the next ranking non-ISP member is marked with (+). (c) Bottleneck or between-ness distribution of the genes comprising the GEM modules. Similar to (b), top ranking bottlenecks within each module are marked with (*) and second ranking non-ISP members are marked with (+).

Next, we focused on genes comprising the GEM. Only the hub proteins (degree) in the yellow (β-catenin: member of Wnt signaling pathway) and gray (Jak1: member of Jak-Stat pathway) modules were not members of the ISP and hence novel nodes in insulin action (SM 1.9) (Figure 5b). Similarly, we identified five novel bottlenecks: β-catenin, 14-3-3-gamma, PFKFB3, GRB14, and APOA2 (Figure 5c). Although there is extensive evidence relating the above proteins to IR (SR 1.7), the link between Jak1 and β-catenin and IR is relatively unexplored. This prompted us to interrogate the role of these proteins in insulin action in muscle cells.
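The notion of a hub, i.e., the gene with the highest degree of connectivity to ISP proteins, can be illustrated with a small Python sketch. It assumes an undirected interaction list and counts, for each non-ISP protein, its distinct partners within the ISP. The node names and edges are hypothetical placeholders, not real interactions, and the function name is introduced only for this example; the authors' network analysis is described in SM 1.9.

```python
def isp_degree(edges, isp_members):
    """Number of distinct ISP partners for every non-ISP protein."""
    partners = {}
    for a, b in edges:
        # Treat each interaction as undirected by checking both orientations.
        for src, dst in ((a, b), (b, a)):
            if src not in isp_members and dst in isp_members:
                partners.setdefault(src, set()).add(dst)
    return {protein: len(found) for protein, found in partners.items()}

# Hypothetical toy network: I1..I3 stand for ISP members, G1..G2 for module genes.
isp = {"I1", "I2", "I3"}
interactions = [
    ("G1", "I1"), ("G1", "I2"), ("G1", "I3"),
    ("G2", "I1"), ("G2", "G1"), ("I1", "I2"),
]

degrees = isp_degree(interactions, isp)
hub = max(degrees, key=degrees.get)
print(degrees, hub)
```

Interactions between two ISP members or between two non-ISP genes do not contribute to the count, which is why the hypothetical `G2` scores 1 despite having two edges.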

β-catenin and Jak1 have a role in insulin action

We tested in silico whether Jak1 and β-catenin were the bottlenecks of communication between their respective source pathways, the Jak-Stat and Wnt signaling pathways, and the ISP (SM 1.9 and Supplementary Figure S4). Through between-ness distributions of the protein interaction network of Wnt signaling and the ISP, we determined β-catenin to be the top bottleneck of communication (Figure 6a). Similarly, Jak1 was one of the top five most important proteins enabling interactions between the Jak-Stat pathway and the ISP (Figure 6b).
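A bottleneck in this sense is a node with high between-ness, i.e., one that lies on many shortest paths between other nodes. The following Python sketch computes unnormalised shortest-path between-ness for an undirected, unweighted graph using Brandes' algorithm. It is a generic illustration rather than the authors' implementation, and the toy graph (two small groups joined through a single node `X`) is hypothetical.

```python
from collections import deque

def betweenness(adjacency):
    """Unnormalised between-ness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        stack = []
        preds = {v: [] for v in adjacency}   # predecessors on shortest paths
        sigma = {v: 0 for v in adjacency}    # number of shortest paths
        dist = {v: -1 for v in adjacency}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:                         # breadth-first search
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adjacency}
        while stack:                         # accumulate dependencies
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected path is counted from both endpoints, so halve the totals.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical toy graph: W1-W2 and I1-I2 are linked pairs joined only via X.
graph = {
    "W1": ["W2", "X"],
    "W2": ["W1", "X"],
    "X": ["W1", "W2", "I1", "I2"],
    "I1": ["X", "I2"],
    "I2": ["X", "I1"],
}

scores_b = betweenness(graph)
bottleneck = max(scores_b, key=scores_b.get)
print(scores_b, bottleneck)
```

All four shortest paths between the `W` and `I` nodes pass through `X`, so `X` is the only node with non-zero between-ness in this example.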

Figure 6

Detailed network maps of Wnt signaling and Jak-Stat signaling pathways with the insulin signaling pathway reveal strong connections of β-catenin and Jak1 to selected insulin signaling proteins; experimental validation of these two proteins provides evidence of their role in muscle insulin action. (a and b) The between-ness distributions of proteins in the Wnt signaling and insulin signaling pathway (ISP) in (a) and the distribution of Jak-Stat and ISP in (b). Both β-catenin and Jak1 are among the top communication bottlenecks in these networks. (c and d) Edge between-ness derived graph communities (>5 members) in the Wnt:ISP and Jak-Stat:ISP networks are visualized through the different colors. The first-degree neighbors of β-catenin are expanded in (c) and Jak1 in (d), where the size of the nodes reflects the degree of connection and the red circles denote members of the ISP. (e and f) Experimental validation of Jak1 and β-catenin in insulin stimulated glucose uptake assays. 2-deoxyglucose (2DOG) uptake into L6 muscle cells was measured in the absence or presence of insulin. 2DOG uptake in the absence or presence of Jak1 inhibitor GPLG0634 (e) or the β-catenin inhibitor pyrvinium (f).

We further interrogated the Wnt:ISP and Jak-Stat:ISP networks by detecting community structures within them through a hierarchical decomposition process (SM 1.9). We focused on the communities that contained β-catenin and Jak1, respectively, and obtained a map of their first-degree neighbors (Figure 6c, d). We observed rich connections between β-catenin and insulin signaling proteins (red circles) such as GSK3B, FOXO1, PTPN1, PIK3R1, PTPRF, and IKBKB. Similarly, through the first-degree neighbor map of Jak1, we found dense connections with key insulin signaling proteins such as INSR, IRS1/2, GRB2, SH2B2, RAF1, PRKCZ, and PIK3R1. Using gene-set tests, we also found the Wnt and Jak-Stat Signaling Pathways to be up-regulated in OIR compared with OIS individuals in both our internal GE data sets (TonksS1: Wnt P value 8.86e-06; Jak-Stat 0.1 and TonksS2: Wnt 0.1, Jak-Stat 0.006). Based on these observations we predicted that β-catenin and Jak1 likely have an important role in skeletal muscle insulin action. To test this, we examined the consequences of perturbing these nodes on insulin action in L6 muscle cells (Figure 6e, f). Treatment of L6 myotubes with pyrvinium pamoate, an inhibitor of β-catenin and Wnt signaling,43 inhibited insulin stimulated 2-deoxyglucose uptake (Figure 6e). Similarly, treatment with the Jak1 inhibitor GPLG0634,44 inhibited insulin stimulated 2-deoxyglucose uptake in a dose-dependent manner (Figure 6f). These effects were not due to direct inhibition of glucose transporter activity (Supplementary Figure S5 and SM 1.10). This highlights the power of such approaches for identifying not only novel disease signatures but also novel regulatory nodes that act upstream of well established processes.

Discussion

IR is one of the earliest risk factors for metabolic disease, yet clinically it often remains undiagnosed. Most methods for assessing IR are either expensive or too specialized, emphasizing the need for improved and simpler approaches. Variation in human GE data can preclude identification of GE signatures that describe a phenotype. To circumvent this we devised a cross-species integration framework. Here robust GE changes in mice progressing to IR were used to augment GE differences in humans. This enabled identification of a 93-gene expression motif that diagnoses individuals with IR more accurately than commonly used clinical measures both in internal and external GE data sets (Figure 4). One of the key regulatory nodes associated with this GEM was the β-catenin pathway.

A major confounder for IR diagnosis is obesity; while many IR people are obese this is not always the case. To circumvent this problem, in the present study we compared OIS with OIR, enabling us to exclude a potential role for obesity per se. GEM was able to predict outliers within the obese groups that had been classified based on the hyperinsulinemic euglycemic clamp. For example, the GEM score identified two individuals who were classified as OIR by clamp and OIS by GEM; they had a very low visceral to subcutaneous adipose tissue ratio, which is indicative of insulin sensitivity.36–40 Such features can now act as a future predictor of the performance of GEM in patient-specific diagnosis, i.e., individuals with alternate clinical features such as low visceral/subcutaneous fat ratio and borderline NOGD are likely to be diagnosed as OIS instead of OIR by GEM. Hence, these individuals may represent a discrete IR subgroup(s). It will also be exciting to determine if these subgroups also segregate based on other features such as lifestyle or genetics. Overall, this suggests that multi-gene signatures like GEM may be diagnostic of combinatorial features, making this approach potentially more clinically useful than single physiological measures.

The ability of GEM to identify subgroups of IR individuals also emphasizes the multifactorial/heterogeneous nature of IR, the clinical manifestations of which remain to be explored. It is important to point out that the GEM described here emanated from a cross-species integrated approach using HFD-fed mice. However, it is well established that many other perturbations can trigger IR, including exposure to dietary components other than fat, physical inactivity and steroids. Thus, it will be of interest to determine if GEMs driven by alternate mouse models can segregate different forms of IR in humans.

GEM arose from an unbiased systems biology approach and so it is of interest to explore the inherent biology within this signature. GEM possesses a number of genes and pathways previously implicated in IR and/or insulin action (SR 1.5). This validates this approach supporting the utility of GEM not just as a classifier but also as a discovery tool. Using network analysis we identified β-catenin and Jak1 as two potentially novel components of the IR nexus and a role for both of these pathways in insulin action was confirmed using functional analysis in muscle cells in vitro.

Jak1 has previously been implicated in adipocyte IR,45 and Jak-Stat signaling has been proposed to be involved in the development of IR in cardiac myocytes;46 to our knowledge this is not the case for β-catenin. The expression of β-catenin was highly correlated with the strongest phenotypic predictors of IR, such as whole body insulin sensitivity, at the single gene level (Figure 3b). In humans, abnormal Wnt signaling has been associated with early onset obesity and T2D.47 Recently, it has been shown that loss of β-catenin renders mice resistant to HFD due to increased energy expenditure and insulin sensitivity due to hyperactivity.48 Further, β-catenin binds to, and regulates the activity of, both FOXOs and T-cell factor (TCF) transcription factors.49,50 In addition, SNPs in the TCF7L2 gene, which is the transcription factor that regulates β-catenin, significantly increase the risk of developing T2D.51,52 Intriguingly, TCF7L2 was originally thought to be a beta cell specific gene that controls insulin secretion.52,53 However, subsequent studies have found that TCF7L2 principally functions as a regulator of insulin action in liver and possibly other tissues.54,55 This is important, as overall most SNPs associated with T2D are thought to principally have a role in beta cell dysfunction and very few SNPs have been found in genes that control insulin action in tissues such as muscle. As we observed β-catenin to be a major regulatory node in a GEM associated with IR, we speculate that this pathway may represent a major biomarker for defects in insulin action.

In conclusion, systems biology approaches as described here involving cross-species data integration have enormous potential in preventive medicine. Longitudinal study designs will be required to evaluate the true worth of such approaches and their application in the clinic. The value of this GE motif also needs to be ascertained in a more readily available clinical sample such as peripheral blood cells. Nevertheless, from this cross sectional analysis of a relatively modest number of subjects we delineated a multi-gene signature that encapsulates a complex array of biological pathways to diagnose IR with comparable accuracy to the best currently available physiological/clinical methods. This justifies more focus on systems biology approaches to tackle the cause and prevention of complex metabolic diseases.

Data availability

The microarray data for the human and mouse study have been deposited in the Gene Expression Omnibus under identifiers GSE73034 and GSE73036, respectively (both part of super series reference GSE73037).