Meta- and cross-species analyses of insulin resistance based on gene expression datasets in human white adipose tissues

Ample evidence indicates that insulin resistance (IR) is closely related to white adipose tissue (WAT), but the underlying mechanisms of IR pathogenesis are still unclear. Using 352 microarray datasets from seven independent studies, we identified a meta-signature which comprised of 1,413 genes. Our meta-signature was also enriched in overall WAT in in vitro and in vivo IR models. Only 12 core enrichment genes were consistently enriched across all IR models. Among the meta-signature, we identified a drug signature made up of 211 genes with expression levels that were co-regulated by thiazolidinediones and metformin using cross-species analysis. To confirm the clinical relevance of our drug signature, we found that the expression levels of 195 genes in the drug signature were significantly correlated with both homeostasis model assessment 2-IR score and body mass index. Finally, 18 genes from the drug signature were identified by protein-protein interaction network cluster. Four core enrichment genes were included in 18 genes and the expression levels of selected 8 genes were validated by quantitative PCR. These findings suggest that our signatures provide a robust set of genetic markers which can be used to provide a starting point for developing potential therapeutic targets in improving IR in WAT.


Functional annotation of IR using gene set enrichment analysis.
To determine the biological functions related to the results of the meta-analysis in human IR datasets, we performed Gene Ontology (GO) analysis using gene set enrichment analysis (GSEA) based on Z-scores from the meta-analysis. GO analysis provides functional annotation and categorization into 3 domains: cellular components, biological processes, and molecular function. The results showed that most immune system activity, inflammation, and some lysosome-and extracellular region-related terms were positively enriched, while almost all mitochondrial terms were negatively enriched ( Fig. 2A). In addition, we found that epithelial mesenchymal transition (EMT) (MSigDB: hallmark) was positively enriched, while valine, leucine and isoleucine degradation (KEGG: hsa00280) was negatively enriched ( Confirmation of the robustness of the meta-signature using cross-species analysis. To address whether the meta-signature is a set of robust genetic markers of IR, cross-species analysis was conducted using GSEA. Because GSEA allows mapping from various platforms, obtained from other animal models to human gene symbols, our human IR meta-signature gene sets, consisting of 824 up-regulated genes and 571 down-regulated genes, were applied to microarray studies on other model organisms, such as mice, rats, dogs, and swine (Fig. 3). The results of cross-species analysis showed that the meta-signature was significantly enriched in microarray studies on subcutaneous adipose tissue in a mouse model of IR induced by feeding a high fat diet (HFD) (Fig. 3A). Notably, the result was consistent in microarray studies using other types of WAT, epididymal and perigonadal fat tissues, from a HFD-induced IR model and ob/ob mice, respectively ( Fig. 3B and C). In addition, the meta-signature gene sets were significantly enriched in microarray data from rat abdominal fat (Fig. 3D), dog subcutaneous fat (Fig. 3E), and an in vitro IR model derived from 3T3-L1 cells treated with tumor necrosis factor α (TNF-α) (Fig. 3F). Our human IR meta-signature was highly enriched in the three model organisms except for swine, because the fat metabolism in pig is profoundly different from those in other model organisms. Therefore, we excluded the swine data from our analysis 12 .
Next, we identified the leading-edge subset to determine core enrichment genes in all IR models. Notably, among the 1,413 genes which made up the meta-signature, only 12 (10 up-regulated and 2 down-regulated) genes were consistently enriched in all IR models we used in Fig. 3 (Table 2). These results indicate that the human IR meta-signature obtained in our study is a set of robust genetic markers of IR in various types of WAT from different in vitro and in vivo IR models.
Applying the meta-signature to pharmacogenomics and gain-or loss-of-function studies. Several recently published studies directly compared molecular signatures of disease with the molecular effects of drug target genes using an inverse correlation approach for successful drug repositioning and disease therapies [13][14][15][16][17][18][19][20][21] . Thus, we hypothesized that our IR meta-signature may also be inversely enriched in pharmacogenomics datasets for improvement of IR. We carried out inverse enrichment analysis using pharmacogenomics datasets for thiazolidinediones (TZDs) and metformin, which are approved for the treatment of T2DM by the Food and Drug Administration. Metformin is a biguanide that is recommended by the American Diabetes Association as a first-line agent to treat T2DM patients 22,23 and TZD is another class of anti-diabetic medication which includes pioglitazone and rosiglitazone, and troglitazone 24 . Consistent with our hypothesis, our meta-signature gene sets were inversely enriched in microarray datasets from IR patients treated with TZD drugs (Fig. 4A), and a mouse IR model treated with pioglitazone ( Fig. 4B) and with metformin (Fig. 4C). Additionally, the meta-signature gene sets were inversely enriched in microarray datasets from luteolin-treated mice (Fig. 4D), interleukin-37-overexpressing transgenic mice (Fig. 4E), and double knock-down of transforming growth factor beta-like stimulated clone (TSC) 22 D4 and lipocalin (LCN) 13 in db/db mice (Fig. 4F). These results also correspond with those of recent studies showing that the flavonoid luteolin, the anti-inflammatory agent interleukin-37, and double inhibitions of TSC22D4 and LCN13 ameliorate IR [25][26][27] . Overall, our data suggest that the meta-signature is a set of robust genetic markers which can be used as a screening tool to determine the effects of putative candidate and/or new drugs on improving IR condition in WAT.
Identification of potential drug targets. The expression levels of a subset of meta-signature genes changed with the treatment of TZDs and metformin ( Fig. 4A-C) which have different mechanisms of action against IR and T2DM 24,28 . Therefore, we identified overlapping genes that were co-regulated in both TZD-treated humans (Fig. 4A) and metformin-treated mice (Fig. 4B) to identify potential drug target genes that could ameliorate IR. We carried out co-regulated gene analysis using the core enrichment genes of metformin and TZDs from the GSEA results, which were the leading-edge subset within our meta-signature. The leading-edge subset derived from the GSEA results can be interpreted as the set of core enrichment genes 29 . Among the meta-signature, a total of 211 genes identified across the species were co-regulated by metformin and TZDs, and these genes made up the drug signature which is a set of genes related to drug-induced gene expression changes 30 (Fig. 5A  GO terms for biological processes were significantly enriched in the defense response (p = 4.84e-10), inflammatory response (p = 2.16e-08), and cellular signaling cascade (p = 2.58e-06), while the enriched GO terms for the 78 down-regulated drug-signature genes were response to insulin stimulus (p = 5.32e-07), response to hormone stimulus (p = 6.82e-05), glucose catabolic process (p = 2.83e-03), and oxidation reduction (p = 3.68e-03) (Fig. 5C). Collectively, our data suggest that the cross-species drug signature, which was made up of genes that were co-regulated in response to metformin and TZDs comprise a subset of genes that are potential drug targets for improving IR status among the meta-signature.
Clinical relevance of expression levels of the drug signature in IR and obesity. Our results suggested that the genes in the cross-species drug signature may contain direct/indirect results of improved IR condition, target genes of TZD-and metformin-treatments, or potential target genes for novel therapeutics. In any case, we hypothesized that the expression levels of the genes in drug signature may significantly correlate with clinical traits involved in IR. Because BMI, which is used as an indicator of obesity, is also an important factor in the pathogenesis of IR and T2DM 31 (Supplementary Figure S3A), homeostasis model assessment2-IR (HOMA2-IR) score, which quantifies IR, and BMI were used to identify the relationship between the expression levels of 211 genes in cross-species drug signature and clinical traits. To verify this hypothesis, the expression levels of all 211 genes in GSE32512, which contained gene expression data from subcutaneous fat biopsies of 200 human subjects with clinical traits, were used. Among the 211 genes which made up our drug signature, the expression levels of 195 genes (92.4%) were significantly correlated with both HOMA2-IR and BMI (p < 0.05) (Fig. 6). Genes that were highly correlated with BMI and HOMA2-IR had significantly higher Z-scores in the meta-analysis (Supplementary Figure S3B). The results suggest that the expression levels and Z-scores of the drug signature are strongly associated with obesity and IR.
Network construction and validation of drug signature. Since the expression levels of the genes in the drug signature were significantly related to IR, protein-protein interaction (PPI) networks were constructed to understand the interactions of the biological processes related to the drug signature. From the drug signatures, 42 genes (22 up-and 20 down-regulated genes) which were highly correlated (|r| > 0.4) with both BMI and HOMA2-IR (Fig. 6) were used to establish PPI networks. Eighteen genes (network degree ≥9) among drug signature were shown in the PPI networks, contacting the up-regulated drug-signature genes, such as PSAP (prosaposin, degree = 40), GRN (granulin precursor, degree = 34), and S100A4 (S100 calcium binding protein A4, degree = 31) and the down-regulated drug-signature genes, such as CYB5A (cytochrome b5 type A (microsomal), degree = 19), S100A1 (S100 calcium binding protein A1, degree = 16), and MCCC1 (methylcrotonoyl-CoA carboxylase 1, degree = 10) (Fig. 7A). The expression levels of the genes showed either a positive or negative correlation with HOMA2-IR (Figs 7B and Supplementary Figure S4). Notably, among the 12 core meta-signatures (Table 2), proteins encoded by 4 genes (LBP, S100A4, CSF1R, and MCCC1) were in the network (Fisher's exact test, p < 0.01). We randomly selected 8 genes, S100a4, purinergic receptor P2X 7 (P2rx7), Psap, Grn, S100 calcium binding protein A1 (S100a1), Cyb5a, pyruvate dehydrogenase kinase 2 (Pdk2), and Mccc1 and quantitative PCR (qPCR) was performed to validate the expression levels obtained from PPI networks using mouse epididymal fat treated with metformin (Fig. 7C). The results indicated that the expression levels of S100a4, P2rx7, Psap, and Grn in epididymal fat were significantly up-regulated in HFD-fed mice, while those of S100a1, Cyb5a, Pdk2, and Mccc1 were significantly down-regulated.

Discussion
In this study, we show that meta-analysis provides valuable information for generalizing the results of multiple studies and investigating the underlying mechanisms of IR in human WAT. One of the major advantages of meta-analysis is that it allows us to overcome the shortcomings of the limited number of available samples in a single study by increasing statistical power and determine whether there is a consensus in transcriptome changes (Fig. 1). For the meta-analysis techniques used in this study, the method of Choi et al. 32 described by the GeneMeta R package was first chosen for a two-class comparison, disease vs. normal state 33 . Because the meta-signature based on gene-wise combined Z-scores can directly be applied to GSEA using the GSEAPre-ranked method. These methods are very useful tools for conducting meta-analysis and functional enrichment analysis. Furthermore, GSEA provides built-in tools for conversion between several organisms' microarray identifiers (i.e., platform probe identifiers) to human gene symbols, which can then be easily applied in cross-species analysis. WAT was initially regarded as a passive energy repository, but now is considered to be an endocrine organ that secretes adipokines such as leptin, adiponectin, TNF-α, and interleukin-6 (IL-6), which are common mediators of adipose tissue, inflammation, and immunity 34 . In humans and mice, obesity changes adipose tissue metabolism and increases accumulation of macrophages and other immune cells in WAT 35 . Macrophages are then stimulated by monocyte chemoattractant protein-1 to infiltrate WAT 36 . In the state of obesity, excessive energy triggers NF-κB signaling pathways and may disrupt insulin signaling pathway via induction of TNF-α secretion by WAT, which in turn causes IR 37 . As expected, our results indicated that the up-regulated human IR meta-signature was significantly enriched in immune system activity and inflammation GO terms ( Fig. 2A) as well as TNF-α signaling via NF-κB (data not shown). NF-κB can promote cell progression and oncogenesis and its signaling pathway is fundamental to the induction and maintenance of EMT 38 . We found that EMT-related genes are significantly associated with the up-regulated meta-signature (Fig. 2B), suggesting that macrophage infiltration in WAT may be associated with EMT because TNF-α signaling via NF-κB is closely related to inflammation and macrophage infiltration.

Gene ID Symbol
Gene name Z-score FDR  Table 2. The list of core meta-signatures consistently enriched across all models in Fig. 3. The genes in the network in Fig. 7A were shown in bold. Although we performed our meta-analysis using gene expression datasets in subcutaneous WAT from human IR patients, cross-species analysis confirmed that our human IR meta-signature can be applied to animal IR or T2DM model datasets derived from overall WAT, including visceral adipose tissue as well as an in vitro IR model (Fig. 3). Our meta-signature can also be utilized to perform inverse correlation approaches using drug candidates and gain-or loss-of-function microarray datasets for attenuating T2DM or IR (Fig. 4). Even though subcutaneous and visceral fat is known to have different roles in IR 34 , our data showed that the gene expression patterns in the two types of fat are similar. Because the enrichment score of mouse IR model data was the highest among model organisms (Fig. 3), the mouse IR model can be optimally applied in human IR studies. Overall, our results sugget that the human IR meta-signature might be utilized as a pre-screening tool for developing diabetes treatments or insulin-sensitizing drugs.
The American Diabetes Association recommends metformin as a first-line agent to treat T2DM patients 22,23 ; however, the mechanism of action of metformin is not completely understood 39 . Recent studies suggested that the anti-diabetic effect of metformin may involve 5′ AMP-activated protein kinase (AMPK)-dependent mechanisms, but there remains some controversy as to whether the activation of AMPK by metformin is direct 40,41 . TZDs were approved for treatment of T2DM because they decrease IR and are known to be PPARγ activators 24 . Even though some members of TZDs may have toxicity 42,43 , there already is ample evidence that metformin-and/or TZD-treatment alter whole body or adipose tissue insulin sensitivity in humans as well as in mice 24,28,44 . Together, these findings show the necessity for a deeper understanding of the underlying mechanisms of action of the drugs currently in use. Through identification of the cross-species drug signature that is co-regulated by TZDs (Fig. 4A) as well as metformin (Fig. 4B) and through considering clinical relevance via correlations between the expression of genes in the drug signature and HOMA2-IR, we were able to determine gene sets which are responsive to IR condition in overall WAT. Although we could not conduct a correlation test between gene expression responses and the efficacy of the metformin/TZD due to the insufficient clinical information on the data, we tried to validate the correlation between drug signature expression responses and HOMA2-IR using GSE32512 which is the largest dataset with clinical information (Fig. 6).
Consistent with the results of functional enrichment analysis using up-regulated meta-signature genes ( Fig. 2A), we found that the up-regulated drug signature is also closely related to inflammation and immune response (Fig. 5). Among 8 selected genes from 18 genes in the network (degree ≥9) (Fig. 7), Grn is known to be an important regulator of obesity and IR 45 . Knock-out mice of Grn were prevented from becoming IR induced by HFD, whereas the expression levels of the protein and mRNA are significantly increased in WAT derived from HFD-fed mice. Moreover, treating pioglitazone to ob/ob mice reduced the expression level of Grn, suggesting that GRN can directly regulate IR condition and is a target of a TZD. Among 8 genes shown in Fig. 7C, PDK2, CYB5A, and MCCC1 are known to have direct relationships with IR [46][47][48][49] . Additionally, the cross-species drug signature in the PPI network included calcium-binding S100 family proteins, among which S100A4 and S100A1 had indirect relationships with IR 50-54 and the expression patterns were validated by qPCR (Fig. 7C).
We would like to comment that our study is not without limitations. First, we could not control all the variables that could affect adipose gene expression apart from IR, but we tried to minimize the possible contribution of obesity on the signatures by including lean and obese subjects in both insulin sensitive and IR datasets. The expression levels of the genes in our drug signature were compared with both BMI and HOMA2-IR only, because we were more focused on transcriptome levels in WAT than other clinical characteristics such as age, gender, and serum levels of metabolites. Second, we assessed the validity of our results with minimal experimentation, thus more extensive experiments may be necessary to validate our study.
Despite these limitations, however, we strongly believe that our meta-analysis successfully generalized the results of seven studies and identified robust genetic markers of IR. The results may also provide a starting point for identifying potential therapeutic targets. Using meta-analysis based on gene expression microarrays in subcutaneous WAT from human IR patients, we herein showed that the results of our meta-analysis, meta-signature, are robust genetic markers for in vitro and in vivo IR models in overall WAT. Among the meta-signature, we identified the cross-species drug signature that are co-regulated by both metformin and TZDs and are closely associated with both HOMA2-IR and BMI, suggesting that our results can be a good starting point for identifying genetic markers, potential drug targets, or therapeutic targets for IR.

Methods
Data collection for meta-analysis. The NCBI Gene Expression Omnibus (GEO) was queried for one-channel (one color) microarray gene expression datasets that contain human IR subjects who had not been given drug treatment before biopsy sampling. If the IR status of a given individual in the datasets was not given, HOMA2-IR based on fasting glucose and insulin levels was calculated using the HOMA2-IR calculator (https:// www.dtu.ox.ac.uk/homacalculator/). The cutoff value for classification of IR subjects was HOMA2-IR ≥1.7, based on a previous study 55 . To make a clear distinction between IR and insulin sensitive (normal), subjects with HOMA2-IR <1 were regarded as insulin sensitive.
Microarray preprocessing and meta-analysis. Affymetrix microarray datasets were preprocessed and normalized following Jung et al. 56 . For Illumina and Agilent datasets, quantile normalization was performed using the limma R package 57 . After preprocessing, the datasets were combined using probe's Entrez ID and ComBat 58 function and the surrogate variable analysis (SVA) R package 59 was applied to adjust for batch effects. Finally, a meta-analysis was carried out using the random effects model (REM) method in the GeneMeta R package proposed by Choi et al. 32 to compute Z-scores and the false discovery rate (FDR).
Enrichment and cross-species analysis using gene set enrichment analysis. Functional enrichment analyses were performed based on Z-scores using the GSEAPreranked method in gene set enrichment analysis (GSEA) 29,60 . Gene ontology (GO) gene sets in the Molecular Signatures Database (MSigDB) 61 were applied to GSEA. The results of GO enrichment analysis were visualized using an Enrichment Map 62 with default settings.  as a pre-ranked list for cross-species analysis. For co-expression analysis using thiazolidinedione (TZD)-and metformin-treated microarray datasets, leading-edge subset derived from GSEA results were used. This subset of genes made the most contribution to the enrichment results 29 . In addition, the database for annotation, visualization and integrated discovery (DAVID) 64,65 tool was used to perform functional enrichment analysis for selected genes.
Epididymal tissue sampling and RNA extraction. Mouse epididymal adipose tissue samples were obtained from a previous study conducted at Kyung Hee University with the approval of the Institutional Animal Care and Use Committee (IACUC) of Kyung Hee University (IACUC approval No. KHUASP(SE)-15-069). Experimental procedures were carried out in accordance with relevant guidelines and regulations. During the study, five-week-old C57BL/6 mice (Envigo, Indianapolis, IN, USA) were housed in a temperature-(22 ± 2 °C) and humidity-controlled (50 ± 5%) room with a 12/12 h light/dark cycle and free access to food and water. Tissue samples used were from mice in three different treatment groups: a regular diet-fed (RD) group, a high fat diet-fed (HFD) group, and an HFD plus 300 mg/kg of metformin treatment (MET) group. The RD (10% kcal from fat; D12450B) and HFD (60% kcal from fat; D12492) were purchased from Research Diets (New Brunswick, NJ, USA). Mice were orally administered with either vehicle or metformin once a day for 12 weeks. Epididymal adipose tissue was dissected from the mice after an overnight fast, snap-frozen in liquid nitrogen, then stored at −75 °C until analysis. Total RNA was extracted from epididymal adipose tissue of the RD, HFD, and MET groups using an RNeasy Mini kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions.
Microarray and network analysis. The purity and integrity of total RNA extracted from epididymal adipose tissue were evaluated by the ratio of absorbance at 260 and 280 nm and by using an Agilent 2100 bioanalyzer (Agilent Technologies, Palo Alto, CA, USA), respectively. The RNA samples from each group were subjected to global gene expression analysis using Affymetrix GeneChip Mouse Gene 2.0 ST arrays (Affymetrix, Santa Clara, CA, USA). The raw and processed microarray data reported in this analysis is deposited in the GEO database as GSE102540. The microarray datasets were preprocessed and normalized using the RMA method. A biological protein-protein interaction (PPI) network was constructed by NetworkAnalyst (http://www.networkanalyst. ca/) 22,23 using the Z-scores of our meta-signature and the interactome database from InnateDB 24 .