Elevated circulating follistatin associates with an increased risk of type 2 diabetes

The hepatokine follistatin is elevated in patients with type 2 diabetes (T2D) and promotes hyperglycemia in mice. Here we explore the relationship of plasma follistatin levels with incident T2D and mechanisms involved. Adjusted hazard ratio (HR) per standard deviation (SD) increase in follistatin levels for T2D is 1.24 (CI: 1.04–1.47, p < 0.05) during 19-year follow-up (n = 4060, Sweden); and 1.31 (CI: 1.09–1.58, p < 0.01) during 4-year follow-up (n = 883, Finland). High circulating follistatin associates with adipose tissue insulin resistance and non-alcoholic fatty liver disease (n = 210, Germany). In human adipocytes, follistatin dose-dependently increases free fatty acid release. In genome-wide association study (GWAS), variation in the glucokinase regulatory protein gene (GCKR) associates with plasma follistatin levels (n = 4239, Sweden; n = 885, UK, Italy and Sweden) and GCKR regulates follistatin secretion in hepatocytes in vitro. Our findings suggest that GCKR regulates follistatin secretion and that elevated circulating follistatin associates with an increased risk of T2D by inducing adipose tissue insulin resistance.

Follistatin is a secreted protein that is expressed in almost all tissues. It is linked to metabolic diseases 1,2 , with elevated plasma levels in patients with type 2 diabetes (T2D) 1 . Evidence suggests that follistatin has multiple auto- and paracrine functions in various tissues. Follistatin binds and neutralizes TGF-β family members 3 . Follistatin is essential for the formation and growth of muscle fibers 4,5 and is involved in the development of muscle fiber hypertrophy [6][7][8][9][10][11] . Furthermore, follistatin was found to enhance thermogenic gene expression in differentiated mouse brown adipocytes 12 . Short-term follistatin treatment reduced glucagon secretion from islets of Langerhans, whereas long-term follistatin treatment prevented apoptosis and induced proliferation of rat β cells 13 . 
Local overexpression of follistatin in the pancreas from diabetic mice resulted in increased serum insulin levels 14 . In humans, circulating follistatin derives predominately from the liver and its expression and secretion are upregulated by a high glucagon-to-insulin ratio 13 .
In mice 15 , follistatin was identified as a mediator of diabetes by promoting white adipose tissue insulin resistance. In hyperglycemic and high-fat-fed obese mice, disruption of follistatin restored glucose tolerance, white adipose tissue insulin signaling and suppression of hepatic glucose production by insulin. In obese individuals with diabetes who underwent gastric bypass surgery, serum follistatin decreased in parallel with HbA 1c levels 15 . Circulating follistatin levels were also found to be increased in individuals with T2D, associating positively with HbA 1c and fasting blood glucose levels. However, follistatin is unaffected by acute alterations in blood glucose and blood insulin concentrations during an oral glucose tolerance test (OGTT) 1 . So far it is unknown whether elevation of plasma follistatin associates with the risk of T2D, independently of established diabetes risk markers. Furthermore, it needs to be established whether and to what extent genetics explains the variability of circulating follistatin levels and whether the mechanisms of follistatin to induce insulin resistance in mice may also be operative in humans.
In this work, we show that circulating follistatin associates with an increased risk of T2D, independently of established risk factors. A possible mechanism may involve an effect of follistatin to induce adipose tissue insulin resistance, resulting in increased adipocyte free fatty acid (FFA) release, which also promotes nonalcoholic fatty liver disease (NAFLD). Secretion of follistatin from the liver cells is regulated by GCKR, in addition to insulin and glucagon.

Results
Plasma follistatin levels associate with the risk of T2D. We evaluated the association of circulating follistatin levels at baseline with incident T2D in the Malmö Diet and Cancer Cardiovascular Cohort (MDC-CC). Of 4195 participants, 577 (13.75%) individuals developed T2D during a mean (±SD) follow-up time of 19.07 (±5.09) years (Table 1, Supplementary Fig. 1, see Supplementary Information for cohort details). Subjects who developed T2D during follow-up had higher plasma follistatin levels at baseline, compared to those who did not progress to diabetes (Table 1). Circulating follistatin associated with an elevated risk of diabetes: the hazard ratio (HR) per 1-SD increase in follistatin levels for T2D was 1.29 (CI: 1.19-1.40, p < 0.001), adjusted for age and sex, and 1.24 (CI: 1.04-1.47, p < 0.05), adjusted for multiple risk factors (Table 2). When subjects were divided into quartiles (Q) based on plasma follistatin level, the HR for incident diabetes for Q4 vs. Q1 was 1.97 (CI: 1.55-2.50, p for trend < 0.001), adjusted for age and sex, and 1.35 (CI: 1.04-1.74, p for trend = 0.02), adjusted for multiple risk factors (Table 2, Supplementary Table 1 and Fig. 1). Plasma follistatin levels correlated with measures of glucose tolerance, insulin secretion, and insulin sensitivity, before and after adjustment for multiple risk factors (Supplementary Table 2). The C-statistic, which is a measure of the adequacy of fit for binary outcomes, was 0.5419 (CI: 0.5184-0.5655) for model 1, adjusted for age and sex, and increased significantly to 0.5832 (CI: 0.5603-0.6060) when follistatin was added to the model (difference in C-statistics, 0.041; CI: 0.0187-0.0637; p < 0.001). 
The C-statistics value adjusted for multiple risk factors was 0.7701 (CI: 0.7510-0.7892) and increased to 0.7718 (CI: 0.7527-0.7909), when follistatin was added (difference in C-statistics, 0.0017; p = 0.113), which may suggest that follistatin does not improve diabetes prediction on its own beyond established risk factors.
We also analyzed an independent cohort, IMI-DIRECT-METSIM (see Supplementary Information for cohort details). Among 1079 subjects, 53 (4.91%) developed T2D during the 4-year follow-up period. Individuals who developed diabetes had higher baseline plasma follistatin levels (Table 3). The HR per SD of plasma follistatin for diabetes was 1.35 (CI: 1.13-1.61; Table 4). Thus, our findings in longitudinal cohorts indicate that circulating follistatin associates with the risk of developing T2D.
Relationships of follistatin with adipose tissue insulin resistance and related traits. In the MDC-CC cohort, we observed an association between elevated follistatin and insulin resistance (Supplementary Table 2). In animals, follistatin was found to contribute to diabetes by promoting adipose tissue insulin resistance and attenuating insulin-inhibited white adipose tissue lipolysis, resulting in increased circulating levels of free fatty acids (FFAs) 15 . Here we investigated relationships of circulating follistatin with adipose tissue insulin resistance and related traits in humans. In subjects of the Tübingen Diabetes Family Study (TDFS) cohort without diabetes (n = 210, see Supplementary Information for cohort details), plasma follistatin levels correlated positively with FFAs, measured before and during the OGTT, and visceral fat mass. Importantly, the relationships of plasma follistatin with fasting (std. β = 0.17, p = 0.009), 60-minute (std. β = 0.26, p < 0.0001), and 120-minute (std. β = 0.27, p < 0.0001) FFAs were independent of age, sex, and total body fat mass. Furthermore, plasma follistatin levels correlated negatively with adipose tissue insulin sensitivity and leg fat mass (Supplementary Tables 4, 5 and Fig. 2a). Elevated follistatin levels also associated with higher liver fat content, after adjustment for age, sex, and total body fat mass. Furthermore, circulating follistatin levels, adjusted for age, sex, and total body fat mass, were found to be elevated in patients with nonalcoholic fatty liver disease (NAFLD, Fig. 2b). The relationship of follistatin levels with liver fat content disappeared after further adjustment for visceral fat mass and leg fat mass, or for adipose tissue insulin sensitivity (Supplementary Table 6). 
Thus, these findings suggest that circulating follistatin independently associates with adipose tissue insulin resistance in humans, and that the relationship of follistatin with fatty liver is possibly explained by effects of follistatin on insulin sensitivity of adipose tissue in the leg and the visceral compartment.
Effect of follistatin on insulin-mediated suppression of lipolysis in adipocytes. The effect of follistatin on insulin-mediated suppression of lipolysis was investigated in vitro. Human adipocyte-derived stem cells were differentiated into adipocytes, and treated with 0, 0.3, 3, or 30 µg/mL follistatin for 2 h before exposure to insulin for 3 h. Insulin-inhibited lipolysis was attenuated by follistatin dose-dependently, measured by stepwise increase of glycerol release, reflecting FFA release from adipocytes into the culture medium from the breakdown of stored triglycerides (Fig. 2c).

Regulation of follistatin secretion in the human hepatocyte cell line HepG2 by GCKR. It has been previously shown that the liver is the major site of follistatin secretion and that follistatin release from the liver is increased by glucagon and inhibited by insulin 13 . Here we studied how GCKR regulates liver follistatin secretion in the human hepatocyte cell line HepG2. GCKR forms a tight complex with GCK in the nucleus, and dissociation of the GCK-GCKR complex leads to increased GCK translocation from the nucleus to the cytoplasm, which regulates liver cell glucose uptake and glycolysis [18][19][20] . Furthermore, GCKR rs1260326 (p.P446L) has a decreased degree of nuclear localization, enabling to sequester GCK and to directly interact with GCK, which elevates hepatic glucose uptake and disposal, by increasing active cytosolic GCK 19 . Here we applied GCKR-GCK transfection in combination with the chemical disruptor AMG-3969 to model the proposed effects of the functional SNPs that we identified in GWAS, i.e., effects on GCKR-GCK binding and GCKR expression.
HepG2 cells were transfected with GCK or co-transfected with GCK and GCKR expressing plasmids (Supplementary Fig. 3).
Forty-eight hours after transfection, the cells were treated with or without AMG-3969, a GCK-GCKR complex disruptor that promotes translocation of dissociated GCK from the nucleus to the cytoplasm 21 . The addition of AMG-3969 induced GCK translocation and produced a localization pattern similar to that of free GCK. The nuclear localization of GCK was highest in the absence of the disruptor (Supplementary Figs. 4,5). In the presence of the GCK-GCKR complex and the complex disruptor AMG-3969, glucagon increased follistatin secretion by 40%, compared to control (Fig. 4a). This increase was inhibited by insulin (Fig. 4b). Transfection with GCK alone, or GCK-GCKR co-transfection without AMG-3969, had no effect on follistatin secretion. Thus, our results support the notion that follistatin secretion is regulated by GCKR in addition to insulin and glucagon.

Discussion
In this study, we observed that plasma follistatin levels were elevated many years prior to the onset of T2D, and that circulating follistatin at baseline associated with incident T2D, independently of established diabetes risk markers. We also found that follistatin associated with adipose tissue insulin resistance and related traits, and that follistatin attenuated insulin-mediated suppression of lipolysis in adipocytes. Furthermore, we performed GWAS and identified variants in GCKR as genetic regulators of circulating follistatin levels (summarized in Fig. 5).
In the MDC-CC cohort, elevated plasma follistatin levels associated with the incidence of diabetes up to 19 years before the onset of the disease. While it is important that this association was found during a long period of follow-up, parameters other than elevated follistatin levels, which may become relevant just prior to the onset of diabetes, or also associate with elevated follistatin levels, may contribute to this relationship. Therefore, we analyzed an independent cohort, IMI-DIRECT-METSIM, with 4-year follow-up. The independent association between elevated follistatin and an increased risk of diabetes was also confirmed in this shorter follow-up period. We also included fasting glucose and CRP in the models to ensure that factors in the subclinical phase of diabetes development were not responsible for the raised levels of follistatin. These data suggest that increased circulating follistatin may serve as a marker of diabetes risk up to 19 years prior to the manifestation of the disease. However, it is worth noting that our C-statistics analysis also suggests that follistatin on its own may not improve diabetes prediction beyond established risk factors.
Previous investigations have presented rather contradictory evidence on the effects of follistatin under physiological and pathophysiological conditions [13][14][15] . Nevertheless, the physiological regulation of follistatin secretion and pathological effects of abnormally elevated follistatin may represent different avenues. Under normal physiological conditions, disruption of the GCKR-GCK complex, triggered by e.g., glucose and fructose 22 , stimulates follistatin secretion, which is regulated by insulin and glucagon. However, in an insulin-resistant state, attenuated insulin signaling in the liver may lead to elevated follistatin secretion as previously shown in mouse models 15 . Abnormally elevated follistatin secretion may further exacerbate liver insulin resistance by promoting FFA production from adipose tissue and ultimately lead to NAFLD, possibly aggravating diabetes. The previous finding that follistatin increases beta-cell proliferation during normal physiological conditions 13 is perfectly in line with the need for increased insulin secretion to compensate for insulin resistance, and furthermore raises the intriguing possibility that follistatin plays a key role in mediating this signal from the liver to the pancreatic beta cell. Further studies are needed to understand the role of follistatin in the cross-talk between liver and pancreas in normal, as well as pathophysiological conditions. After investigating mechanisms by which follistatin may be involved in the pathophysiology of T2D, our data suggest that the effect of follistatin in promoting adipose tissue insulin resistance in mice 15 may also be operative in humans. We found that circulating follistatin correlated negatively with adipose tissue insulin sensitivity and positively with circulating FFA levels and liver fat content. 
Interestingly, the relationship of follistatin with liver fat was affected by adjustment for the percentage of leg fat mass, which suggests that follistatin may impair insulin sensitivity of adipose tissue, predominantly in leg fat, the fat compartment that is considered to be a reservoir for fatty acid storage 23,24 . In this respect, it is important to note that in our study adipose tissue insulin sensitivity correlated positively with the percentage of leg fat mass, but negatively with visceral fat mass. Consequently, it could be speculated that follistatin-induced adipose tissue insulin resistance shifts fatty acids not only to the visceral fat depots, but also to the liver and, thereby, aggravates NAFLD. This view was further supported by our in vitro adipocyte data that excess follistatin attenuated insulin-inhibited lipolysis and elevated FFA release, which in turn may predispose to NAFLD. Increased FFAs per se also stimulate glycogenolysis and gluconeogenesis and, thereby, contribute to increased risk of T2D 25 . However, it is not fully understood how follistatin regulates lipolysis in adipocytes. It has been shown in a previous investigation in mice that follistatin suppresses the interaction between insulin receptor substrate 1 (Irs1) and the p110α subunit of PI3-kinase in adipose tissue of LDKO mice, which was consistent with reduced AKT activation, attenuated phosphorylation, and inactivation of hormone-sensitive lipase (HSL) 15 . The exact mechanisms by which follistatin inhibits the association between Irs1 and p110α require further investigation. Furthermore, in our in vitro experiments on human adipocytes, the concentrations of follistatin were higher than those in the serum of healthy subjects during an overnight fast. 
Nevertheless, it has also been shown that serum follistatin levels may vary considerably in different situations, such as pregnancy and in various disease states 26 , as well as under a mixed meal test, exercise, and prolonged fasting [26][27][28] . We may speculate that under certain conditions higher follistatin concentrations may also be present in vivo. Beyond our previous and present findings on the impact of follistatin on the regulation of glucose and lipid metabolism in animals and in vitro, Mendelian randomization (MR) analysis would be an instrumental approach to further investigate causal relationships of follistatin with incident diabetes. Using variants of GCKR in MR analysis, which we identified to be most strongly associated with follistatin levels, is nevertheless very problematic, because these variants of GCKR were found to be the most pleiotropic variants in our exome sequencing study of close to 10,000 individuals (unpublished data in the DIRECT-METSIM study). In this respect, the variant rs780094 in GCKR, which most strongly associated with follistatin levels in our study, was found to associate with elevated fasting and random glycemia and increased risk of T2D in several large studies. However, it also strongly associated with lower CRP, triglyceride, LDL cholesterol levels, and other diabetes risk parameters (https://t2d.hugeamp.org/variant.html?variant=rs780094).
Our GWAS analyses in two independent cohorts identified the SNP rs1260326 in the GCKR gene to strongly associate with plasma follistatin levels. This SNP has also been shown in previous studies to be associated with more than 25 metabolic traits, including T2D risk, NAFLD, fasting insulin, total cholesterol, as well as circulating levels of various metabolites 29 . It is noteworthy that functional variants in the GCKR gene have been associated with opposite effects on fasting plasma triglyceride and glucose concentrations 16 . Experiments in diabetic animals also showed that the small molecule disruptor of the GCKR-GCK complex AMG-1694 lowered blood glucose, but increased triglyceride levels 21 . Based on our findings, this phenomenon may now be explained by the fact that the GCKR-GCK disruptor may have increased liver cell follistatin secretion, which in turn, may have promoted adipose tissue insulin resistance and an increase of triglyceride levels.
The molecular mechanisms by which the GCK-GCKR system regulates follistatin secretion in hepatic cells are not known. GCKR regulates GCK, playing an important role in regulating the rate of glucose metabolism in the hepatocyte. Thus, we can hypothesize that the association of the GCKR locus with follistatin levels may be explained by the effects of GCKR regulatory SNPs on GCKR function in the liver. To examine how GCKR-GCK may regulate the secretion of follistatin from the liver, we employed in vitro experiments in the human hepatocyte cell line HepG2. Although the HepG2 cell line shows altered metabolic activity compared to primary hepatocytes, the cell line serves as an acceptable model to study how GCKR-GCK nuclear dissociation and GCK translocation to the cytoplasm regulate follistatin secretion. Transfection of GCK alone did not increase follistatin secretion in HepG2 cells. This may be explained by the very low endogenous expression of GCKR in HepG2 cells, in the absence of which GCK is decreased, likely through degradation 30-32 . In the liver, GCK facilitates the storage of glucose as glycogen, which is under the control of GCKR via binding and subsequent inactivation of GCK in the nucleus of the hepatocyte. Disruption of the GCKR-GCK complex enables the translocation of GCK from the nucleus to the cytoplasm where it facilitates the conversion of glucose to glucose-6-P 33 . In the postprandial state with increased glucose levels, the GCKR-GCK complex is disrupted and GCK is released to the cytoplasm where it stimulates glycolysis, glycogen formation, and de novo lipogenesis, while in the fasting state, GCKR suppresses GCK and subsequently hepatic glycolysis. Notably, we provide a mechanism linking glucagon- and insulin-regulated follistatin secretion to GCKR. Here we observed that glucagon was only capable of stimulating follistatin secretion from the liver cells upon GCKR-GCK disruption. 
This may induce a vicious cycle in that both high glucose and high glucagon increase follistatin secretion, resulting in increased adipose tissue lipolysis and circulating FFAs. Furthermore, we also observed that insulin suppresses follistatin secretion independently of glucagon-GCK-GCKR signaling. Nevertheless, as the GCKR variants are associated with multiple molecular traits, including multiple other protein markers, at this stage we do not know if the malfunction of GCKR is peculiar to follistatin, or if other proteins are involved.
The GCKR gene is in close proximity to ZNF512, C2orf16, and GPN1 on chromosome 2, which also contain SNPs showing significant association with plasma follistatin levels. Other SNPs showing significant association with plasma follistatin include exm1435650 in the ZNF333 gene (p = 1.98E-11), exm831645 in ADAMTS14 (p = 5.50E-11), exm373967 in TMEM44 (p = 2.05E-09) and exm646262 in CUX1, which have been shown to be genetic factors influencing elevated markers of death receptor-activated apoptosis associated with increased diabetes and cardiovascular disease risks 34 . The effects of these genes that are strongly associated with circulating follistatin require further investigation.
In conclusion, elevated circulating follistatin associates with an increased risk of incident T2D, independently of established diabetes risk markers. Among the mechanisms that may explain this relationship in humans, we found supporting evidence that follistatin induces insulin resistance in adipose tissue, thereby promoting adipose tissue lipolysis and NAFLD. Furthermore, we found that the GCK-GCKR complex may be involved in the regulation of plasma follistatin levels in humans. Thus, follistatin may be an attractive target for therapeutic interventions to prevent T2D and NAFLD.

35 . This method is based on a matched pair of antibodies linked to unique oligonucleotides that bind to the respective protein target; the resulting DNA amplicon can subsequently be quantified by quantitative real-time PCR. Specifically, follistatin was measured using a Proseek Multiplex CVD I 96 × 96 Kit (Olink Bioscience, Uppsala, Sweden) assay based on the Proximity Extension Ligation technology on a Fluidigm BioMark HD real-time PCR platform in 54 chip runs 35 . The lower and upper limits of the follistatin assay were 1.91 and 62,500 pg/mL, respectively. The samples were consecutively aliquoted on plates, regardless of future diabetes status. The plates were analysed in random order. Follistatin concentrations are presented as normalized protein expression (NPX) arbitrary units (AU) on a log2 scale calculated from Ct values, which were converted into the linear scale (2^NPX = linear NPX). The NPX measurement provides relative quantification, and samples can be compared within a cohort but not between two separate cohort studies with different scales, as in the MDC-CC and IMI-DIRECT-METSIM cohorts, where epidemiological findings based on follistatin measurements were robustly reproduced independently. Furthermore, the PEA offers highly specific and sensitive large-scale measurement and has been shown to be comparable to the ELISA method 36 . 
The intra-assay coefficient of variation (CV) was 9%, and the inter-assay CV was 15%. CV was calculated per assay using the assumption of a log-normal distribution. The CV was then averaged across panels (see www.olink.com for details). Because of the smaller sample size in the TDFS cohort, follistatin was measured there with a human Follistatin ELISA kit (DFN00, R&D Systems).
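The NPX-to-linear conversion and the log-normal CV described above can be illustrated with a minimal sketch. The function names and the example SD value are illustrative assumptions and are not part of the Olink software; the CV expression is the standard one for a log-normally distributed variable and may differ in detail from the vendor's implementation.

```python
import math


def npx_to_linear(npx: float) -> float:
    """Convert a log2-scale NPX value to the linear scale (2^NPX)."""
    return 2.0 ** npx


def lognormal_cv(sd_log2: float) -> float:
    """CV of a log-normal variable, given its SD on the log2 (NPX) scale.

    The SD is first moved to the natural-log scale (sigma = SD * ln 2);
    then CV = sqrt(exp(sigma^2) - 1).
    """
    sigma = sd_log2 * math.log(2.0)
    return math.sqrt(math.exp(sigma ** 2) - 1.0)


if __name__ == "__main__":
    # A difference of 1 NPX unit corresponds to a doubling on the linear scale.
    print(npx_to_linear(4.0) / npx_to_linear(3.0))
    # Hypothetical replicate SD of 0.13 NPX units (not a value from the study).
    print(round(100 * lognormal_cv(0.13), 1), "% CV")
```

Because NPX is a relative unit, such linear values are only comparable within one cohort and assay run, as noted above.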

Genotyping
In MDC-CC, non-fasting blood samples were drawn at the baseline examination and stored in the biobank at −80°C. Genotyping was performed using the Illumina HumanOmniExpressExome Bead Chip on the iScan system and using the Autocall calling algorithm (Illumina, San Diego, CA, USA). For GWAS, we included all individuals with information on follistatin plasma levels and genotypes (n = 4239).
Quality control was performed by excluding samples and variants with missingness > 0.05 (both individual and genotypic), identity-by-descent (IBD) matches, heterozygosity outliers (absolute cryptic relatedness inbreeding coefficient > 0.2), sex mismatches, and population outliers. The minor allele frequency limit was 0.01. A total of 628,526 SNPs were included in the analysis.
Linear regression models, with an additive genetic model, were used to test the association between genetic variants and follistatin levels, with adjustment for age and sex. A p value < 5 × 10^−8 was considered genome-wide significant, corresponding to a Bonferroni correction for one million tests. Version 1.07 of PLINK software was used for association analyses and QC. Manhattan plots and Q-Q plots were drawn with the R software version 3.1.2. Regional significance plots were drawn using LocusZoom (http://locuszoom.sph.umich.edu/locuszoom/).
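The genome-wide significance threshold stated above follows from dividing a family-wise error rate of 0.05 by one million tests. The sketch below reproduces that arithmetic; the function names are illustrative, and the alpha of 0.05 is an assumption implied by, but not stated in, the text. The p values in the usage lines are the ones quoted in the Discussion.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_genome_wide_significant(p_value: float,
                               alpha: float = 0.05,
                               n_tests: int = 1_000_000) -> bool:
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


if __name__ == "__main__":
    # 0.05 / 1,000,000 gives the conventional 5e-8 threshold.
    print(bonferroni_threshold(0.05, 1_000_000))
    # p values quoted in the Discussion for three of the reported SNPs.
    for p in (1.98e-11, 5.50e-11, 2.05e-09):
        print(p, is_genome_wide_significant(p))
```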

SUMMIT
Raw sequencing data were subjected to quality control and imputation using an array of well-developed and optimized software before the data analysis started. For quality control, samples were filtered on individual characteristics (genotyping rate, sex check, population stratification, identity by descent, and heterozygosity) and on variant quality (genotyping rate, Hardy-Weinberg equilibrium, minor allele frequency, and minor allele count). QC protocols using PLINK produced efficient and reproducible QC that can be customized to suit different genotype datasets. Quality control filtering removed individuals that can cause errors in the analysis: duplicated samples, first-degree relatives, individuals of different ancestry, and individuals with a low genotyping rate. Rare variants and single-nucleotide polymorphisms (SNPs) that deviate significantly from Hardy-Weinberg equilibrium can cause false positives. The SUMMIT dataset was imputed against the Haplotype Reference Consortium (HRC) panel on the Michigan Imputation Server. The HRC is a large reference panel that has 64,976 human haplotypes and 39,235,157 SNPs from 20 cohorts, mainly of European ancestry. The server offers an imputation platform for genotype data. Genotype data that passed quality control were split into per-chromosome files, usually in VCF format, each containing the SNPs of a single chromosome to be imputed. The imputed chromosome files along with a quality report were downloaded from the server. GWAS was performed for the HRC-imputed SUMMIT dataset using SNPTEST (version 2.5.2). The SNPTEST output was filtered for minor allele frequency (MAF > 0.05), INFO score (> 0.4), and Hardy-Weinberg equilibrium (HWE p > 0.0000057). Manhattan plots and Q-Q plots were generated in R (version 3.4.3).
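The post-GWAS filtering step above can be sketched as follows. The thresholds are the ones stated in the text; the dictionary keys (`rsid`, `maf`, `info`, `hwe_p`) are hypothetical stand-ins and are not the actual column names of the SNPTEST output.

```python
from typing import Dict, Iterable, List

# Thresholds as stated in the text.
MAF_MIN = 0.05        # minor allele frequency must exceed this
INFO_MIN = 0.4        # imputation INFO score must exceed this
HWE_P_MIN = 5.7e-6    # Hardy-Weinberg p value must exceed this


def passes_filters(variant: Dict[str, float]) -> bool:
    """Return True if a variant record passes all three filters."""
    return (
        variant["maf"] > MAF_MIN
        and variant["info"] > INFO_MIN
        and variant["hwe_p"] > HWE_P_MIN
    )


def filter_variants(variants: Iterable[Dict[str, float]]) -> List[Dict[str, float]]:
    """Keep only the variant records that pass the filters, in input order."""
    return [v for v in variants if passes_filters(v)]


if __name__ == "__main__":
    # Made-up records for illustration only (not study data).
    example = [
        {"rsid": "rs_a", "maf": 0.39, "info": 0.98, "hwe_p": 0.50},
        {"rsid": "rs_b", "maf": 0.01, "info": 0.98, "hwe_p": 0.50},
    ]
    print([v["rsid"] for v in filter_variants(example)])
```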
Statistics. In MDC-CC and IMI-DIRECT-METSIM analyses, one-way analysis of variance for continuous variables and Pearson's Chi-squared test for dichotomous variables were used to assess the cross-sectional relationships between plasma follistatin quartiles and diabetes risk factors. Multiple linear regression was used to analyze the association between follistatin and glucose, HbA 1c , HOMA2, and insulin at baseline and reexamination, adjusted for potential confounding factors. Natural log-transformed values for HOMA2, insulin and CRP were used due to skewed distributions.
Cox proportional hazards regression was used to examine hazard ratios (HR) with 95% confidence interval (CI) for incidence of diabetes, by quartiles of follistatin and per 1 standard deviation (SD) increment, using the lowest quartile as the reference category. Potential confounders were age, sex, waist circumference, smoking habits, LDL, HDL cholesterol, fasting glucose, systolic blood pressure, antihypertensive medications, lipid-lowering medications, CRP, BMI, physical activity, alcohol intake, and fiber intake. The fit of the proportional hazards model was confirmed by plotting the incidence rate over time. The Kaplan-Meier curve was used to illustrate the incidence of diabetes in relation to the follistatin quartiles. Time axis was follow-up time until death, emigration, incident diabetes or end of follow-up. SPSS Statistics (version 22) and Stata software version 12.0 (Stata Corp, College Station, TX, USA) were used for statistical analyses.
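A hazard ratio and its 95% CI are obtained from a Cox model by exponentiating the regression coefficient and its Wald interval. The sketch below illustrates this relationship and, as a consistency check, back-calculates the standard error implied by the reported multivariable-adjusted estimate for MDC-CC (HR 1.24, CI 1.04-1.47). The function names are illustrative; the back-calculated standard error is an approximation derived from rounded published values, not a number reported in the study.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Standard error of log(HR) implied by a reported Wald CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def ci_from_hr(hr: float, se: float, z: float = Z_95):
    """Return (HR, lower, upper) given the HR and the SE of log(HR)."""
    beta = math.log(hr)
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


if __name__ == "__main__":
    se = se_from_ci(1.04, 1.47)
    hr, lower, upper = ci_from_hr(1.24, se)
    # Rounded to two decimals, the interval matches the reported CI.
    print(round(hr, 2), round(lower, 2), round(upper, 2))
```

The interval is symmetric on the log scale, which is why the reported HR lies close to the geometric mean of the two CI bounds.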
In the TDFS cohort analysis, data are given as mean ± SD. For statistical analyses, data that were not normally distributed (Shapiro-Wilk W test) were inverse normal transformed to achieve a normal distribution for the investigation of univariate and multivariate relationships. Pearson correlation and the nonparametric Wilcoxon test were used to investigate univariate relationships between the parameters. Multivariate models were used to investigate independent relationships. The statistical software package JMP 13.0.0 (SAS Institute Inc, Cary, North Carolina) was used.
Ethics. The study was performed in accordance with the Declaration of Helsinki. The ethics committee at Lund University approved the MDC, and the SUMMIT study was approved by ethics committees in each of the centers in Malmö (Sweden), Pisa (Italy), Dundee (U.K.), and Exeter (U.K.). For the DIRECT-METSIM cohort, approval for the study protocol was obtained from the Ethics Committee of the University of Eastern Finland and Kuopio University Hospital and all participants provided written informed consent at enrolment. The research conformed to the ethical principles for medical research involving human participants outlined in the Declaration of Helsinki. For the TDFS cohort, informed written consent was obtained from all participants and the Medical Ethics Committee of the University of Tübingen had approved the protocol.
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All the relevant data supporting the findings of this study are available within this article, in the supplementary material, the source data file, or relevant repositories. For the MDC-CC, SUMMIT and TDFS cohorts, Swedish, European and German legislation impose restrictions on public availability of datasets containing pseudonymized information. With pertinent permissions, the full datasets including genome-wide data and phenotypes can be accessed for MDC-CC through an institutional repository at Lund University (https://www.malmokohorter.lu.se/english), for SUMMIT through the SUMMIT vascular imaging project steering committee (jan.nilsson@med.lu.se), and for TDFS through the University of Tübingen (Norbert.Stefan@med.uni-tuebingen.de). Details on how to request access to IMI-DIRECT data, including data presented here, can be found through https://directdiabetes.org/contacts/. Requestors will be provided with information and assistance on how data can be accessed via the DIRECT Computerome following submission of appropriate documentation. Source data are provided with this paper.